Memory management, in the context of AI agents, is the set of strategies a system uses to store, retrieve, summarise, and expire information across interactions. Because an agent accumulates far more history than can fit in a model’s context window, it must actively decide what to keep, what to compress, and what to surface at any given moment.
Good memory management balances competing pressures. Storing everything is cheap to write but makes retrieval noisy and expensive; storing too little loses important context. So agents employ techniques like summarising old conversations into compact notes, ranking memories by recency and importance, deduplicating redundant entries, and choosing how many memories to retrieve for a given task. The vector database provides the retrieval substrate, but the policies around it are what make memory useful.
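As a rough illustration of the ranking, deduplication, and top-k choices described above, the sketch below scores each memory by a weighted sum of similarity, recency, and importance. Everything in it is an assumption made for the example: the `Memory` fields, the exponential half-life decay, the equal default weights, and the exact-text deduplication rule. The `similarity` value stands in for whatever score a vector search would return.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    text: str
    importance: float   # 0..1, assigned when the memory is written (assumed scale)
    created_at: float   # seconds, on the same clock as `now`
    similarity: float   # 0..1, relevance to the current query (e.g. from a vector search)


def recency(memory: Memory, now: float, half_life_s: float = 3600.0) -> float:
    """Exponential decay: the score halves every `half_life_s` seconds."""
    age = max(0.0, now - memory.created_at)
    return 0.5 ** (age / half_life_s)


def score(memory: Memory, now: float,
          w_sim: float = 1.0, w_rec: float = 1.0, w_imp: float = 1.0) -> float:
    """Weighted sum of similarity, recency, and importance."""
    return (w_sim * memory.similarity
            + w_rec * recency(memory, now)
            + w_imp * memory.importance)


def dedupe(memories: List[Memory]) -> List[Memory]:
    """Keep the first memory for each case- and whitespace-normalised text."""
    seen = set()
    unique = []
    for m in memories:
        key = " ".join(m.text.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(m)
    return unique


def retrieve(memories: List[Memory], now: float, k: int = 3) -> List[Memory]:
    """Return the k highest-scoring memories after deduplication."""
    ranked = sorted(dedupe(memories), key=lambda m: score(m, now), reverse=True)
    return ranked[:k]
```

Here `k` is the knob for how many memories to surface for a task, and the weights decide whether a fresh but trivial note can outrank an older, important one. A real system would likely deduplicate on embedding distance rather than exact text.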
This is why a vector database alone is not a complete memory system. It excels at finding semantically relevant items, but effective memory management adds the logic for what to store, when to forget, how to compress, and how to prioritise. That logic turns raw similarity search into a coherent memory that helps the agent behave consistently over time.
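To make the idea of a policy layer concrete, here is a minimal sketch of a manager that decides what to store, when to forget, and how to compress. It is an illustration under stated assumptions, not a reference design: `MemoryManager`, its thresholds, and the `summarise` callable are all hypothetical names, the plain list stands in for a vector database, and in practice `summarise` would probably be a call to a language model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Entry:
    text: str
    importance: float
    created_at: float


@dataclass
class MemoryManager:
    summarise: Callable[[List[str]], str]  # hypothetical summariser, e.g. an LLM call
    min_importance: float = 0.3            # what to store: skip low-value items
    ttl_s: float = 86_400.0                # when to forget: expire after this many seconds
    max_entries: int = 4                   # how to compress: cap on stored entries
    entries: List[Entry] = field(default_factory=list)  # stand-in for a vector store

    def store(self, text: str, importance: float, now: float) -> bool:
        """Write the entry only if it clears the importance threshold."""
        if importance < self.min_importance:
            return False
        self.entries.append(Entry(text, importance, now))
        return True

    def forget(self, now: float) -> int:
        """Drop entries older than the TTL and return how many were dropped."""
        kept = [e for e in self.entries if now - e.created_at <= self.ttl_s]
        dropped = len(self.entries) - len(kept)
        self.entries = kept
        return dropped

    def compress(self) -> None:
        """When over capacity, merge the oldest half into one summary entry."""
        if len(self.entries) <= self.max_entries:
            return
        self.entries.sort(key=lambda e: e.created_at)
        cut = len(self.entries) // 2
        old, recent = self.entries[:cut], self.entries[cut:]
        note = Entry(
            text=self.summarise([e.text for e in old]),
            importance=max(e.importance for e in old),
            created_at=old[-1].created_at,  # the summary inherits the newest merged timestamp
        )
        self.entries = [note] + recent
```

Retrieval would still be delegated to similarity search over `entries`. The point of the sketch is that storing, expiring, and compressing are separate decisions made around that search, not by it.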