Weaviate Is Launching Engram — And I Am All Hyped for It!

If you have spent any time building AI agents or chatbot applications, you have almost certainly hit the same wall: your agent forgets everything the moment a session ends. You re-explain your preferences, re-state your goals, and re-brief your AI colleague from scratch — every single time. It is frustrating at human scale, and catastrophic at machine scale.

Weaviate has a fix, and it is called Engram. I have been digging into the announcement, and honestly, this is the most excited I have been about an agent-infrastructure release in a while.

The Problem: Agents Are Stuck in a Limited Loop

Today’s AI applications operate in what Weaviate calls a limited loop — each interaction is largely disposable, bound to a single session with little meaningful carryover. Without continuity, agents cannot learn from past experience. They repeatedly re-derive the same conclusions, regenerate near-identical facts, and discard half-baked results. What looks like forgetfulness at human scale becomes pure churn at machine scale.

Long context windows are not the answer either. As Weaviate’s Engram deep dive explains, cramming a context window full of conversation history degrades accuracy — large language models get “lost in the middle” — while increasing latency and inflating cost. And that cost is paid on every single message.
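To put a rough shape on that cost, here is a back-of-the-envelope sketch. Every number in it (100 turns, 200 tokens per turn, a 500-token memory budget) is a hypothetical I picked for illustration, not a figure from Weaviate, and it only counts input tokens.

```python
def tokens_resending_history(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when every request re-sends the whole conversation so far."""
    # Request k carries all k turns, so the total grows quadratically with turns.
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


def tokens_with_memory(turns: int, tokens_per_turn: int, memory_budget: int) -> int:
    """Total input tokens when each request carries only the new turn plus a fixed
    budget of retrieved memories (a hypothetical budget, for illustration only)."""
    return turns * (tokens_per_turn + memory_budget)


full_history_total = tokens_resending_history(turns=100, tokens_per_turn=200)
memory_total = tokens_with_memory(turns=100, tokens_per_turn=200, memory_budget=500)

print(full_history_total)  # 1010000
print(memory_total)        # 70000
```

Under these made-up numbers, the re-send-everything approach costs roughly 14 times more input tokens, and the gap widens with every extra turn.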

The best solution is not an ever-growing pile of context. It is actively maintained memory.

Enter Engram: Memory as Infrastructure

Engram is Weaviate’s managed memory service, built directly on top of the Weaviate vector database. It is designed around asynchronous pipelines that run whenever you add raw data — extracting memories, reconciling new information against what is already known, and persisting the result back to Weaviate, ready to be queried.

Think of it as giving your agents a real, structured, queryable memory — not a flat text file, not an ever-growing conversation log, but a living knowledge base that evolves over time.

What Makes Engram Special?

🔥 Asynchronous, Low-Latency Pipelines

Adding data to Engram is fire-and-forget. You call client.memories.add with your conversation messages, and Engram handles the rest in the background — extracting memories, deduplicating, and reconciling with existing knowledge. No waiting, no blocking your application.

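# `client` is assumed to be an already-configured Engram client (setup is not shown here).
# The call hands the messages over and returns; extraction continues in the background.
run = client.memories.add(
    [
        {"role": "user", "content": "I'm very interested in vectors, please tell me more!"},
        {"role": "assistant", "content": "Absolutely! Vectors are a fascinating..."},
    ],
    user_id="user_name",  # scopes the resulting memories to this user
)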

🧠 Smart Memory Extraction with Topics

Engram organises memories into topics — natural-language descriptions of what information should be extracted and how it should be categorised. Think of them as magnets for memories, pulling only the relevant details out of raw data, so you stay in control of what actually gets remembered.
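Here is a toy sketch of the "magnet" idea. Everything in it is an assumption on my part: the `Topic` class is hypothetical, and the keyword matching is a crude stand-in for whatever extraction Engram actually performs from the natural-language description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    name: str
    description: str            # the natural-language instruction a real extractor would read
    keywords: tuple[str, ...]   # toy stand-in for that extractor


def extract(messages: list[dict], topics: list[Topic]) -> dict[str, list[str]]:
    """Return {topic name: matching user messages}; anything unmatched is dropped."""
    found: dict[str, list[str]] = {topic.name: [] for topic in topics}
    for message in messages:
        if message["role"] != "user":
            continue
        lowered = message["content"].lower()
        for topic in topics:
            if any(keyword in lowered for keyword in topic.keywords):
                found[topic.name].append(message["content"])
    return found


topics = [
    Topic("interests", "Technologies the user wants to learn about", ("interested in", "curious about")),
    Topic("role", "The user's current job title", ("i work as", "my job")),
]
messages = [
    {"role": "user", "content": "I'm very interested in vectors, please tell me more!"},
    {"role": "user", "content": "Nice weather today."},
    {"role": "assistant", "content": "Absolutely! Vectors are a fascinating..."},
]
memories = extract(messages, topics)
```

The small talk about the weather matches no topic, so it never becomes a memory. That is the control the topics give you.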

🔄 Memories That Evolve Over Time

Here is where it gets genuinely clever. When a user tells your agent they have been promoted from ML Engineer to CEO, Engram does not simply append a new memory — it rewrites the existing one to reflect the update, preserving history while avoiding duplicates.
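The update-instead-of-append behaviour can be sketched in a few lines. This is a deliberately simplified model of my own, not Engram's implementation: the `Memory` class and `reconcile` function are hypothetical, and matching on an exact `key` stands in for the reconciliation a real system would do.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    key: str                                           # what the memory is about
    value: str                                         # the current belief
    history: list[str] = field(default_factory=list)   # superseded values, oldest first


def reconcile(store: dict[str, Memory], key: str, new_value: str) -> Memory:
    """Insert, update in place, or ignore, so each key has exactly one memory."""
    existing = store.get(key)
    if existing is None:
        store[key] = Memory(key, new_value)
    elif existing.value != new_value:
        existing.history.append(existing.value)  # keep the old value for the record
        existing.value = new_value
    # An identical value falls through: nothing changes and no duplicate is created.
    return store[key]


store: dict[str, Memory] = {}
reconcile(store, "job_title", "ML Engineer")
reconcile(store, "job_title", "ML Engineer")  # repeating a known fact is a no-op
reconcile(store, "job_title", "CEO")          # the promotion rewrites the memory
```

After the three calls there is still a single `job_title` memory. Its value is "CEO", and "ML Engineer" survives in its history.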

🔒 Rock-Solid Data Isolation

Engram enforces strict data isolation using Weaviate’s multi-tenancy feature. User-scoped memories can never be influenced by another user’s data, and forgetting to pass a user_id cannot accidentally leak information between users.
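To show why this kind of design makes leaks hard to write by accident, here is a toy tenant-scoped store. It is a hypothetical in-memory sketch of the general pattern, not Weaviate's multi-tenancy or Engram's API: the class name, the methods, and the substring search are all mine.

```python
class TenantScopedStore:
    """Toy store in which every read and write must name the tenant it belongs to."""

    def __init__(self) -> None:
        self._tenants: dict[str, list[str]] = {}

    @staticmethod
    def _require(user_id: str) -> None:
        if not isinstance(user_id, str) or not user_id:
            raise ValueError("user_id is required for user-scoped memories")

    def add(self, text: str, *, user_id: str) -> None:
        self._require(user_id)
        self._tenants.setdefault(user_id, []).append(text)

    def search(self, term: str, *, user_id: str) -> list[str]:
        self._require(user_id)
        # Only this tenant's list is ever read; no code path scans across tenants.
        return [m for m in self._tenants.get(user_id, []) if term.lower() in m.lower()]


store = TenantScopedStore()
store.add("Alice likes vectors", user_id="alice")
store.add("Bob prefers graphs", user_id="bob")

alice_hits = store.search("vectors", user_id="alice")
bob_hits = store.search("vectors", user_id="bob")
```

Because `user_id` is a required keyword-only argument, leaving it out raises a `TypeError` instead of quietly searching everyone's data.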

🤖 Continual Learning for Agents

Engram is not just for chatbots. In multi-agent systems, it can collect information spread across multiple context windows and agents, combine it into a single atomic memory, and let agents learn from experience over time — even from implicit feedback.

🔍 Semantic Search Over Memories

Retrieving memories is just as elegant:

# Reuses the same `client` as above; user_id limits the search to this user's memories.
memories = client.memories.search(
    "What technology has the user asked about recently?",
    user_id="user_name",
)

Real-World Results

Weaviate’s own team tested Engram internally with Claude Code, and as they documented in “Oh Memories, Where’d You Go”, the results were telling:

Decision archaeology — picking up a multi-week project — was 30% faster with Engram, because reasoning chains and document intent were recalled rather than reconstructed from files.
Hallucination prevention — without Engram, the agent fabricated a plausible-sounding URL twice; with Engram, grounded recall prevented the fabrication both times.

The Bottom Line

Engram represents a fundamental shift in how we think about AI agent memory — from an afterthought to robust, predictable infrastructure. It is easy to get started with pre-built templates, yet flexible enough to adapt to any use case as your needs grow.

If you are building agentic applications and you are tired of your agents starting cold every single session, Engram is exactly what you have been waiting for.

Interested? Sign up for the Engram preview today and be among the first to give your agents real memory.