Vector Memory
Store memories as embeddings in a vector index and retrieve the most semantically similar items at query time — so relevance is judged by meaning, not keyword match or recency.
Intent & Description
🎯 Intent
Store memories as embeddings in a vector index and retrieve the most semantically similar items at query time.
📋 Context
A long-running agent accumulates facts and observations over time. On each step it needs to find the small subset of past items most relevant to the current situation. Relevance is best judged by semantic similarity, not exact term match or chronological recency — “find past notes whose meaning is closest to what’s happening now.”
💡 Solution
Embed each memory item and add it to a vector index. At query time, embed the query (or a summary of the current state), retrieve the top-k most similar memories, and prepend them to the context. Optionally, weight the similarity scores by recency decay (newer items score higher, older items fade) and by salience.
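The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `toy_embed` is a hypothetical stand-in that hashes words into buckets (it captures word overlap, not meaning) and would be replaced by a real embedding model; the brute-force scan stands in for a real vector index; `half_life` and `salience` are illustrative names for the optional decay and salience weighting.

```python
import math
import zlib
from dataclasses import dataclass, field
from typing import Optional

DIM = 64  # toy vector size (assumption); a real model defines its own


def toy_embed(text: str) -> list[float]:
    """Hypothetical placeholder: hashed bag of words, L2-normalized."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid dividing by zero
    return [x / norm for x in vec]


@dataclass
class MemoryItem:
    text: str
    vector: list[float]
    created_at: float      # seconds, on a caller-supplied clock
    salience: float = 1.0  # illustrative importance weight


@dataclass
class VectorMemory:
    half_life: Optional[float] = None  # seconds; None disables decay
    items: list = field(default_factory=list)

    def add(self, text: str, now: float, salience: float = 1.0) -> None:
        self.items.append(MemoryItem(text, toy_embed(text), now, salience))

    def search(self, query: str, now: float, k: int = 3) -> list:
        """Return up to k (score, text) pairs, best first."""
        q = toy_embed(query)
        scored = []
        for item in self.items:
            # Both vectors are unit length, so the dot product is the cosine.
            score = sum(a * b for a, b in zip(q, item.vector))
            if self.half_life is not None:
                age = max(0.0, now - item.created_at)
                score *= 0.5 ** (age / self.half_life)  # halves every half_life
            scored.append((score * item.salience, item.text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


memory = VectorMemory(half_life=3600.0)
memory.add("user prefers dark mode in the editor", now=0.0)
memory.add("deploy failed because the database migration timed out", now=100.0)
memory.add("user asked for weekly summary emails", now=200.0)

hits = memory.search("why did the deploy fail", now=300.0, k=2)
context = "\n".join(text for _, text in hits)  # prepend this to the prompt
```

Multiplying similarity by decay and salience is only one possible way to combine the signals; an additive blend or a re-ranking step over the top-k candidates would be equally reasonable choices.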
Real-world Use Case
- A long-running agent accumulates facts whose relevance is best judged by semantic similarity.
- Without selective retrieval, an append-only memory log would grow without bound.
- An embedding model and vector index can be deployed and maintained.
📌 TL;DR
Embed every memory and retrieve top-k by semantic similarity — relevance by meaning beats keyword matching and chronological recency for most long-running agent tasks.
Advantages
- Semantically relevant past items surface automatically, with no explicit query planning needed.
- Scales to memory stores far too large to fit in context.
Disadvantages
- Misses purely temporal queries (“what did I do yesterday?”) — vector similarity doesn’t capture chronology.
- Embedding drift: changing the embedding model or the memory schema can leave stored vectors out of step with new queries, silently degrading retrieval quality.
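One common guard against the drift problem is to record which model produced each stored vector and re-embed anything stale. The sketch below assumes the raw text is kept alongside the vector; `StoredVector`, `model_id`, `stale_items`, and `reembed_stale` are hypothetical names for illustration, and `embed` is whatever embedding function is currently in use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StoredVector:
    text: str            # raw text is kept so the item can be re-embedded
    vector: list[float]
    model_id: str        # hypothetical tag, e.g. model name plus version


def stale_items(items: list[StoredVector], current_model_id: str) -> list[StoredVector]:
    """Return the items embedded by a model other than the current one."""
    return [item for item in items if item.model_id != current_model_id]


def reembed_stale(
    items: list[StoredVector],
    current_model_id: str,
    embed: Callable[[str], list[float]],
) -> int:
    """Re-embed stale items in place and return how many were updated."""
    stale = stale_items(items, current_model_id)
    for item in stale:
        item.vector = embed(item.text)
        item.model_id = current_model_id
    return len(stale)
```

Running `reembed_stale` after every model upgrade, or checking `stale_items` before each search, turns a silent quality regression into an explicit maintenance step.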