MemGPT-Style Paging

MemGPT-style paging gives the agent operating-system-style memory management: the context window acts as RAM (system prompt, working set, recent messages) and external storage acts as disk (recall storage for raw history, archival storage as a vector store). The model is given read_recall, write_archival, and search_archival tool calls and decides what to page in and out, treating the window budget as a first-class constraint to manage.

Intent & Description

🎯 Intent

Treat the LLM context window as RAM and external storage as disk, with the model issuing tool calls to page memory in and out.

📋 Context

A long-running agent’s conversation or document state grows past the model’s context window. The team needs to keep the agent useful over interactions spanning thousands of turns, or over documents larger than any provider’s context window.

💡 Solution

The pattern uses two memory tiers. Main context holds the system prompt, working set, and recent messages. External context holds recall storage (raw history) and archival storage (a vector store). The model is given three tool calls: read_recall, write_archival, and search_archival. Paging happens at the agent’s discretion: the model treats main context as RAM and external context as disk.
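
The two tiers and three tool calls can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the MemGPT implementation: the PagedMemory class, its budget_tokens field, and the append_message helper are hypothetical names; token counts are approximated by whitespace word counts; recall only receives messages that were paged out; and archival search uses plain word overlap as a stand-in for a real vector store.

```python
from dataclasses import dataclass, field


@dataclass
class PagedMemory:
    """Toy two-tier memory: main context (RAM) plus recall and archival (disk)."""

    budget_tokens: int                                   # hypothetical window budget
    system_prompt: str = ""
    working_set: list[str] = field(default_factory=list)
    recent: list[str] = field(default_factory=list)      # recent messages (main context)
    recall: list[str] = field(default_factory=list)      # raw history (external)
    archival: list[str] = field(default_factory=list)    # stand-in for a vector store

    @staticmethod
    def _tokens(text: str) -> int:
        # Assumption: a whitespace word count approximates the token count.
        return len(text.split())

    def main_context_tokens(self) -> int:
        parts = [self.system_prompt, *self.working_set, *self.recent]
        return sum(self._tokens(p) for p in parts)

    def append_message(self, message: str) -> None:
        """Add a message, then page the oldest messages out to recall if over budget."""
        self.recent.append(message)
        while self.main_context_tokens() > self.budget_tokens and len(self.recent) > 1:
            self.recall.append(self.recent.pop(0))

    # --- tool calls exposed to the model ---

    def read_recall(self, query: str, limit: int = 3) -> list[str]:
        """Return paged-out messages containing the query (case-insensitive)."""
        q = query.lower()
        return [m for m in self.recall if q in m.lower()][:limit]

    def write_archival(self, text: str) -> int:
        """Store a note in archival memory and return its index."""
        self.archival.append(text)
        return len(self.archival) - 1

    def search_archival(self, query: str, top_k: int = 3) -> list[str]:
        """Rank archival notes by word overlap (assumption: not embedding similarity)."""
        q = set(query.lower().split())
        scored = [(len(q & set(doc.lower().split())), doc) for doc in self.archival]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        return [doc for _, doc in ranked[:top_k]]
```

In an agent loop, each of the three methods would be registered as a tool the model can call, and whatever read_recall or search_archival returns would be placed into the working set for the next turn. A production version would likely swap the word-overlap search for embedding similarity and use the provider's tokenizer for the budget check.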

Real-world Use Case

Long-running agents need state that exceeds the model’s context window.
The model can be trusted to manage memory via tool calls (read, write, search).
External recall and archival storage tiers are available and queryable.

📌 TL;DR

Give the model RAM/disk semantics — context window as RAM, external storage as disk, tool calls to page in and out — and let it manage its own memory budget.