Context Anxiety
A context-aware model panics about its token budget and wraps up early — while most of the window is still free.
Intent & Description
🎯 Intent
The model perceives budget pressure that doesn’t exist, and acts like it’s running out of room when it isn’t.
📋 Context
Long-running agents on models that can see their own context consumption start “wrapping up” as the running token count climbs, even with 800K tokens still available. The model sacrifices task quality to exit cleanly before a limit it’s nowhere near.
💡 Solution
Decouple the budget the model perceives from the budget it’s allowed to use. One documented fix: enable a 1M-token window but cap real usage at 200K, so the model never approaches the threshold it’s anxious about. Add recurring reminders in the prompt that the task is not near completion. Treat any unprompted “I’ll summarize to save space” as a calibration alarm. See also: structured-note-taking, external memory.
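The three moves above (masked budget, recurring reminders, wrap-up alarm) can be sketched as a small harness-side helper. This is a minimal illustration, not a reference implementation: only the 1M and 200K figures come from the text. The class `BudgetHarness`, the reminder wording, the reminder cadence, and the phrase list are all hypothetical, and token counts are assumed to be supplied by whatever client the harness already uses.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Numbers from the example above; everything else here is hypothetical.
ADVERTISED_WINDOW = 1_000_000  # the window the model is configured with
HARD_CAP = 200_000             # what the harness actually lets it consume

# Illustrative wrap-up phrases, not an exhaustive or validated list.
WRAP_UP_PATTERNS = [
    re.compile(r"summari[sz]e to save (space|tokens|context)", re.IGNORECASE),
    re.compile(r"running (low on|out of) (context|tokens|space)", re.IGNORECASE),
]

REMINDER = (
    "Reminder: the task is not near completion and there is ample context left. "
    "Keep working at full depth; do not summarize or wrap up early."
)


@dataclass
class BudgetHarness:
    remind_every: int = 5  # turns between reminders (assumed cadence)
    used_tokens: int = 0
    turn: int = 0

    def record(self, tokens: int) -> None:
        """Account for one completed turn."""
        self.used_tokens += tokens
        self.turn += 1

    def perceived_headroom(self) -> int:
        """Free space as the model sees it, measured against the large window."""
        return ADVERTISED_WINDOW - self.used_tokens

    def must_compact(self) -> bool:
        """True once real usage reaches the cap; the harness acts, not the model."""
        return self.used_tokens >= HARD_CAP

    def reminder(self) -> Optional[str]:
        """Return the reminder text on every `remind_every`-th turn, else None."""
        if self.turn > 0 and self.turn % self.remind_every == 0:
            return REMINDER
        return None

    def is_calibration_alarm(self, output: str, summary_requested: bool = False) -> bool:
        """Flag wrap-up language that nobody asked for."""
        if summary_requested:
            return False
        return any(p.search(output) for p in WRAP_UP_PATTERNS)
```

In this sketch the harness decides when to compact or hand off (`must_compact`), so the model still sees at least 800K tokens of headroom at the moment the real cap is reached. A phrase list is a crude detector; logging every alarm and reviewing the transcripts is likely more useful than acting on a match automatically.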
Real-world Use Case
- A long-running agent that wraps up or summarizes while most of its context window is still free.
- Diagnosing premature task completion on budget-aware models.
- As a harness-design checklist item: does the agent panic about a budget it hasn’t reached?
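For the diagnostic use case, one way to make "wrapped up while most of the window was free" checkable is to compare how far the task got with how much context remained when the run ended. The sketch below assumes a hypothetical `RunRecord` log format and an arbitrary 50% free-space threshold; neither comes from the source.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """Hypothetical per-run log entry; field names are illustrative."""
    run_id: str
    tokens_used: int    # context consumed when the agent stopped
    window_size: int    # context window available to the run
    steps_done: int     # planned steps actually completed
    steps_planned: int  # steps in the original plan


def free_tokens(run: RunRecord) -> int:
    """Context left unused at the point the agent stopped."""
    return run.window_size - run.tokens_used


def is_premature(run: RunRecord, min_free_fraction: float = 0.5) -> bool:
    """True if the run stopped unfinished while plenty of context remained."""
    unfinished = run.steps_done < run.steps_planned
    roomy = free_tokens(run) >= min_free_fraction * run.window_size
    return unfinished and roomy


def flag_premature(runs: list[RunRecord]) -> list[str]:
    """Return the ids of runs that look like context anxiety."""
    return [r.run_id for r in runs if is_premature(r)]
```

A run that stops at 3 of 10 planned steps with 800K of a 1M window unused would be flagged; a run that finishes its plan, or one that genuinely fills the window, would not.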
Source
📌 TL;DR
Mask the true token limit from the model and periodically reassure it mid-task — or it will wrap up while you still have 80% of the window left.
Disadvantages
- Tasks get abandoned or rubber-stamped as done while far from complete, disguised as a deliberate summary.
- The failure scales with model capability; better context-tracking can actually make this worse.
- Perception management (masked budgets, repeated reminders) is scaffolding that must be maintained per model.