Errors Swept Under the Rug
Scrubbing stack traces and failed tool outputs from the agent's context to keep the transcript clean — and breaking its ability to self-correct.
Intent & Description
🎯 Intent
Silently retrying or discarding failed tool results so the agent’’s running trace looks clean — leaving it no evidence of what went wrong.
📋 Context
Tool failures (HTTP 500s, non-zero exits, rejected API calls) get replaced with a “retrying…” placeholder or just dropped. The intent is usually token economy plus clean transcripts. The result is an agent that keeps making the same mistake because it has no memory of the failure.
💡 Solution
Treat failure observations as load-bearing context — not noise to clean up. Preserve stack traces, tool-error returns, and rejection messages in the agent’’s running transcript. Compress only at run boundaries, never mid-loop. See decision-log, provenance-ledger.
Real-world Use Case
- Never. Hiding errors removes the signal the model needs to adapt.
- Read this entry as a warning, then preserve failure observations in the running context.
- Compress only at run boundaries — never mid-loop.
Source
📌 TL;DR
Preserve all error signals in the agent’s context — failures are load-bearing information, not noise to clean up.
Disadvantages
- Agent repeats the same failed action because it has no evidence the failure happened
- Loop-detection heuristics misfire — the surface trace looks like progress when it isn’’t
- Post-incident replay can’’t distinguish a clean run from a salvaged one