Pre-Generative Loop Gate
Before each model call fires, detect known divergence signatures (narration loops, repetitive retries, frustration spirals) and inject a steering hint into the prompt — rather than vetoing the call.
Intent & Description
🎯 Intent
Certain agent failure modes are visible in telemetry before tokens are produced. Catching them post-generation wastes budget. Catching them pre-generation steers without removing the model’s authority.
📋 Context
Specific failure modes recur often enough to be recognizable from signals already available before the call: narrating about acting instead of invoking the tool, retrying the same broken path repeatedly, sinking into rumination on a high-intensity preoccupation. These signatures live in recent thoughts, recent tool calls, the affect snapshot, and the preoccupation list — all available before the next model call fires.
💡 Solution
A pre-tick function takes recent thoughts, recent tool calls, the affect snapshot, and the preoccupation list, and returns either None or a short steering string: [steering] divergence pattern
Real-world Use Case
- Specific divergence signatures are detectable from telemetry before the model call fires.
- Post-hoc detectors catch the failure too late to avoid the token cost.
- The model is responsive to short steering hints in the system context.
Source
Advantages
- Divergence is named before tokens are produced — intervening cheap rather than post-hoc
- Steering as a hint lets the model retain authority; false positives are recoverable
- Hint-presence in logs creates an evaluation substrate for calibrating the detector itself
Disadvantages
- Pattern signatures are heuristic and will misfire — per-deployment calibration required
- Steering hints add tokens to every flagged tick
- Silent injection complicates debugging when the model adapts its behavior to it