Trajectory Anomaly Monitor
Intent & Description
🎯 Intent
Run a trained, non-LLM verifier out-of-band over the agent’s action trajectory at runtime to flag task-misaligned plans and malformed step sequences at millisecond latency, before the actions cause damage.
📋 Context
An autonomous agent takes real actions in sequence (tool calls, plan steps, state changes), and a misaligned or malformed trajectory can cause damage. The team wants a runtime safety check on every step, but an LLM judge on each action is too slow and too expensive to sit in the hot path, and post-hoc output-quality scoring arrives only after the action has already executed.
💡 Solution
Train a dedicated verifier (a sequence model or a process-supervised classifier, not an LLM judge) on agent trajectories labelled for task alignment and structural validity. At runtime it consumes the agent’s action sequence out-of-band and emits an anomaly signal at millisecond-scale latency, fast enough to gate or pause the agent before the next action executes. Reported results put such a verifier at tens of milliseconds per check, well over an order of magnitude faster than an LLM-judge baseline, with process supervision over the trajectory outperforming output-only checks. Pair the verifier with a policy gate that halts or escalates on a flagged trajectory, and reserve LLM-judge review for the flagged cases rather than every step. This differs from scoring final outputs and from loop-shape heuristics: the unit of analysis is the whole action sequence, and the check fires before the damage is done.
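The flow described above (fit a verifier on labelled trajectories, score each proposed action in context, gate before execution) can be sketched roughly as follows. This is a minimal illustration, not the pattern's reference implementation: a smoothed bigram model stands in for the trained sequence model, and `BigramVerifier`, `PolicyGate`, the action names, the window size, and the threshold are all hypothetical.

```python
import math
from collections import Counter, defaultdict
from dataclasses import dataclass, field

START = "<start>"  # hypothetical marker for the beginning of a trajectory


class BigramVerifier:
    """Toy stand-in for a trained sequence verifier (assumption: a smoothed
    bigram model fitted only on trajectories labelled task-aligned)."""

    def __init__(self, alpha: float = 0.1):
        self.alpha = alpha                  # additive smoothing constant
        self.counts = defaultdict(Counter)  # counts[prev][action]
        self.vocab = set()

    def fit(self, aligned_trajectories):
        for trajectory in aligned_trajectories:
            prev = START
            for action in trajectory:
                self.counts[prev][action] += 1
                self.vocab.add(action)
                prev = action
        return self

    def step_surprise(self, prev, action):
        """Negative log-probability of `action` following `prev`."""
        row = self.counts.get(prev, Counter())
        size = len(self.vocab) + 1          # +1 reserves mass for unseen actions
        p = (row[action] + self.alpha) / (sum(row.values()) + self.alpha * size)
        return -math.log(p)


@dataclass
class PolicyGate:
    """Checks each proposed action before it runs; escalates on anomaly."""
    verifier: BigramVerifier
    threshold: float
    window: int = 3                         # recent steps averaged into the score
    history: list = field(default_factory=list)
    surprises: list = field(default_factory=list)

    def check(self, proposed_action):
        prev = self.history[-1] if self.history else START
        s = self.verifier.step_surprise(prev, proposed_action)
        recent = (self.surprises + [s])[-self.window:]
        score = sum(recent) / len(recent)
        if score > self.threshold:
            # Action is NOT recorded or executed; hand off to slower review.
            return "escalate", score
        self.history.append(proposed_action)
        self.surprises.append(s)
        return "allow", score


# Illustrative training data: trajectories labelled as task-aligned.
aligned = [
    ["read_ticket", "search_docs", "draft_reply", "send_reply"],
    ["read_ticket", "draft_reply", "send_reply"],
    ["read_ticket", "search_docs", "search_docs", "draft_reply", "send_reply"],
]
verifier = BigramVerifier().fit(aligned)

gate = PolicyGate(verifier, threshold=1.0)
for action in ["read_ticket", "search_docs", "delete_account"]:
    decision, score = gate.check(action)
    print(f"{action}: {decision} (score={score:.2f})")
```

In this sketch the first two actions pass and the unseen `delete_account` step is escalated before it would execute. Averaging over a short window is one design choice among several: a longer window smooths noise but dilutes a single bad step, so a production verifier would more likely learn the sequence-level score directly.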
Real-world Use Case
- An agent takes consequential actions in sequence where a misaligned trajectory can cause damage.
- Per-step LLM-judge oversight is too slow or costly for the production hot path.
- Enough labelled trajectory data exists to train and maintain a verifier.
Advantages
- Real-time safety verification on every step without the latency or cost of an LLM judge in the hot path.
- Sequence-aware detection catches plan drift and malformed step structure that output scoring misses.
- Cheap enough to run always-on, so flagged trajectories can be gated before the next action.
Disadvantages
- A trained verifier must be built, supervised with labelled trajectories, and maintained as the agent changes.
- It detects only the kinds of anomaly it was trained to recognise; novel misalignment outside the training distribution can slip through.
- A miscalibrated monitor either gates good trajectories (false positives) or misses bad ones (false negatives).
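The calibration trade-off in the last point can be made concrete with a small sketch. Assuming the verifier emits a scalar anomaly score and a held-out set of labelled trajectories is available, one simple approach is to pick the threshold that caps the false-positive rate and then measure the false-negative rate that results. The function names and the scores below are hypothetical, chosen only for illustration.

```python
def calibrate_threshold(aligned_scores, max_false_positive_rate):
    """Pick a threshold so that at most the given fraction of aligned
    validation trajectories score strictly above it (i.e. get flagged)."""
    ranked = sorted(aligned_scores)
    allowed = int(max_false_positive_rate * len(ranked))  # aligned runs we may flag
    return ranked[len(ranked) - allowed - 1]


def error_rates(aligned_scores, misaligned_scores, threshold):
    """False-positive and false-negative rates at a given threshold."""
    fp = sum(s > threshold for s in aligned_scores) / len(aligned_scores)
    fn = sum(s <= threshold for s in misaligned_scores) / len(misaligned_scores)
    return fp, fn


# Made-up validation scores (higher means more anomalous).
aligned_scores = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.6]
misaligned_scores = [0.9, 1.4, 2.2, 3.1]

threshold = calibrate_threshold(aligned_scores, max_false_positive_rate=0.1)
fp_rate, fn_rate = error_rates(aligned_scores, misaligned_scores, threshold)
print(threshold, fp_rate, fn_rate)
```

With these made-up numbers, capping false positives at 10% still lets one of the four misaligned trajectories through, which is the tension the bullet describes. Both lists must be non-empty, and the threshold should be re-derived whenever the agent or the verifier changes.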