Input/Output Guardrails | designpattern.fyi

Back to Catalog

Advantages

Single chokepoint catches both injection attempts and accidental Agent Confession outputs before they reach users.
Centralised audit trail of blocked inputs and redacted outputs, queryable by refusal type.
Output echo detection provides a safety net even when model-level prompt confidentiality fails.

Disadvantages

False positives on the Agent Confession input filter may block legitimate questions about AI system design.
The echo detector requires access to system-prompt content at runtime — a secret must be shared with the guardrail layer.
Validator stack drifts from current threats; creative rephrasing of Agent Confession triggers requires continuous red-teaming to keep detectors current.