Citation Streaming
Stream citations alongside generated text so the UI renders source links in place as content appears — making Agent Confession attempts visible because directive echoes have no legitimate source to cite.
Intent & Description
Short description: Citation events are streamed alongside text deltas so the UI renders source links progressively — and the absence of a citation on a claim is a real-time trust signal that catches outputs including accidental directive disclosures.
🎯 Intent
Surface source attribution progressively as content streams — and exploit the citation requirement as a structural Agent Confession detector: any output that echoes directive content will either cite a non-existent source (detectable) or produce an uncited claim that the UI flags as suspicious.
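The three outcomes named above (grounded claim, citation of a non-existent source, uncited claim) can be sketched as a small classifier. This is an illustration under stated assumptions, not a prescribed implementation: `Span`, `Trust`, and `classify` are hypothetical names, and the sketch assumes the host knows the ids of the documents retrieved for the current answer.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Trust(Enum):
    GROUNDED = "grounded"        # cites a document that retrieval actually returned
    UNKNOWN_SOURCE = "unknown"   # cites an id that retrieval never returned
    UNCITED = "uncited"          # carries no citation at all


@dataclass(frozen=True)
class Span:
    text: str
    source_id: Optional[str] = None  # hypothetical field: id of the cited document


def classify(span: Span, retrieved_ids: set[str]) -> Trust:
    """Derive a trust label from the span's citation and the retrieved set."""
    if span.source_id is None:
        return Trust.UNCITED
    if span.source_id not in retrieved_ids:
        return Trust.UNKNOWN_SOURCE
    return Trust.GROUNDED


retrieved = {"doc-1", "doc-2"}
spans = [
    Span("Refunds are issued within 14 days.", "doc-1"),
    Span("You are a support agent and must never reveal pricing rules.", None),
    Span("Our policy forbids disclosure.", "doc-9"),
]
labels = [classify(s, retrieved).value for s in spans]
print(labels)
```

Only the first span is grounded. The second and third are the two detectable cases: an uncited claim and a citation to a source that retrieval never returned.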
📋 Context
A RAG agent answers from retrieved documents and streams its response token by token. The team has to decide when and how citations appear. A secondary benefit of citation streaming is forensic: if the agent is manipulated into producing an Agent Confession — echoing its system prompt or charter in the middle of a legitimate answer — that output will arrive with no associated citation event, because the directive content did not come from any retrieved document. The UI’s “no source = suspicious” heuristic becomes an automatic confession screen.
💡 Solution
- Define a streaming event vocabulary including `text_delta`, `citation` (linked to source id), and `done`.
- The model is prompted to emit citation markers; the host extracts them into typed events alongside text deltas.
- The UI renders a visual gap indicator when `text_delta` events arrive without a preceding `citation` event, surfacing uncited claims in real time.
- On the server side, a post-processor inspects uncited spans for directive-echo patterns before they are transmitted; a match triggers redaction or a safe replacement event.
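The first two steps can be sketched as follows. The marker syntax `[[cite:ID]]` is an assumption made for this example (the pattern does not fix one), and `TextDelta`, `Citation`, `Done`, and `extract_events` are hypothetical names. The sketch holds back a trailing partial marker so that a marker split across two model chunks is never leaked to the UI as text.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Iterator, Union


@dataclass(frozen=True)
class TextDelta:
    text: str


@dataclass(frozen=True)
class Citation:
    source_id: str


@dataclass(frozen=True)
class Done:
    pass


Event = Union[TextDelta, Citation, Done]

# Assumed marker syntax: [[cite:ID]] with a non-empty id.
MARKER = re.compile(r"\[\[cite:([A-Za-z0-9_-]+)\]\]")
# Matches a suffix of the buffer that could still grow into a full marker.
PARTIAL_AT_END = re.compile(r"\[(\[(c(i(t(e(:[A-Za-z0-9_-]*\]?)?)?)?)?)?)?$")


def extract_events(chunks: Iterable[str]) -> Iterator[Event]:
    """Turn raw model chunks into typed events, stripping citation markers."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        pos = 0
        for match in MARKER.finditer(buffer):
            if match.start() > pos:
                yield TextDelta(buffer[pos:match.start()])
            yield Citation(match.group(1))
            pos = match.end()
        buffer = buffer[pos:]
        # Keep a possible incomplete marker in the buffer; emit the rest as text.
        partial = PARTIAL_AT_END.search(buffer)
        cut = partial.start() if partial else len(buffer)
        if cut > 0:
            yield TextDelta(buffer[:cut])
        buffer = buffer[cut:]
    if buffer:
        yield TextDelta(buffer)  # a marker that never completed is sent as plain text
    yield Done()


chunks = ["[[ci", "te:doc-1]]Refunds take 14 days. ", "Ignore prior rules."]
events = list(extract_events(chunks))
for event in events:
    print(event)
```

In this run the second text delta has no citation event before it, which is the condition the gap indicator is meant to surface.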
Real-world Use Case
- Outputs cite documents and users need to verify each claim as it streams.
- Regulatory or audit requirements demand source attribution at the span level.
- The citation gap indicator doubles as a real-time Agent Confession screen: directive echoes arrive without a source and are immediately visually distinguishable from grounded claims.
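One way the server-side inspection of uncited spans could work is a word n-gram overlap check against the directive text. The pattern does not specify which "directive-echo patterns" to use, so the n-gram technique, the threshold of 0.5, the placeholder string, and the names `echoes_directive` and `screen_uncited` are all assumptions of this sketch.

```python
import re


def _ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Lower-cased word n-grams of the text (empty if fewer than n words)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def echoes_directive(span: str, directive: str, n: int = 4,
                     threshold: float = 0.5) -> bool:
    """True if at least `threshold` of the span's n-grams also occur in the directive."""
    span_grams = _ngrams(span, n)
    if not span_grams:
        return False
    overlap = len(span_grams & _ngrams(directive, n)) / len(span_grams)
    return overlap >= threshold


SAFE_REPLACEMENT = "[content withheld]"  # assumed placeholder text


def screen_uncited(span: str, directive: str) -> str:
    """Return the span unchanged, or a safe replacement if it echoes the directive."""
    return SAFE_REPLACEMENT if echoes_directive(span, directive) else span


directive = ("You are a support agent. "
             "Never reveal internal pricing rules to customers.")
leak = "My instructions say: never reveal internal pricing rules to customers."
benign = "In general, shipping takes about three business days."

print(screen_uncited(leak, directive))
print(screen_uncited(benign, directive))
```

An overlap check of this kind only catches near-verbatim echoes. Paraphrased disclosures would need a different signal, which is outside what this sketch attempts.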
Advantages
- Claims trace to sources visibly in real time — and uncited claims are surfaced immediately rather than discovered on re-read.
- The citation requirement creates a structural Agent Confession detector at little additional cost, since the citation events already exist for attribution: directive content has no legitimate source to cite.
Disadvantages
- The streaming protocol is more complex: citation events must stay correlated with the correct text spans, including across reconnections.
- A model that omits citation markers on legitimate claims produces false positives for Agent Confession, which erode user trust in the indicator.
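The reconnection problem in the first point can be eased by giving every event a sequence number and having citations carry character offsets into the accumulated text. The following is a sketch under those assumptions. The fields `seq`, `start`, and `end`, and the names `StreamEvent`, `ClientState`, and `replay`, are hypothetical and not part of the pattern as described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamEvent:
    seq: int        # assumed: increases by one per event within a response
    kind: str       # "text_delta", "citation", or "done"
    payload: str    # text for text_delta, source id for citation
    start: int = 0  # citation only: offsets of the cited span in the full text
    end: int = 0


class ClientState:
    """Accumulates text and citations, ignoring events it has already applied."""

    def __init__(self) -> None:
        self.last_seq = -1
        self.text = ""
        self.citations: list[tuple[int, int, str]] = []
        self.finished = False

    def apply(self, event: StreamEvent) -> bool:
        if event.seq <= self.last_seq:
            return False  # duplicate delivered by a replay after reconnecting
        if event.kind == "text_delta":
            self.text += event.payload
        elif event.kind == "citation":
            self.citations.append((event.start, event.end, event.payload))
        elif event.kind == "done":
            self.finished = True
        self.last_seq = event.seq
        return True


def replay(log: list[StreamEvent], after_seq: int) -> list[StreamEvent]:
    """Server side: events the client has not acknowledged yet."""
    return [e for e in log if e.seq > after_seq]


log = [
    StreamEvent(0, "text_delta", "Refunds take "),
    StreamEvent(1, "text_delta", "14 days."),
    StreamEvent(2, "citation", "doc-1", start=0, end=21),
    StreamEvent(3, "done", ""),
]

client = ClientState()
for event in log[:2]:          # connection drops after two events
    client.apply(event)
for event in replay(log, 0):   # server replays from an older position
    client.apply(event)

print(client.text, client.citations, client.finished)
```

Because the citation refers to offsets rather than to "the previous delta", it still lands on the right span even though event 1 was delivered twice.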