HyDE
Intent & Description
🎯 Intent
Have the LLM write a hypothetical answer document, embed it, and use it as the retrieval query.
📋 Context
A team uses dense vector retrieval to match documents to user queries, but the queries are short and underspecified (often only a few words), while the passages in the corpus are long, well-formed, and written in a different style. The team also has no labelled query-document relevance pairs with which to train a query encoder to bridge that asymmetry.
💡 Solution
At query time, prompt the LLM to draft a hypothetical answer to the query, then embed that draft. Retrieve the top-k chunks by similarity to the draft's embedding rather than to the embedding of the original query, and pass the retrieved chunks into the normal RAG pipeline.
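The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `embed`, `cosine`, `fake_llm`, and `hyde_retrieve` are hypothetical names, the bag-of-words vector stands in for a real dense encoder, and the canned string stands in for a real LLM call.

```python
import math
import re
from collections import Counter
from typing import Callable


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; an assumed stand-in for a real dense encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_retrieve(
    query: str,
    corpus: list[str],
    generate: Callable[[str], str],
    k: int = 2,
) -> list[str]:
    """Retrieve top-k chunks using the embedding of a hypothetical answer."""
    # Step 1: ask the LLM to draft a hypothetical answer to the query.
    draft = generate(f"Write a short passage that answers: {query}")
    # Step 2: embed the draft, not the original query.
    draft_vec = embed(draft)
    # Step 3: rank corpus chunks by similarity to the draft's embedding.
    ranked = sorted(corpus, key=lambda doc: cosine(draft_vec, embed(doc)), reverse=True)
    # Step 4: the caller passes these chunks into the usual RAG prompt.
    return ranked[:k]


def fake_llm(prompt: str) -> str:
    """Hypothetical LLM stub that returns a canned draft for the demo query."""
    return "Mitochondria are the organelles that generate ATP through oxidative phosphorylation."


corpus = [
    "Mitochondria generate most of the ATP a cell needs through oxidative phosphorylation.",
    "The Treaty of Westphalia ended the Thirty Years' War in 1648.",
    "Ribosomes assemble proteins from amino acids using messenger RNA.",
]

print(hyde_retrieve("cell powerhouse?", corpus, fake_llm, k=1))
```

In a real system, `generate` would wrap the team's LLM client and `embed` would call the same encoder used to index the corpus, so that the draft and the passages live in the same vector space.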
Real-world Use Case
- Short user queries underperform on dense retrieval against long documents.
- An LLM call to draft a hypothetical answer fits the latency and cost budget.
- Recall on the first stage of RAG is the current bottleneck.
Source
Advantages
- Zero-shot improvement; no encoder fine-tuning.
- Particularly strong on short, underspecified queries.
Disadvantages
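- If the hypothetical answer hallucinates off-topic content, its embedding moves away from the relevant passages and retrieval drifts with it.
- Adds one LLM call per query, with the latency and cost that implies.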