Self-Ask
The model asks itself follow-up questions until it can answer the original.
Intent & Description
🎯 Intent
Have the model identify what it needs to know to answer a question, ask those sub-questions, answer them (via retrieval or generation), and compose the final answer from resolved sub-answers.
📋 Context
Multi-hop questions require chaining facts that aren’t co-located in the model’s weights or retrieved context. Asking the full question cold forces the model to guess the chain. Self-Ask makes the chain explicit and checkable.
💡 Solution
Prompt with a format like: “Are follow-up questions needed? [Yes/No]. Follow-up: [sub-question]. Intermediate answer: [answer]. … Final answer: [answer].” The model generates the question-answer chain itself. Optionally, intercept each “Follow-up:” line and route it to a search tool or retrieval system so that the intermediate answer is grounded rather than recalled. See also: ReAct, least-to-most prompting, chain-of-thought.
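The interception loop described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a reference implementation: `generate` and `search` are hypothetical callables standing in for whatever model client and retrieval tool you use, and `generate(prompt, stop)` is assumed to return only the continuation and to halt before the stop string. The marker strings mirror the prompt format quoted above.

```python
from typing import Callable, List, Tuple

# Marker strings taken from the prompt format described above.
FOLLOW_UP = "Follow-up:"
INTERMEDIATE = "Intermediate answer:"
FINAL = "Final answer:"

# Hypothetical signatures (assumptions, not a real library API):
#   generate(prompt, stop) -> continuation text, halting before `stop`
#   search(sub_question)   -> answer text from a retrieval tool
Generate = Callable[[str, str], str]
Search = Callable[[str], str]


def self_ask(question: str, generate: Generate, search: Search,
             max_followups: int = 5) -> Tuple[str, List[Tuple[str, str]]]:
    """Run a Self-Ask loop; return (final answer, resolved sub-question chain)."""
    transcript = f"Question: {question}\nAre follow-up questions needed? "
    chain: List[Tuple[str, str]] = []
    while True:
        # Stop before the model can write its own intermediate answer.
        step = generate(transcript, INTERMEDIATE)
        head, marker, rest = step.partition(FOLLOW_UP)
        if marker and len(chain) < max_followups:
            # Intercept the sub-question and answer it with the tool instead.
            sub_q = rest.strip().split("\n", 1)[0].strip()
            sub_a = search(sub_q)
            chain.append((sub_q, sub_a))
            transcript += f"{head}{FOLLOW_UP} {sub_q}\n{INTERMEDIATE} {sub_a}\n"
        elif not marker and FINAL in step:
            answer = step.split(FINAL, 1)[1].strip().split("\n", 1)[0]
            return answer, chain
        else:
            # Budget exhausted, or the model left the expected format.
            raise ValueError("no final answer within the follow-up budget")


if __name__ == "__main__":
    # Scripted stand-ins so the sketch runs without a model or search backend.
    facts = {"Which company makes X?": "Acme",
             "Who is the CEO of Acme?": "Jane Doe"}
    script = iter(["Yes.\nFollow-up: Which company makes X?\n",
                   "Follow-up: Who is the CEO of Acme?\n",
                   "Final answer: Jane Doe\n"])
    answer, chain = self_ask("Who is the CEO of the company that makes X?",
                             lambda prompt, stop: next(script),
                             facts.__getitem__)
    print(answer, chain)
```

The `max_followups` cap is a design choice for this sketch: it bounds the number of tool calls, and therefore the added latency, when the model keeps asking without converging.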
Real-world Use Case
- Multi-hop QA (e.g. “Who is the CEO of the company that makes X?”).
- Research tasks where the model needs to gather sub-facts before synthesizing.
- Retrieval-augmented pipelines, where each follow-up question becomes a search query and the retrieved result grounds the next step.
Source
📌 TL;DR
Before answering, ask yourself what you need to know — then answer those questions first.
Advantages
- Makes reasoning gaps explicit — you can see exactly what the model doesn’t know.
- Sub-questions are natural retrieval queries — easy to hook into search tools.
- Reported to outperform standard chain-of-thought (CoT) prompting on multi-hop QA benchmarks.
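Because the chain is plain text with fixed markers, a finished transcript can be parsed to review each sub-question and spot the ones left unanswered. The following is a minimal sketch, assuming one marker per line as in the prompt format above; `parse_chain` is a hypothetical helper name, not part of any library.

```python
from typing import List, Optional, Tuple

FOLLOW_UP = "Follow-up:"
INTERMEDIATE = "Intermediate answer:"
FINAL = "Final answer:"

Step = Tuple[str, Optional[str]]  # (sub-question, answer or None if missing)


def parse_chain(transcript: str) -> Tuple[List[Step], Optional[str]]:
    """Split a Self-Ask transcript into (sub-question, answer) steps and the final answer."""
    steps: List[Step] = []
    final: Optional[str] = None
    for raw in transcript.splitlines():
        line = raw.strip()
        if line.startswith(FOLLOW_UP):
            # Open a new step; its answer stays None until one is seen.
            steps.append((line[len(FOLLOW_UP):].strip(), None))
        elif line.startswith(INTERMEDIATE) and steps and steps[-1][1] is None:
            # Attach the answer to the most recent open sub-question.
            steps[-1] = (steps[-1][0], line[len(INTERMEDIATE):].strip())
        elif line.startswith(FINAL):
            final = line[len(FINAL):].strip()
    return steps, final


def unanswered(steps: List[Step]) -> List[str]:
    """Return sub-questions with no intermediate answer: the visible gaps."""
    return [question for question, answer in steps if answer is None]
```

A step whose answer is `None` marks a sub-question the model asked but never resolved, which is a natural place to add retrieval or a human check.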
Disadvantages
- Self-generated sub-questions can be wrong or irrelevant — garbage in, garbage out.
- Adds latency proportional to the number of follow-ups generated.
- Without retrieval, the model just answers its own questions from weights — limited grounding.