Large Reasoning Model (LRM) Paradigm
Use models trained to reason, not just predict — they're a different tool.
Intent & Description
🎯 Intent
Recognize that reasoning-specialized models (o1, o3, Claude with extended thinking, DeepSeek-R1) operate differently from standard chat models and should be treated as a separate tool class with different prompting strategies, cost profiles, and use-case fits.
📋 Context
Developers often carry chat-model prompting patterns over to reasoning models: verbose chain-of-thought (CoT) instructions, few-shot examples, or step-by-step directives. On LRMs this scaffolding is usually redundant and can hurt performance, because the models are trained to reason internally before they answer.
💡 Solution
With LRMs: (1) keep system prompts minimal and direct; (2) don’t add CoT instructions, because the model already reasons on its own; (3) set a thinking budget appropriate to task difficulty; (4) expect higher latency and cost; (5) reserve them for tasks where accuracy matters more than speed. Pair with adaptive-compute-allocation to route only hard tasks to LRMs. See also: extended-thinking, adaptive-compute-allocation, test-time-compute-scaling.
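Points (1) to (3) can be sketched as a small request builder. This is only an illustration under stated assumptions: `LRMRequest`, `build_request`, the difficulty labels, and the budget numbers are all hypothetical and are not any provider's API. How a thinking budget is actually passed (and whether it counts against the output-token limit) differs by provider, so check the relevant documentation before mapping these fields onto a real call.

```python
from dataclasses import dataclass

# Hypothetical budget tiers, in thinking tokens. Tune for your provider and workload.
THINKING_BUDGETS = {"easy": 0, "medium": 4_000, "hard": 16_000}

# Assumed headroom for the visible answer on top of the thinking budget.
ANSWER_TOKENS = 2_000


@dataclass
class LRMRequest:
    """Provider-neutral request shape (hypothetical, for illustration only)."""
    system: str
    user: str
    thinking_budget: int    # 0 means "do not enable extended thinking"
    max_output_tokens: int


def build_request(task: str, difficulty: str) -> LRMRequest:
    """Short system prompt, no CoT directives, thinking budget chosen by difficulty."""
    if difficulty not in THINKING_BUDGETS:
        raise ValueError(f"unknown difficulty: {difficulty!r}")
    budget = THINKING_BUDGETS[difficulty]
    return LRMRequest(
        # Minimal and direct: no "think step by step", no few-shot examples.
        system="You are a careful assistant. Answer the question.",
        user=task,
        thinking_budget=budget,
        max_output_tokens=budget + ANSWER_TOKENS,
    )


req = build_request("Prove that the sum of two odd integers is even.", "hard")
print(req.thinking_budget, req.max_output_tokens)  # 16000 18000
```

The design choice worth noting is that the difficulty label, not the prompt text, is what changes between easy and hard tasks: the prompt stays the same and only the compute allowance moves.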
Real-world Use Case
- Hard reasoning tasks: theorem proving, complex code, multi-step planning, adversarial QA.
- Any task where your chat model keeps getting wrong answers despite good prompting.
- Pipelines where a slow, expensive, accurate step is preferable to a fast, cheap, wrong one.
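The second and third cases above suggest an escalation pattern: try the cheap chat model first and hand the task to the LRM only when the draft fails a check. The sketch below assumes you have some automatic check (unit tests, a schema validator, a verifier model); `answer_with_escalation` and the stub models are hypothetical stand-ins, not real client calls.

```python
from typing import Callable, Tuple

Model = Callable[[str], str]            # task -> answer
Checker = Callable[[str, str], bool]    # (task, answer) -> did the answer pass?


def answer_with_escalation(
    task: str,
    chat_model: Model,
    reasoning_model: Model,
    check: Checker,
) -> Tuple[str, str]:
    """Return (answer, tier). Escalate to the LRM only if the chat draft fails the check."""
    draft = chat_model(task)
    if check(task, draft):
        return draft, "chat"
    # Slow and expensive path, taken only for tasks the chat model got wrong.
    return reasoning_model(task), "lrm"


# Stub models so the sketch runs without network access (assumptions, not real APIs).
def fake_chat(task: str) -> str:
    return "4" if task == "2 + 2" else "unsure"


def fake_lrm(task: str) -> str:
    return "391"


def is_number(task: str, answer: str) -> bool:
    return answer.isdigit()


print(answer_with_escalation("2 + 2", fake_chat, fake_lrm, is_number))    # ('4', 'chat')
print(answer_with_escalation("17 * 23", fake_chat, fake_lrm, is_number))  # ('391', 'lrm')
```

This only pays off when the check is much cheaper than the LRM call; without a reliable check, routing on a difficulty estimate up front is the more natural fit.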
Source
📌 TL;DR
Reasoning models aren’t chat models with extra steps — use them differently, and only when you need them.
Advantages
- Native reasoning capability — not a prompt hack, it’s in the weights.
- Often substantially better than comparable-size chat models on hard reasoning benchmarks such as math and competitive programming.
- Where the provider exposes one, a thinking budget gives you direct control over the accuracy/cost tradeoff.
Disadvantages
- Expensive and slow — wrong choice for simple tasks or latency-sensitive paths.
- Standard prompting intuitions often don’t apply, so prompting habits have to be relearned.
- Thinking traces can be opaque and hard to debug when the model goes wrong.