Test-Time Compute Scaling

Compute cost scales up fast — best-of-N at N=32 is 32x the token cost.
Requires a verifier or reward model to select among candidates — adds system complexity.
Latency increases are significant for sequential strategies (chain refinement, deep search).