Tool Explosion

Tool Explosion degrades model decision-making by flooding the tool list with options the agent doesn’t need for the current task.

Intent & Description

🎯 Intent

Defaulting to “expose all tools” because registration is cheap — then discovering that selection accuracy drops measurably as the tool list grows.

📋 Context

MCP servers, plugin ecosystems, and tool registries make it trivial to expose dozens or hundreds of tools. Teams expose everything “so the agent can reach for anything.” Past ~20 tools, function-calling accuracy measurably degrades — and the token cost of carrying large tool definitions in every prompt compounds the problem.

💡 Solution

Use a tool-loadout: curate the relevant subset per task type. Cap exposed tools at a tested threshold. Measure function-calling accuracy as a release gate.

Real-world Use Case

Never use this; past about 20 tools, function-calling accuracy drops sharply.
Use tool-loadout to select per-task subsets and cap exposed tools at a tested threshold.
Measure function-calling accuracy as a release gate.

Source

View Original Source →

📌 TL;DR

Cap exposed tools per request to a tested threshold — quality drops measurably past ~20 tools.