Tool Search Lazy Loading
Replace eager tool-list loading with a search primitive — schemas enter the context only when the model decides it needs them.
Intent & Description
🎯 Intent
Stop burning context on tool schemas the model will never use in this session.
📋 Context
Your agent is connected to many MCP servers and plugins with 50+ tools combined. Loading all schemas eagerly into the system prompt eats a significant fraction of the context window before the user has typed a word.
💡 Solution
Replace the eager tool list with a single ToolSearch primitive. The system prompt lists only the search tool plus a short index of tool names or categories. When the model needs a tool, it calls ToolSearch, receives the full schema for matching tools, and then calls the tool by name. Schemas loaded by search stay in context for the session so repeat use doesn’t pay the lookup cost again.
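The flow above can be sketched as a small host-side registry. This is a minimal illustration, not a real MCP or SDK API: `Tool`, `LazyToolRegistry`, `tool_search`, and `can_call` are hypothetical names, the three example tools are made up, and the keyword-overlap ranking is a placeholder for whatever search the host actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    description: str
    schema: dict  # full parameter schema; the expensive part to keep in context


@dataclass
class LazyToolRegistry:
    """Hypothetical host-side registry (an assumption, not a real library API)."""
    catalog: dict[str, Tool]
    loaded: dict[str, Tool] = field(default_factory=dict)  # schemas already in context

    def system_prompt_index(self) -> str:
        # Only tool names go into the system prompt, never the schemas.
        names = ", ".join(sorted(self.catalog))
        return f"Tools available via tool_search: {names}"

    def tool_search(self, query: str, limit: int = 3) -> list[dict]:
        # Naive keyword-overlap ranking (assumption); a real host might use
        # embeddings or a proper text index instead.
        terms = set(query.lower().split())
        scored = []
        for tool in self.catalog.values():
            text = f"{tool.name} {tool.description}".lower().replace("_", " ")
            score = len(terms & set(text.split()))
            if score:
                scored.append((score, tool.name))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))

        results = []
        for _, name in scored[:limit]:
            tool = self.catalog[name]
            self.loaded[name] = tool  # stays loaded for the rest of the session
            results.append(
                {"name": tool.name, "description": tool.description, "schema": tool.schema}
            )
        return results

    def can_call(self, name: str) -> bool:
        # A tool is callable by name only once its schema has been loaded.
        return name in self.loaded


# Made-up catalog for illustration only.
registry = LazyToolRegistry(catalog={
    "github_create_issue": Tool(
        "github_create_issue",
        "Create an issue in a GitHub repository",
        {"type": "object", "properties": {"repo": {"type": "string"}}},
    ),
    "slack_post_message": Tool(
        "slack_post_message",
        "Post a message to a Slack channel",
        {"type": "object", "properties": {"channel": {"type": "string"}}},
    ),
    "calendar_create_event": Tool(
        "calendar_create_event",
        "Create a calendar event",
        {"type": "object", "properties": {"title": {"type": "string"}}},
    ),
})

matches = registry.tool_search("create github issue", limit=1)
```

After the search, `registry.can_call("github_create_issue")` is true and later calls in the same session skip the lookup, while the other two schemas never enter the context.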
Real-world Use Case
- Total tool schemas would otherwise consume more than ~10% of the context window.
- Many tools are available but only a small subset is used per session.
- The host can intercept tool listing and insert a search step between the model and the tool catalog.
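The first condition above can be checked with a rough budget estimate. The sketch below is illustrative only: the function names are hypothetical, and the four-characters-per-token heuristic is an assumption standing in for the model's real tokenizer.

```python
import json


def estimate_tokens(text: str) -> int:
    # Rough heuristic of about 4 characters per token (assumption, not a tokenizer).
    return max(1, len(text) // 4)


def schema_budget_fraction(schemas: list[dict], context_window_tokens: int) -> float:
    # Fraction of the context window the serialized schemas would occupy.
    total = sum(estimate_tokens(json.dumps(schema)) for schema in schemas)
    return total / context_window_tokens


def should_lazy_load(
    schemas: list[dict], context_window_tokens: int, threshold: float = 0.10
) -> bool:
    # The 10% default mirrors the rule of thumb in the list above.
    return schema_budget_fraction(schemas, context_window_tokens) > threshold


# Sixty made-up schemas with roughly 400-character descriptions each.
example_schemas = [{"name": f"tool_{i}", "description": "x" * 400} for i in range(60)]
```

With these made-up schemas, `should_lazy_load(example_schemas, 32_000)` comes out true while `should_lazy_load(example_schemas, 200_000)` comes out false, so the same catalog can justify the pattern on a small window and not on a large one.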
📌 TL;DR
Lazy-load tool schemas via search instead of dumping the whole catalog upfront. Scales to hundreds of tools. Costs one extra round trip when a tool is first needed; in exchange, only the schemas actually used occupy the prompt.
Advantages
- Drastic reduction in baseline prompt tokens — only searched schemas occupy context.
- Scales to hundreds of tools without saturating the prompt.
- Tool surface becomes pluggable at runtime; add servers without re-templating the system prompt.
Disadvantages
- One extra tool call before the first real action when the right tool isn’t already loaded.
- Poor tool descriptions or weak search ranking can cause the model to miss a relevant tool.
- Stateful: schemas loaded earlier in a session persist into later turns and keep consuming context there unless they are pruned.
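One possible way to handle the pruning mentioned in the last point is to cap the number of loaded schemas and evict the least recently used one. The source does not prescribe a pruning policy, so this is only a sketch under that assumption; `LoadedSchemas`, `load`, `touch`, and `max_loaded` are hypothetical names.

```python
from collections import OrderedDict


class LoadedSchemas:
    """Hypothetical LRU cap on schemas kept in context (an assumed policy)."""

    def __init__(self, max_loaded: int = 8):
        self.max_loaded = max_loaded
        self._loaded: OrderedDict[str, dict] = OrderedDict()

    def load(self, name: str, schema: dict) -> list[str]:
        # Add or refresh a schema, then evict the oldest entries over the cap.
        self._loaded[name] = schema
        self._loaded.move_to_end(name)
        evicted = []
        while len(self._loaded) > self.max_loaded:
            old_name, _ = self._loaded.popitem(last=False)
            evicted.append(old_name)
        return evicted  # the host would drop these schemas from the context

    def touch(self, name: str) -> None:
        # Mark a tool as recently used whenever the model calls it.
        if name in self._loaded:
            self._loaded.move_to_end(name)

    def names(self) -> list[str]:
        # Least recently used first.
        return list(self._loaded)


cache = LoadedSchemas(max_loaded=2)
cache.load("a", {})
cache.load("b", {})
cache.touch("a")                 # "a" is now more recent than "b"
evicted = cache.load("c", {})    # exceeds the cap, so "b" is evicted
```

An evicted tool is not lost: the model can load it again through a new search, at the cost of one more lookup.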