Code-as-Action Agent
Replace JSON tool calls with Python snippets the agent emits and a sandbox executes — composing multiple tools in one shot with loops, filters, and conditionals.
Intent & Description
🎯 Intent
Use code as the agent’s action language so tool composition becomes function nesting, not chained JSON calls.
📋 Context
Your agent frequently needs to fetch a list, filter it, then call a second tool for each result. Expressing that as separate JSON tool calls is clunky and expensive. The model is good at writing short Python, and you have a sandbox.
💡 Solution
Replace the JSON tool-call channel with a code-snippet channel. The agent emits Python (or a DSL); the sandbox executes it with available tools pre-imported as functions and a safe builtins allowlist. Tool results are Python values usable in the same snippet. Multi-step composition — loops, conditionals, intermediate variables — happens inside one snippet. Every snippet runs inside a sandbox that whitelists imports and blocks arbitrary IO.
Real-world Use Case
- Tool composition is natural in code (filter, map, conditional chains) and clumsy as JSON calls.
- A sandboxed interpreter with pre-imported tools and a safe builtins allowlist is feasible.
- Saving turns by composing multiple operations per snippet would meaningfully cut token cost.
Source
📌 TL;DR
Swap JSON tool calls for Python snippets. The agent composes tools in code — loops, filters, conditionals — all in one round trip instead of many.
Advantages
- ~30% fewer steps and tokens than JSON tool calls — empirically measured.
- Natural composability: function nesting, loops, conditionals in one action.
- Modern frontier models emit better code than JSON when given the choice.
Disadvantages
- Sandbox correctness is load-bearing — a weak sandbox means arbitrary code execution.
- Debugging silent failures inside snippets is harder than per-call JSON tracing.
- Some hosted environments flat-out forbid model-generated code execution.