
Structured CodeAgents: JSON thoughts and code for reliable tool calls via OpenAI function calling API

AI Impact Summary

Forcing CodeAgents to emit a structured JSON blob containing explicit thoughts and executable Python code, delivered through the OpenAI function calling API, yields more reliable tool orchestration than either unstructured code or pure function calls. The JSON structure avoids markdown parsing failures, enforces planning before action, and preserves state across tool calls so that steps can be composed. Benchmark results across the SmolBench suites (GAIA, MATH, SimpleQA, and Frames) show gains of 2 to 7 percentage points for capable models, with OpenAI models, Claude 3.7 Sonnet, and Qwen/Mistral variants benefiting the most. Smaller models pay a "structure tax": the combined JSON and Python syntax adds overhead and raises the risk of unparseable output. To realize the gains, teams should ensure their pipelines can parse the JSON, execute the Python in a controlled environment, and use models with strong instruction-following (32B+ parameters or frontier models).
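The pipeline requirements above (parse the JSON, run the code in a controlled environment, keep state between steps) can be sketched as follows. This is a minimal illustration, not the CodeAgent implementation: the field names "thoughts" and "code", the helpers parse_step, new_state, and run_step, and the sample steps are all assumptions made for the example. The restricted builtins dictionary only narrows what the code can call; it is not a security sandbox.

```python
import json
from typing import Tuple

# Hypothetical allowlist of builtins exposed to model-written code.
# NOTE: this limits convenience functions only; it is NOT a real sandbox.
ALLOWED_BUILTINS = {"sum": sum, "round": round, "len": len, "min": min, "max": max}


def parse_step(raw: str) -> Tuple[str, str]:
    """Parse one structured step and reject anything with the wrong shape."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("step must be a JSON object")
    thoughts, code = data.get("thoughts"), data.get("code")
    if not isinstance(thoughts, str) or not isinstance(code, str):
        raise ValueError("step needs string fields 'thoughts' and 'code'")
    return thoughts, code


def new_state(**variables) -> dict:
    """Create the namespace that is shared by every step of one run."""
    state = {"__builtins__": dict(ALLOWED_BUILTINS)}
    state.update(variables)
    return state


def run_step(raw: str, state: dict) -> str:
    """Validate a step, execute its code in the shared state, return the thoughts."""
    thoughts, code = parse_step(raw)
    exec(code, state)  # variables assigned here remain visible to later steps
    return thoughts


# Two hypothetical model outputs; the second relies on state left by the first.
STEP_1 = json.dumps({
    "thoughts": "Compute the subtotal first and keep it for the next step.",
    "code": "subtotal = sum(prices)",
})
STEP_2 = json.dumps({
    "thoughts": "Apply a 25% markup to the stored subtotal.",
    "code": "total = round(subtotal * 1.25, 2)",
})

state = new_state(prices=[10.0, 20.0])
for raw in (STEP_1, STEP_2):
    print("thoughts:", run_step(raw, state))
print("total:", state["total"])
```

Because each step is validated before anything runs, a malformed or incomplete blob fails at parse time with a clear error instead of partway through execution, which is where the parsing risk for smaller models would surface.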

Affected Systems

CodeAgent

Date: not specified
Change type: capability
Severity: info
