OpenAI DeepResearch: Open-Source Web Agent with 67% GAIA Accuracy
AI Impact Summary
OpenAI has released DeepResearch, an open-source system that combines an LLM with an agentic framework to browse the web and answer complex questions. The system shows a large improvement on the GAIA benchmark, answering nearly 67% of questions correctly in the 1-shot setting and 47.6% of the hardest “level 3” reasoning questions, well ahead of standalone LLMs. The key innovation is a code-based agentic framework: the LLM expresses its actions as code, which makes problem-solving more concise and efficient and cuts the number of tokens generated by 30%.
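To make the idea of code actions concrete, here is a minimal sketch that contrasts one code action with the equivalent sequence of JSON-style tool calls. It is not DeepResearch's implementation: the tools `search` and `visit_page`, the helpers `run_json_actions` and `run_code_action`, and the example URLs are all hypothetical stand-ins, and the `exec` call is an illustration rather than a real sandbox.

```python
import json

# Hypothetical stand-ins for web tools (assumption: a real agent would
# call a search API and a browser here).
def search(query: str) -> list[str]:
    slug = query.replace(" ", "-")
    return [f"https://example.com/{slug}/{i}" for i in range(3)]

def visit_page(url: str) -> str:
    return f"contents of {url}"

TOOLS = {"search": search, "visit_page": visit_page}

def run_json_actions(actions_json: str) -> list:
    """Run JSON tool calls one by one; each call is a separate agent step."""
    results = []
    for action in json.loads(actions_json):
        tool = TOOLS[action["tool"]]
        results.append(tool(**action["args"]))
    return results

def run_code_action(code: str) -> dict:
    """Run one code action with only the tools in scope.

    Emptying __builtins__ is NOT a security boundary; it only keeps this
    illustration small.
    """
    namespace = {"__builtins__": {}, **TOOLS}
    exec(code, namespace)
    return namespace

# JSON style: the model must emit one call per tool invocation, and it has
# to copy each URL from the search result into a later call.
JSON_ACTIONS = json.dumps([
    {"tool": "search", "args": {"query": "gaia benchmark"}},
    {"tool": "visit_page", "args": {"url": "https://example.com/gaia-benchmark/0"}},
    {"tool": "visit_page", "args": {"url": "https://example.com/gaia-benchmark/1"}},
    {"tool": "visit_page", "args": {"url": "https://example.com/gaia-benchmark/2"}},
])

# Code style: one action composes both tools and loops over the results.
CODE_ACTION = 'pages = [visit_page(url) for url in search("gaia benchmark")]'

json_results = run_json_actions(JSON_ACTIONS)
code_results = run_code_action(CODE_ACTION)["pages"]
```

In this toy case the single code action fetches the same three pages as four JSON calls. The step counts here are illustrative only and are not meant to reproduce the 30% figure quoted above.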
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info