Open R1 Update #3 — CodeForces-CoTs and IOI benchmark; OlympicCoder-32B/7B lead on CP tasks
AI Impact Summary
Open R1 Update #3 introduces CodeForces-CoTs (≈100k code-reasoning samples in C++ and Python) and an IOI benchmark pipeline. The fine-tuned OlympicCoder-32B and OlympicCoder-7B models outperform several frontier competitors on IOI-style problems. Together with the open-r1/codeforces and open-r1/ioi tooling, the release provides end-to-end data, problem statements, grading scripts, and verification tests to accelerate development and benchmarking of code-reasoning models. Teams should expect improved competitive-programming (CP) model capabilities, but verifying solutions at scale remains a gap, so integrating these assets into production workloads calls for robust evaluation pipelines and careful attention to data licensing.
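To make the verification step concrete, here is a minimal sketch of a test-based checker for a generated Python solution. It is not the open-r1 grading code: the function name `run_tests`, the whitespace-insensitive output comparison, and the default time limit are all assumptions chosen for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_tests(solution_source, tests, timeout_s=5.0):
    """Run a Python solution against (stdin, expected_stdout) pairs.

    Hypothetical helper: returns the fraction of tests passed.
    A crash, a non-zero exit code, or a timeout counts as a failure.
    """
    if not tests:
        return 0.0
    passed = 0
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "solution.py"
        path.write_text(solution_source)
        for stdin_text, expected in tests:
            try:
                proc = subprocess.run(
                    [sys.executable, str(path)],
                    input=stdin_text,
                    capture_output=True,
                    text=True,
                    timeout=timeout_s,
                )
            except subprocess.TimeoutExpired:
                continue  # time limit exceeded
            if proc.returncode != 0:
                continue  # runtime error
            # Assumption: compare token by token, ignoring whitespace layout.
            if proc.stdout.split() == expected.split():
                passed += 1
    return passed / len(tests)
```

A call such as `run_tests("a, b = map(int, input().split())\nprint(a + b)\n", [("1 2\n", "3\n")])` scores a candidate against its tests. A production pipeline would additionally need sandboxing, memory limits, and problem-specific checkers, none of which this sketch attempts.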
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info