Open R1 Update #3 expands CodeForces-CoTs and IOI benchmarking with OlympicCoder models
AI Impact Summary
Open R1 is releasing CodeForces-CoTs (≈100k chain-of-thought samples) and an IOI benchmark, along with fine-tuned OlympicCoder 7B and 32B models that reportedly outperform several frontier models on IOI problems. The dataset combines CodeForces problems, editorials, and correct solutions, with a focus on verifiability: test cases are published, and a manager pipeline runs evaluations via the open-r1/ioi and open-r1/ioi-test-cases repositories. Together these form a reproducible, publicly accessible evaluation stack for code-reasoning models, enabling faster iteration, benchmarking against frontier models, and informed model selection for code-generation tasks. That said, licensing (CodeForces data; IOI materials under CC-BY) and verifiability caveats should be reviewed before production use, and internal teams should confirm that these data sources align with policy and commercialization goals.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info