NuminaMath 7B TIR wins 1st AIMO Progress Prize with tool-integrated reasoning

AI Impact Summary

NuminaMath 7B TIR is a math reasoning agent that solves problems with tool-integrated reasoning (TIR): the model alternates natural-language reasoning with Python code and continues from the execution feedback. At inference time, self-consistency with TIR (SC-TIR) samples several such solution candidates and picks the final answer by majority vote. The model was developed via a two-stage fine-tuning recipe on DeepSeekMath-Base 7B, using CoT-style templates and code execution, and built with TRL, PyTorch, vLLM, and DeepSpeed, with leaderboard validation during development. On the private test set it solved 29 of 50 problems. The result shows that open-weight models can reach meaningful math reasoning performance when augmented with tools and carefully designed decoding. The collaboration with Hugging Face points to a scalable blueprint for open AI4Maths efforts and to potential commercialization avenues around math-enabled LLMs.
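The decoding loop described above can be sketched roughly as follows. This is a minimal illustration, not the team's implementation: `make_stub_generator` is a hypothetical stand-in for the model, the `python` and `output` fence labels and the `\boxed{...}` final-answer convention are assumptions, and `exec` is used without any sandboxing, which a real system would need.

```python
import contextlib
import io
import re
from collections import Counter
from typing import Callable, Optional

FENCE = "`" * 3  # triple backtick, built this way to keep the example readable
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)
OUTPUT_BLOCK = re.compile(FENCE + r"output\n(.*?)\n" + FENCE, re.DOTALL)
BOXED = re.compile(r"\\boxed\{(\d+)\}")


def run_python(code: str) -> str:
    """Run code and return its stdout, or the error text. NOT sandboxed."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return buffer.getvalue().strip()


def solve_once(generate: Callable[[str], str], problem: str,
               max_rounds: int = 4) -> Optional[int]:
    """One TIR candidate: generate, execute any code, feed the output back."""
    transcript = problem + "\n"
    for _ in range(max_rounds):
        completion = generate(transcript)
        transcript += completion
        blocks = CODE_BLOCK.findall(completion)
        if not blocks:
            break  # no code to run, so the model has finished reasoning
        output = run_python(blocks[-1])
        transcript += f"\n{FENCE}output\n{output}\n{FENCE}\n"
    answers = BOXED.findall(transcript)
    return int(answers[-1]) if answers else None


def sc_tir(generate: Callable[[str], str], problem: str,
           n_samples: int = 8) -> Optional[int]:
    """Self-consistency: sample several candidates and take a majority vote."""
    candidates = [solve_once(generate, problem) for _ in range(n_samples)]
    votes = Counter(a for a in candidates if a is not None)
    return votes.most_common(1)[0][0] if votes else None


def make_stub_generator() -> Callable[[str], str]:
    """Hypothetical model stand-in; every third sample contains an off-by-one bug."""
    state = {"sample": 0}

    def generate(transcript: str) -> str:
        outputs = OUTPUT_BLOCK.findall(transcript)
        if outputs:  # execution feedback is available, so state the answer
            return "The answer is \\boxed{" + outputs[-1] + "}."
        upper = 10 if state["sample"] % 3 == 2 else 11
        state["sample"] += 1
        return (f"Let me compute this.\n{FENCE}python\n"
                f"print(sum(range(1, {upper})))\n{FENCE}\n")

    return generate
```

With the stub, `sc_tir(make_stub_generator(), "What is 1 + 2 + ... + 10?", n_samples=6)` returns 55: four candidates compute 55, two buggy ones compute 45, and the vote discards the minority. In a real setup `generate` would call the fine-tuned model, for example through vLLM, with sampling enabled so that candidates differ.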

Affected Systems

NuminaMath 7B TIR
DeepSeekMath-Base 7B

Date: not specified
Change type: capability
Severity: info
