Kimina-Prover-72B: Test-Time RL Search achieves 92.2% on miniF2F
AI Impact Summary
Kimina-Prover-72B is a formal theorem-proving model that uses Test-Time RL Search to discover lemmas autonomously and apply them in proofs of complex mathematical statements. It achieves a 92.2% pass rate on the miniF2F benchmark. Combined with an error-fixing mechanism, the lemma-enabled pattern marks a shift away from simple proof generation toward more efficient and robust reasoning that more closely resembles how a human works through a problem.
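As a rough illustration of what a lemma-enabled proof looks like, the Lean 4 sketch below first states and proves a small helper lemma, then uses it in the main theorem. It assumes Mathlib is available. The lemma name `helper_sq_diff_nonneg` and the example inequality are hypothetical and chosen only for illustration; they are not taken from Kimina-Prover's output or from miniF2F.

```lean
import Mathlib

-- Hypothetical helper lemma: the square of a real difference is nonnegative.
lemma helper_sq_diff_nonneg (a b : ℝ) : 0 ≤ (a - b) ^ 2 :=
  sq_nonneg (a - b)

-- Main statement: proved by calling the helper lemma rather than
-- rediscovering the key fact inside one monolithic proof.
theorem example_two_mul_le_sq_add_sq (a b : ℝ) :
    2 * a * b ≤ a ^ 2 + b ^ 2 := by
  have h := helper_sq_diff_nonneg a b
  nlinarith [h]
```

Splitting the proof this way means each piece can be checked by the Lean compiler separately, so a failing step can be located and repaired without discarding the rest of the proof.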
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info