Info | Capability

Kimina-Prover-72B: Test-Time RL Search achieves 92.2% on miniF2F

AI Impact Summary

Kimina-Prover-72B is a formal theorem prover that uses Test-Time RL Search to autonomously discover and apply lemmas when proving complex mathematical statements. It reaches a 92.2% pass rate on the miniF2F benchmark. Combined with an error-fixing mechanism, the lemma-enabled pattern marks a shift away from simple generation toward a more efficient, robust, and human-like problem-solving process.
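As a rough illustration of what a lemma-enabled proof looks like, the Lean 4 sketch below states a small auxiliary lemma first and then reuses it to close the main goal. It is a hand-written toy example, not output from Kimina-Prover-72B, and it assumes a project with Mathlib available; the names `sq_nonneg_aux` and `sum_sq_nonneg` are hypothetical and chosen only for this example.

```lean
import Mathlib

-- Hypothetical auxiliary lemma: the square of a real number is nonnegative.
-- In a lemma-enabled search, a fact like this would be proposed and proved
-- separately before being used in the main proof.
lemma sq_nonneg_aux (x : ℝ) : 0 ≤ x ^ 2 := sq_nonneg x

-- Main goal, closed by applying the auxiliary lemma once per summand.
theorem sum_sq_nonneg (a b : ℝ) : 0 ≤ a ^ 2 + b ^ 2 :=
  add_nonneg (sq_nonneg_aux a) (sq_nonneg_aux b)
```

Splitting the proof this way means each piece can be checked by the Lean compiler on its own, which is presumably what makes both lemma reuse and targeted error fixing practical.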

Affected Systems

Kimina-Prover-72B, Qwen2.5-72B

Date: not specified
Change type: capability
Severity: info
