Kimina-Prover-72B released with Test-Time RL Search for Lean theorem proving (92.2% pass rate on miniF2F)
AI Impact Summary
Kimina-Prover-72B introduces a trainable Test-Time Reinforcement Learning (TTRL) search that decomposes hard formal proofs into reusable lemmas and applies them selectively, reaching a new state-of-the-art 92.2% pass rate on miniF2F. The approach combines large backbones (Qwen2.5-72B and variants) with a lemma-enabled pattern and an error-fixing loop that reads Lean error messages and proposes targeted corrections, which improves sample efficiency. The TTRL search may increase inference latency and compute, so teams should budget for longer runtimes. Working in Lean 4 contexts and using Kimina-Autoformalizer-7b for lemma translation helps maintain reliability and throughput.
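The error-fixing loop described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the Kimina-Prover implementation: `repair_loop`, `CheckResult`, `toy_propose`, and `toy_check` are hypothetical names, and the two toy functions are stand-ins for a model call and a Lean 4 compile step so that the control flow can run on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    """Outcome of one verification attempt (hypothetical structure)."""
    ok: bool
    error: str = ""


def repair_loop(
    statement: str,
    propose: Callable[[str, Optional[str], Optional[str]], str],
    check: Callable[[str], CheckResult],
    max_fixes: int = 3,
) -> Optional[str]:
    """Try one initial proof plus up to `max_fixes` error-guided repairs.

    Returns the first proof accepted by `check`, or None if the budget runs out.
    """
    previous: Optional[str] = None
    error: Optional[str] = None
    for _ in range(max_fixes + 1):
        # The proposer sees the failed attempt and the checker's message, if any.
        proof = propose(statement, previous, error)
        result = check(proof)
        if result.ok:
            return proof
        previous, error = proof, result.error
    return None


# Toy stand-ins (assumptions for illustration only, not a real prover or Lean).
def toy_propose(statement: str, previous: Optional[str], error: Optional[str]) -> str:
    # First attempt uses one tactic; after any error it switches to another.
    return "by simp" if error is None else "by norm_num"


def toy_check(proof: str) -> CheckResult:
    # Pretend only proofs containing `norm_num` compile.
    if "norm_num" in proof:
        return CheckResult(ok=True)
    return CheckResult(ok=False, error="simp made no progress")


if __name__ == "__main__":
    print(repair_loop("theorem t : (2 : Nat) + 2 = 4", toy_propose, toy_check))
```

In a real setup, `check` would wrap a Lean 4 compilation and `propose` would call the prover model. The `max_fixes` budget is what ties this loop to the latency and compute trade-off noted above: each repair round costs one more model call and one more verification.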
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info