NeurIPS 2025 E2LM Competition: Early Training Evaluation of Language Models

AI Impact Summary

This competition, hosted by the Technology Innovation Institute (TII), focuses on developing new benchmarks for evaluating Large Language Models (LLMs) in the early stages of training. The goal is to capture meaningful signals during the initial training phases, particularly for scientific knowledge, where existing benchmarks fail to do so. Participants use the lm-evaluation-harness library and submit solutions through a Hugging Face Space, where a leaderboard applies automated scoring based on signal quality, ranking consistency, and compliance with scientific knowledge.
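To give a rough sense of what a ranking-consistency criterion could measure, here is a minimal sketch. It assumes that consistency is expressed as Kendall's tau between the model ranking a benchmark produces at an early checkpoint and the ranking it produces later in training. The summary does not state the competition's actual formula, so the function name `kendall_tau` and all scores below are hypothetical illustrations, not the official scoring code.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two equal-length score lists (no tie correction)."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two models to compare rankings")

    concordant = 0
    discordant = 0
    # Compare every pair of models: a pair is concordant when both
    # score lists order the two models the same way.
    for i, j in combinations(range(n), 2):
        product = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1

    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical benchmark scores for three models (illustrative values only).
early_checkpoint = [0.26, 0.31, 0.29]
late_checkpoint = [0.41, 0.58, 0.52]

tau = kendall_tau(early_checkpoint, late_checkpoint)
print(f"ranking consistency (Kendall tau): {tau}")
```

Under this assumption, a value near 1 means the benchmark already orders models early in training the way it does later on, while a value near 0 means the early ordering carries little information.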

Affected Systems

Hugging Face, lm-evaluation-harness

Date: not specified
Change type: capability
Severity: medium
