Ettin Suite: SoTA Paired Encoder and Decoder Models Trained with Identical Data and Recipe
AI Impact Summary
Ettin Suite introduces state-of-the-art paired encoder-only and decoder-only models trained with identical data, model sizes, and training recipes, enabling apples-to-apples comparisons across architectures. The project applies the ModernBERT recipe at six sizes (17M–1B parameters) with a three-phase training schedule (1.7T tokens of pre-training, 250B of context extension, 100B of decay) on open, reproducible data sources, producing encoder models that outperform ModernBERT and decoder models that outperform Llama 3.2 and SmolLM2. The controlled setup also surfaces architecture-specific advantages and shows that cross-objective pretraining is not universally beneficial. Public artifacts such as jhu-clsp/ettin-encoder-150m are available for quick adoption.
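As a minimal sketch of adopting a released checkpoint, the snippet below loads the jhu-clsp/ettin-encoder-150m encoder with Hugging Face `transformers` and mean-pools its hidden states into a sentence embedding. The input text and the mean-pooling choice are illustrative assumptions, not prescribed by the release.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load the released encoder checkpoint (ModernBERT-style architecture).
tokenizer = AutoTokenizer.from_pretrained("jhu-clsp/ettin-encoder-150m")
model = AutoModel.from_pretrained("jhu-clsp/ettin-encoder-150m")
model.eval()

# Illustrative input; any short text works the same way.
texts = ["Paired encoder/decoder training enables fair comparisons."]
batch = tokenizer(texts, padding=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq_len, hidden_dim)

# Mean-pool over non-padding tokens to get one embedding per input.
mask = batch["attention_mask"].unsqueeze(-1).float()
emb = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(emb.shape)  # one embedding row per input text
```

A recent `transformers` version is assumed, since the ModernBERT architecture is only supported in newer releases.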
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info