Ettin Suite: SoTA Paired Encoders and Decoders
AI Impact Summary
The Ettin Suite introduces the first apples-to-apples comparison of encoder-only and decoder-only models (17M–1B parameters), training both architectures on identical open data (2T tokens) with a recipe derived from ModernBERT and a three-phase schedule (pre-training, context extension to 8K, then decay). This matters for engineering teams because it ships publicly available weights (e.g., jhu-clsp/ettin-encoder-150m) and a reproducible setup for benchmarking classifiers, retrievers, and generators under the same data and training regime, enabling direct cross-architecture evaluation. Early results indicate that encoders excel at classification and retrieval while decoders dominate generation at larger scales, informing architecture choice and deployment strategy. The open data and training recipes lower the barrier to experimentation and potential on-device use, supporting faster iteration across product tasks.
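For retrieval-style benchmarking, sentence embeddings are commonly derived from an encoder's token-level outputs by masked mean pooling. The sketch below shows that pooling step on dummy hidden states with NumPy; the model name jhu-clsp/ettin-encoder-150m comes from the summary above, but the pooling choice and shapes here are illustrative assumptions, not a method specified by the source.

```python
import numpy as np

def masked_mean_pool(hidden: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padding positions.

    hidden: (batch, seq_len, dim) token-level outputs from an encoder
            such as jhu-clsp/ettin-encoder-150m (dummy data here).
    mask:   (batch, seq_len) attention mask, 1 for real tokens, 0 for padding.
    """
    mask = mask[..., None].astype(hidden.dtype)     # (batch, seq_len, 1)
    summed = (hidden * mask).sum(axis=1)            # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # (batch, 1), avoid div by 0
    return summed / counts                          # (batch, dim)

# Dummy stand-in for encoder outputs: batch of 2, seq_len 4, dim 8.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(2, 4, 8))
mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]])
emb = masked_mean_pool(hidden, mask)
print(emb.shape)  # (2, 8)
```

The same pooling applies unchanged to real model outputs, so padded sequences in a batch do not skew the resulting embeddings.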
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info