Ettin Suite: SoTA Paired Encoders and Decoders
AI Impact Summary
The Ettin Suite introduces the first apples-to-apples comparison of encoder-only and decoder-only models (17M–1B parameters), training both architectures on identical open data (2T tokens) with a recipe derived from ModernBERT and a three-phase schedule (pre-training, context extension to 8K, then decay). This matters for engineering teams because it ships publicly available weights (e.g., jhu-clsp/ettin-encoder-150m) and a reproducible setup for benchmarking classifiers, retrievers, and generators under the same data and training regime, enabling direct cross-architecture evaluation. Early results indicate that encoders excel at classification and retrieval while decoders dominate generation at larger scales, informing architecture choice and deployment strategy. The open data and training recipes lower the barrier to experimentation and potential on-device use, supporting faster iteration across product tasks.
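For retrieval-style benchmarking, sentence embeddings are commonly derived from an encoder's token-level outputs by masked mean pooling. The sketch below shows that pooling step on dummy hidden states with NumPy; the model name jhu-clsp/ettin-encoder-150m comes from the summary above, but the pooling choice and shapes here are illustrative assumptions, not a method specified by the source.

```python
import numpy as np

def masked_mean_pool(hidden: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padding positions.

    hidden: (batch, seq_len, dim) token-level outputs from an encoder
            such as jhu-clsp/ettin-encoder-150m (dummy data here).
    mask:   (batch, seq_len) attention mask, 1 for real tokens, 0 for padding.
    """
    mask = mask[..., None].astype(hidden.dtype)     # (batch, seq_len, 1)
    summed = (hidden * mask).sum(axis=1)            # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # (batch, 1), avoid div by 0
    return summed / counts                          # (batch, dim)

# Dummy stand-in for encoder outputs: batch of 2, seq_len 4, dim 8.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(2, 4, 8))
mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]])
emb = masked_mean_pool(hidden, mask)
print(emb.shape)  # (2, 8)
```

The same pooling applies unchanged to real model outputs, so padded sequences in a batch do not skew the resulting embeddings.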
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info