AllenAI releases EMO: Pretraining mixture of experts for emergent modularity
AI Impact Summary
AllenAI has released EMO, a mixture-of-experts model pretrained on 1 trillion tokens with a focus on emergent modularity. The architecture pairs 1B active parameters with 14B total parameters and supports selective expert use: activating just 12.5% of the experts retains near full-model performance. Traditional MoE models typically degrade sharply when restricted to smaller expert subsets, so this offers a potential pathway to more efficient and flexible large language model deployment.
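The release summary does not include code, but the core idea of routing tokens through only a fraction of the experts can be shown with a minimal sketch. The sizes below (64 experts, top-8 routing, 512-dimensional hidden states) are illustrative assumptions rather than EMO's actual configuration, and the masking approach is a generic way to restrict routing, not AllenAI's implementation.

```python
# Minimal sketch: top-k expert routing restricted to a subset of experts.
# All sizes are hypothetical; EMO's real configuration is not given here.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SubsetMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=1024, n_experts=64, top_k=8):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k
        # By default every expert is eligible; a subset can be selected later.
        self.register_buffer("expert_mask", torch.ones(n_experts, dtype=torch.bool))

    def restrict_to_experts(self, keep_ids):
        """Route only through the experts in keep_ids (e.g. 12.5% of them).

        The number of kept experts must be at least top_k, or routing fails.
        """
        mask = torch.zeros_like(self.expert_mask)
        mask[list(keep_ids)] = True
        self.expert_mask = mask

    def forward(self, x):                       # x: (batch, seq, d_model)
        logits = self.router(x)                 # (batch, seq, n_experts)
        # Masked-out experts get -inf so softmax assigns them zero weight.
        logits = logits.masked_fill(~self.expert_mask, float("-inf"))
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Nested loops are written for clarity, not efficiency.
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                routed = idx[..., slot] == e    # tokens sent to expert e
                if routed.any():
                    out[routed] += weights[..., slot][routed].unsqueeze(-1) * self.experts[e](x[routed])
        return out


layer = SubsetMoELayer()
layer.restrict_to_experts(range(8))             # keep 8 of 64 experts = 12.5%
print(layer(torch.randn(2, 16, 512)).shape)     # torch.Size([2, 16, 512])
```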
Affected Systems
Business Impact
Organizations can leverage EMO's selective expert usage to reduce computational costs and memory requirements for large language models, enabling deployment in resource-constrained environments.
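As a rough illustration of the memory argument, the back-of-envelope estimate below assumes (hypothetically) that about 13B of the 14B total parameters sit in the expert layers; the summary itself gives only the 1B-active / 14B-total figures and the 12.5% expert fraction.

```python
# Back-of-envelope memory estimate for loading only a fraction of experts.
# The expert/shared parameter split is an assumption for illustration.
total_params = 14e9
expert_params = 13e9                     # hypothetical: most params in experts
shared_params = total_params - expert_params
expert_fraction_kept = 0.125             # 12.5% of experts, per the summary

loaded = shared_params + expert_fraction_kept * expert_params
bytes_fp16 = loaded * 2                  # two bytes per parameter in fp16
print(f"~{loaded / 1e9:.1f}B params loaded, ~{bytes_fp16 / 1e9:.1f} GB in fp16")
# Under these assumptions, roughly 2.6B parameters (~5.3 GB) instead of 14B.
```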
- Date: Not specified
- Change type: Capability
- Severity: Info