Hugging Face adds Meta Llama 4 Maverick and Scout with Transformers and TGI support
AI Impact Summary
Hugging Face now hosts Meta's Llama 4 Maverick (128 experts, ~400B total parameters) and Llama 4 Scout (16 experts, ~109B total parameters), both natively multimodal mixture-of-experts (MoE) models with tight Transformers and Text Generation Inference (TGI) integration. The models can be hosted, loaded, and deployed directly through transformers and TGI, with quantization options (int4 for Scout, FP8 for Maverick) and the Xet storage backend to speed up uploads and downloads. Use is governed by the Llama 4 Community License, which is presented on the model cards. The MoE architecture and extended context lengths (up to 1M tokens for Maverick Instruct, 10M tokens for Scout) mean that performance and resource requirements differ from those of dense models of similar size. Deployment planning should account for hardware sizing (Scout can fit on a single server-grade GPU, while Maverick typically requires several) and for MoE-specific deployment patterns, both of which affect cost and scalability.
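The hardware-sizing point can be made concrete with a rough weight-memory estimate. The sketch below is a back-of-the-envelope calculation, not a sizing tool: it uses the approximate total parameter counts quoted above, assumes 80 GB of memory per GPU (an illustrative figure, not taken from the release), and counts weights only, ignoring the KV cache, activations, and runtime overhead. Because an MoE model keeps every expert resident in memory, the total parameter count is what drives the footprint. The names `weight_memory_gb` and `min_gpus` are hypothetical helpers introduced for this example.

```python
import math

# Approximate total parameter counts quoted in the summary.
TOTAL_PARAMS = {"scout": 109e9, "maverick": 400e9}

# Bytes needed to store one weight at each precision.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(model: str, precision: str) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return TOTAL_PARAMS[model] * BYTES_PER_PARAM[precision] / 1e9


def min_gpus(model: str, precision: str, gpu_memory_gb: float = 80.0) -> int:
    """Lower bound on GPU count from weights alone.

    gpu_memory_gb defaults to 80, an assumed per-GPU capacity.
    Real deployments need extra headroom for KV cache and activations.
    """
    return math.ceil(weight_memory_gb(model, precision) / gpu_memory_gb)


if __name__ == "__main__":
    for model, precision in [
        ("scout", "int4"),
        ("scout", "bf16"),
        ("maverick", "fp8"),
        ("maverick", "bf16"),
    ]:
        gb = weight_memory_gb(model, precision)
        gpus = min_gpus(model, precision)
        print(f"{model:9s} {precision:5s} ~{gb:6.1f} GB weights, >= {gpus} GPU(s)")
```

Under these assumptions, Scout at int4 needs roughly 54.5 GB for weights and fits on one 80 GB GPU, while Maverick at FP8 needs roughly 400 GB and therefore at least five such GPUs before any allowance for long-context KV cache.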
Affected Systems
- Date: not specified
- Change type: capability
- Severity: medium