Gemma 4 Frontier multimodal models released on Hugging Face for on-device use
AI Impact Summary
Google DeepMind's Gemma 4 family is now available on Hugging Face under an Apache 2.0 license, enabling on-device multimodal inference across image, text, and audio inputs. The lineup includes E2B, E4B, 31B dense, and 26B MoE variants with long-context support (128K–256K tokens) and efficiency features such as Shared KV Cache and Per-Layer Embeddings. The models integrate with transformers, llama.cpp, MLX, WebGPU, and Rust, enabling deployment scenarios from edge devices to lightweight inference servers for OCR, speech-to-text, and object detection.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: medium