Gemma 4 Frontier multimodal models released on Hugging Face for on-device use
AI Impact Summary
Google DeepMind's Gemma 4 family is now available on Hugging Face under an Apache 2.0 license, enabling on-device multimodal inference across image, text, and audio inputs. The lineup includes E2B, E4B, 31B dense, and 26B MoE variants with long-context support (128K–256K tokens) and efficiency features such as Shared KV Cache and Per-Layer Embeddings. The models integrate with transformers, llama.cpp, MLX, WebGPU, and Rust, enabling deployment scenarios from edge devices to lightweight inference servers for OCR, speech-to-text, and object detection.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: medium