Google Gemma 3 brings open-weight multimodal LLMs with 128k context and 140+ languages
AI Impact Summary
Google Gemma 3 introduces open-weight LLMs in four sizes (1B, 4B, 12B, 27B) with multimodal input support and context windows up to 128k tokens. It uses SigLIP for image encoding and implements a pan-and-scan scheme to handle larger images, along with distinct attention mechanisms for text and image inputs. The release, with tight Hugging Face transformers integration, enables rapid experimentation for multilingual and multimodal workflows, but teams must plan deployment strategies and hosting costs based on the desired modality and context length.
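The pan-and-scan idea mentioned above can be illustrated with a small sketch. This is not Gemma 3's actual crop-selection code; it only shows the general pattern of covering an oversized image with fixed-size windows (the 896-pixel window size and the inward-shifted tiling are assumptions for illustration), each of which would then be encoded separately by the vision tower alongside a resized global view.

```python
def pan_and_scan_crops(width, height, crop=896):
    """Return (left, top, right, bottom) boxes covering the image with
    fixed-size windows; windows are spaced evenly and shifted inward so
    every crop stays fully inside the image.

    Illustrative sketch only -- the crop size and selection heuristics
    in the real model may differ.
    """
    def starts(length):
        # One window suffices if the image already fits.
        if length <= crop:
            return [0]
        # Number of windows needed to cover `length`, then spread their
        # start offsets evenly from 0 to (length - crop).
        n = -(-(length - crop) // crop) + 1  # ceil division, plus the first window
        step = (length - crop) / (n - 1)
        return [round(i * step) for i in range(n)]

    return [
        (x, y, min(x + crop, width), min(y + crop, height))
        for y in starts(height)
        for x in starts(width)
    ]
```

For example, a 2000x1000 image yields a 3x2 grid of 896-pixel crops, while an image smaller than the window is returned as a single crop.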
Affected Systems
- Date: not specified
- Change type: capability
- Severity: medium