Google Gemma 3 brings open-weight multimodal LLMs with 128k context and 140+ languages
AI Impact Summary
Google Gemma 3 introduces open-weight LLMs in four sizes (1B, 4B, 12B, 27B) with multimodal input support and context windows up to 128k tokens. It uses SigLIP for image encoding and implements a pan-and-scan scheme to handle larger images, along with distinct attention mechanisms for text and image inputs. The release, with tight Hugging Face transformers integration, enables rapid experimentation for multilingual and multimodal workflows, but teams must plan deployment strategies and hosting costs based on the desired modality and context length.
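The pan-and-scan idea mentioned above can be illustrated with a small sketch. This is not Gemma 3's actual crop-selection code; it only shows the general pattern of covering an oversized image with fixed-size windows (the 896-pixel window size and the inward-shifted tiling are assumptions for illustration), each of which would then be encoded separately by the vision tower alongside a resized global view.

```python
def pan_and_scan_crops(width, height, crop=896):
    """Return (left, top, right, bottom) boxes covering the image with
    fixed-size windows; windows are spaced evenly and shifted inward so
    every crop stays fully inside the image.

    Illustrative sketch only -- the crop size and selection heuristics
    in the real model may differ.
    """
    def starts(length):
        # One window suffices if the image already fits.
        if length <= crop:
            return [0]
        # Number of windows needed to cover `length`, then spread their
        # start offsets evenly from 0 to (length - crop).
        n = -(-(length - crop) // crop) + 1  # ceil division, plus the first window
        step = (length - crop) / (n - 1)
        return [round(i * step) for i in range(n)]

    return [
        (x, y, min(x + crop, width), min(y + crop, height))
        for y in starts(height)
        for x in starts(width)
    ]
```

For example, a 2000x1000 image yields a 3x2 grid of 896-pixel crops, while an image smaller than the window is returned as a single crop.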
Affected Systems
- Date: not specified
- Change type: capability
- Severity: medium