Google Gemma open LLM family released; supports Vertex AI, GKE, and Hugging Face deployment
AI Impact Summary
Google unveils Gemma, a family of open LLMs at 2B and 7B parameter sizes with base and instruct variants and an 8K context window, designed to run on consumer GPUs and CPUs without quantization. The rollout includes full Hugging Face integration, four open-access models on the Hub, and deployment pathways through Vertex AI, GKE, and Hugging Face Inference Endpoints, plus tooling around transformers, PEFT, and 4-bit quantization. This broad accessibility lets teams experiment with and deploy open LLMs at smaller scales, but it requires careful evaluation of performance, licensing, data governance, and the tradeoffs between on-device and cloud deployment.
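As a rough illustration of the transformers plus 4-bit quantization tooling mentioned above, the sketch below builds the loading options for a Gemma checkpoint and wraps generation in a helper. The Hub id `google/gemma-7b-it`, the keyword arguments, and the helper names are assumptions for illustration, not details from the announcement; calling `generate_reply` downloads multi-gigabyte weights and needs `transformers`, `accelerate`, and `bitsandbytes` installed.

```python
# Illustrative sketch only: model id and load options are assumptions,
# not taken from the announcement text.
MODEL_ID = "google/gemma-7b-it"  # assumed Hub id for the 7B instruct variant


def quantized_load_kwargs(load_in_4bit: bool = True) -> dict:
    """Build keyword arguments for from_pretrained, optionally with 4-bit loading."""
    kwargs = {"device_map": "auto"}  # let accelerate place layers on available devices
    if load_in_4bit:
        kwargs["load_in_4bit"] = True  # requires the bitsandbytes package
    return kwargs


def generate_reply(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model (downloads weights on first call) and generate a completion."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # heavy import kept local

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, **quantized_load_kwargs())
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Keeping the 4-bit path behind a flag reflects the tradeoff the summary raises: full-precision weights for quality versus quantized weights for consumer-hardware deployment.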
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info