Fine-Tuning Gemma Models in Hugging Face with LoRA and 4-bit QLoRA
AI Impact Summary
Open Gemma models (2B and 7B) are now accessible on Hugging Face with support for parameter-efficient fine-tuning (PEFT) using LoRA and QLoRA, enabling adaptation without updating the full set of model weights. The workflow leverages Hugging Face Transformers and the PEFT library, with 4-bit quantization via bitsandbytes and acceleration on GPU and TPU through PyTorch/XLA FSDP. Access to the model artifacts requires accepting the license consent form on the model page, and deployment paths include Vertex Model Garden and Google Kubernetes Engine for production use. This combination lowers memory and compute barriers, speeding domain adaptation and experimentation for enterprise NLP tasks.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info