Hugging Face and Google Cloud deepen partnership to run open models on Vertex AI, GKE, and Cloud Run
AI Impact Summary
Google Cloud and Hugging Face are broadening their collaboration to host and serve open models across Vertex AI Model Garden, GKE, Cloud Run, and Compute Engine, with a new CDN Gateway that caches Hugging Face repositories. The integration enables TPU-accelerated inference and makes private enterprise models easier to host securely on Google Cloud, while preserving Hugging Face Inference Endpoints as a deployment option. Expect faster time-to-first-token, along with improved model governance backed by security tooling from Google Threat Intelligence, Mandiant, and VirusTotal. The partnership shifts hosting and deployment of open-model workloads toward Google Cloud and expands the options available to customers building AI on open models.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info