Hugging Face launches Inference Providers on the Hub
AI Impact Summary
Hugging Face is launching a new integration with serverless inference providers like fal, Replicate, Sambanova, and Together AI, directly on the Hub’s model pages and client SDKs. This allows developers to easily explore and prototype serverless inference without managing dedicated infrastructure, offering a streamlined workflow for model experimentation and deployment. The introduction of a routing proxy simplifies API calls and billing, aligning with standard OpenAI practices while leveraging a diverse range of provider capabilities.
Affected Systems
- Date
- Date not specified
- Change type
- capability
- Severity
- info