Hugging Face adds serverless Inference Providers to Hub model pages: fal-ai, Replicate, SambaNova, Together AI

AI Impact Summary

Hugging Face is introducing a unified serverless inference layer by adding fal-ai, Replicate, SambaNova, and Together AI as integrated providers on model pages and in the JS and Python SDKs. Developers can run inference against any of these providers through a single client interface, authenticating either with their own provider key or with a Hugging Face token that routes the request through Hugging Face, without rewriting client code. The routing proxy at router.huggingface.co/{provider} and the accompanying code samples cover cross-provider usage, billing flows, and key management. Billing includes HF credits for PRO users and a free quota for signed-in users, so the choice of authentication route affects both usage patterns and cost planning.
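As a rough illustration of the routing described above, the following Python sketch builds the proxy URL for a provider and decides which of the two authentication routes applies. Only the router.huggingface.co/{provider} pattern comes from the summary. The provider identifiers, the function names `router_url` and `auth_route`, and the rule that a provider key takes precedence over a Hugging Face token are assumptions made for this example. It performs no network calls and is not the SDK's implementation.

```python
from typing import Optional

# Base of the routing proxy named in the summary.
ROUTER_BASE = "https://router.huggingface.co"

# Assumed provider identifiers; the exact strings used by the Hub may differ.
PROVIDER_SLUGS = {"fal-ai", "replicate", "sambanova", "together"}


def router_url(provider: str) -> str:
    """Return the proxy URL for a provider, following router.huggingface.co/{provider}."""
    if provider not in PROVIDER_SLUGS:
        raise ValueError(f"unknown provider: {provider!r}")
    return f"{ROUTER_BASE}/{provider}"


def auth_route(hf_token: Optional[str] = None,
               provider_key: Optional[str] = None) -> str:
    """Pick the authentication route for a request (hypothetical helper).

    Assumption: a provider key, when present, wins over a Hugging Face token.
    """
    if provider_key:
        return "provider-key"   # caller supplies the provider's own key
    if hf_token:
        return "routed-by-hf"   # request is routed through Hugging Face
    raise ValueError("supply either a provider key or a Hugging Face token")


if __name__ == "__main__":
    print(router_url("together"))
    print(auth_route(hf_token="hf_example_token"))
```

Keeping the URL construction and the route selection in separate functions means a client can switch providers by changing one string while the authentication logic stays untouched, which is the "no client rewrite" property the summary highlights.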

Affected Systems

fal-ai, Replicate

Date: not specified
Change type: capability
Severity: info
