Hugging Face Inference for PROs: exclusive endpoints and higher rate limits for curated models
AI Impact Summary
Hugging Face is launching Inference for PROs, which gives PRO users access to exclusive HTTP endpoints for a curated set of models, with higher rate limits than the free tier. This enables rapid prototyping against top-performing models (Meta Llama 3 Instruct, Mixtral, Nous Hermes 2 Mixtral, Zephyr, Llama 2 Chat, Mistral 7B Instruct, Code Llama variants, Stable Diffusion XL (3B UNet), Bark) without deploying infrastructure. The offering is intended for experimentation; Inference Endpoints remain the recommended option for production workloads. Access requires authenticating as a PRO user. This could drive higher PRO-tier adoption and increased API call volumes from developers evaluating multiple models quickly.
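To make the access pattern concrete, here is a minimal sketch of an authenticated HTTP call to one of the curated models. It is an illustration under stated assumptions, not the documented client: the base URL, the `inputs`/`parameters` payload shape, and the helper names `build_request` and `generate` do not come from this summary, and the endpoint path in particular should be checked against the current Hugging Face documentation.

```python
import json
import urllib.request

# Assumption: base URL of the hosted Inference API; verify against current docs.
API_BASE = "https://api-inference.huggingface.co/models"


def build_request(model_id, prompt, token, max_new_tokens=64):
    """Build (but do not send) an authenticated POST request for a text model.

    `token` is a user access token; the higher PRO rate limits would apply
    only if the token belongs to a PRO account.
    """
    # Assumption: text-generation style payload with an optional parameters dict.
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        f"{API_BASE}/{model_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(model_id, prompt, token):
    """Send the request and return the decoded JSON response."""
    request = build_request(model_id, prompt, token)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `generate("meta-llama/Meta-Llama-3-8B-Instruct", "Write a haiku about rate limits.", token)` would then exercise one of the listed models; the model ID here is an illustrative choice. Keeping request construction separate from sending makes it easy to switch between models during evaluation without touching the network code.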
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info