Info · Capability

Hugging Face Inference for PROs: exclusive endpoints and higher rate limits for curated models

AI Impact Summary

Hugging Face is launching Inference for PROs, which gives PRO users access to exclusive HTTP endpoints for a curated set of models, with higher rate limits than the free tier. This enables rapid prototyping against top-performing models (Meta Llama 3 Instruct, Mixtral, Nous Hermes 2 Mixtral, Zephyr, Llama 2 Chat, Mistral 7B Instruct, Code Llama variants, Stable Diffusion XL (3B UNet), Bark) without deploying infrastructure. The offering is intended for experimentation; Inference Endpoints remains the recommended option for production workloads. Access requires authenticating as a PRO user. This could drive higher PRO-tier adoption and increased API call volumes from developers evaluating multiple models quickly.
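To make the workflow concrete, here is a minimal sketch of an authenticated call to one of the curated models, using only the Python standard library. The summary above does not give the endpoint URL, payload shape, or model identifier, so the base URL, the `inputs`/`parameters` JSON fields, the model ID, and the helper names `build_request` and `generate` are all assumptions for illustration and should be checked against the current Hugging Face documentation.

```python
import json
import urllib.request

# Assumed base URL of the hosted Inference API; not stated in the summary above.
API_BASE = "https://api-inference.huggingface.co/models"


def build_request(model_id, prompt, token, max_new_tokens=64):
    """Build (but do not send) a POST request for a text-generation model.

    `token` is a Hugging Face access token. The higher rate limits are
    described as tied to PRO authentication, so the token is assumed to
    belong to a PRO account.
    """
    # Assumed payload shape for text-generation models.
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        url=f"{API_BASE}/{model_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(model_id, prompt, token, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_request(model_id, prompt, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (performs a network call, so it is left commented out):
# import os
# result = generate(
#     "meta-llama/Meta-Llama-3-8B-Instruct",  # assumed model ID
#     "Explain rate limiting in one sentence.",
#     os.environ["HF_TOKEN"],                  # hypothetical env var name
# )
# print(result)
```

Because only the model ID in the URL changes between calls, the same helper can be looped over several curated models to compare outputs. A rate-limited request would surface here as a `urllib.error.HTTPError`, which a caller could catch and retry after a delay.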

Affected Systems

Hugging Face Inference API, Inference Endpoints

Date: not specified
Change type: capability
Severity: info
