Cohere models added to Hugging Face Inference Providers — serverless inference and tool-use support
AI Impact Summary
Cohere models are now available through an Inference Provider on the Hugging Face Hub, enabling serverless inference via Hub routing to Cohere endpoints. The lineup includes CohereLabs/c4ai-command-a-03-2025 (256k context; RAG with verifiable citations; multilingual; tool use), CohereLabs/aya-expanse-32b (multilingual, 128k context), CohereLabs/c4ai-command-r7b-12-2024 (128k context; low-cost, low-latency; multilingual; tool use), and CohereLabs/aya-vision-32b (a 32B vision-language model). The integration expands enterprise deployment options, simplifies access through the Inference Providers workflow, and enables agentic tooling directly from Cohere models; developers must route requests through the cohere provider using the Hugging Face InferenceClient or a compatible client.
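The routing described above can be sketched as follows. This is a minimal example, assuming the huggingface_hub InferenceClient with its provider parameter (available in recent releases) and an HF_TOKEN environment variable with Inference Providers access; the prompt text is illustrative.

```python
import os

# Model ID and provider name are taken from the announcement above.
MODEL = "CohereLabs/c4ai-command-a-03-2025"
PROVIDER = "cohere"

# OpenAI-style chat messages accepted by InferenceClient.
messages = [
    {"role": "user", "content": "Summarize the benefits of a 256k-token context window."},
]

def run() -> str:
    """Route a chat completion through Hugging Face's serverless routing
    to Cohere's endpoint. Requires `pip install huggingface_hub` and a
    valid HF_TOKEN; returns the assistant's reply text."""
    from huggingface_hub import InferenceClient  # imported lazily so the sketch loads without the package

    client = InferenceClient(provider=PROVIDER, api_key=os.environ["HF_TOKEN"])
    completion = client.chat.completions.create(model=MODEL, messages=messages)
    return completion.choices[0].message.content

if __name__ == "__main__" and os.environ.get("HF_TOKEN"):
    print(run())
```

Because the provider is selected per-client, switching to another listed model (for example CohereLabs/c4ai-command-r7b-12-2024 for lower latency) only requires changing MODEL.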
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info