Together AI: Prompt Caching Enabled by Default for Dedicated Endpoints
Action Required
No action is required. Prompt caching is now enabled by default on Dedicated Endpoints, so applications using them benefit from improved performance and reduced costs automatically.
AI Impact Summary
Together AI is deprecating the `disable_prompt_cache` parameter, the `--no-prompt-cache` CLI flag, and related SDK parameters for Dedicated Endpoints, which effectively enables prompt caching by default. The change aims to improve performance and reduce costs. Existing users do not need to do anything to benefit; the only cleanup is to remove the deprecated parameters from API requests and CLI commands that still pass them. This changes the default configuration of Dedicated Endpoints.
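The cleanup described above could be sketched as follows. This is a minimal illustration, not Together AI's own tooling: only the names `disable_prompt_cache` and `--no-prompt-cache` come from the notice, while the helper functions, the payload shape, and the sample command are hypothetical placeholders.

```python
# Deprecated names taken from the notice; everything else is illustrative.
DEPRECATED_REQUEST_KEYS = {"disable_prompt_cache"}
DEPRECATED_CLI_FLAGS = {"--no-prompt-cache"}


def strip_deprecated_params(payload: dict) -> dict:
    """Return a copy of a request payload without the deprecated cache keys."""
    return {k: v for k, v in payload.items() if k not in DEPRECATED_REQUEST_KEYS}


def strip_deprecated_flags(argv: list) -> list:
    """Return a copy of a CLI argument list without the deprecated cache flags."""
    return [arg for arg in argv if arg not in DEPRECATED_CLI_FLAGS]


if __name__ == "__main__":
    # Hypothetical payload: field names other than the deprecated one are placeholders.
    payload = {"model": "example-model", "disable_prompt_cache": True}
    print(strip_deprecated_params(payload))

    # Hypothetical command line: only the deprecated flag is taken from the notice.
    argv = ["some-command", "--no-prompt-cache", "--other-flag"]
    print(strip_deprecated_flags(argv))
```

Both helpers return new objects rather than mutating their input, so they can be dropped in front of existing request-building or command-building code without side effects.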
Affected Systems
- Date: not specified
- Change type: deprecation
- Severity: high