
Together Deployments: Custom Metric Autoscaling

AI Impact Summary

Together Deployments now lets users scale applications on custom Prometheus metrics exposed by worker endpoints. This extends autoscaling beyond the standard Together AI metrics, so teams can react to application-specific signals such as vllm:num_requests_running and gain finer control over resource allocation and application performance. The change introduces a new configuration option; for scaling to work well, the selected metric must be monitored.
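To make the idea of scaling on a worker-exposed metric concrete, here is a minimal sketch in Python. It is not Together's implementation or configuration format, which this summary does not describe. It assumes the worker publishes metrics in the Prometheus text exposition format and uses a simple proportional rule (total metric value divided by a per-replica target, rounded up and clamped). The function names, the sample payload, and the target of 5 running requests per replica are all hypothetical.

```python
import math

# Illustrative payload in Prometheus text exposition format (assumed, not captured
# from a real worker). The label and HELP text are placeholders.
SAMPLE = """\
# HELP vllm:num_requests_running Number of requests currently running.
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="demo"} 12.0
vllm:num_requests_waiting{model_name="demo"} 3.0
"""


def read_metric(exposition_text: str, metric_name: str) -> float:
    """Sum every sample of `metric_name` found in the exposition text."""
    total = 0.0
    for raw in exposition_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        if "{" in line:
            # name{labels} value [timestamp]; label values may contain spaces
            name = line[: line.index("{")]
            value_part = line.rpartition("}")[2]
        else:
            # name value [timestamp]
            name, _, value_part = line.partition(" ")
        if name != metric_name:
            continue
        fields = value_part.split()
        if fields:
            total += float(fields[0])
    return total


def desired_replicas(metric_value: float, target_per_replica: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Proportional rule: ceil(value / target), clamped to [min, max]."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(metric_value / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    running = read_metric(SAMPLE, "vllm:num_requests_running")
    # Hypothetical target: 5 running requests per replica, 1 to 10 replicas.
    print(running, desired_replicas(running, 5.0, 1, 10))
```

With the sample payload, 12 running requests against a target of 5 per replica yields 3 replicas. The actual scaling rule, thresholds, and configuration keys used by Together Deployments may differ.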

Affected Systems

Together Deployments, Prometheus

Date: not specified
Change type: capability
Severity: info
