Together AI Fine-Tuning Platform adds Direct Preference Optimization and continued training
AI Impact Summary
Together AI unveils the Fine-Tuning Platform with browser-based fine-tuning, Direct Preference Optimization (DPO), and continued training from prior runs. This enables ongoing refinement of open-weight models such as Llama and Gemma, using preference data and multi-stage training to better reflect user expectations and domain specifics. The update also introduces per-message loss weights, a cosine learning-rate scheduler, faster data preprocessing, and a new pricing model, with proof points from Protege AI and references to models such as ShieldLlama and DeepSeek-R1. Enterprises can iterate on models and either deploy updated checkpoints via Together or download them for local use, accelerating time-to-value while increasing the need for governance around continuous model evolution.
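To make the DPO feature concrete, here is a minimal sketch of the standard Direct Preference Optimization objective (for a single preference pair), not Together's actual implementation; the function name and inputs are illustrative assumptions:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    logp_* are summed token log-probabilities of the chosen/rejected
    responses under the policy being trained; ref_logp_* are the same
    quantities under the frozen reference model. beta scales how hard
    the policy is pushed away from the reference.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)), computed as softplus(-margin); the guard
    # avoids overflow in exp() for very negative margins.
    return math.log1p(math.exp(-margin)) if margin > -30 else -margin

# When the trained policy prefers the chosen response more strongly than
# the reference model does, the margin is positive and the loss shrinks
# below log(2), the value at a zero margin.
loss = dpo_loss(-12.0, -15.0, -13.0, -14.5, beta=0.1)
```

In practice a platform like this would average such a loss over batches of (prompt, chosen, rejected) triples from the uploaded preference dataset; the sketch only shows the per-pair objective.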
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info