Together AI Fine-Tuning Platform adds Direct Preference Optimization and continued training
AI Impact Summary
Together AI unveils the Fine-Tuning Platform with browser-based fine-tuning, Direct Preference Optimization (DPO), and continued training from prior runs. This enables ongoing refinement of open-weight models such as Llama and Gemma, using preference data and multi-stage training to better reflect user expectations and domain specifics. The update also introduces per-message loss weights, a cosine learning-rate scheduler, faster data preprocessing, and a new pricing model, with proof points from Protege AI and references to models such as ShieldLlama and DeepSeek-R1. Enterprises can iterate on models and either deploy updated checkpoints via Together or download them for local use, accelerating time-to-value while increasing the need for governance around continuous model evolution.
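To make the DPO feature concrete, here is a minimal sketch of the standard Direct Preference Optimization objective (for a single preference pair), not Together's actual implementation; the function name and inputs are illustrative assumptions:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    logp_* are summed token log-probabilities of the chosen/rejected
    responses under the policy being trained; ref_logp_* are the same
    quantities under the frozen reference model. beta scales how hard
    the policy is pushed away from the reference.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)), computed as softplus(-margin); the guard
    # avoids overflow in exp() for very negative margins.
    return math.log1p(math.exp(-margin)) if margin > -30 else -margin

# When the trained policy prefers the chosen response more strongly than
# the reference model does, the margin is positive and the loss shrinks
# below log(2), the value at a zero margin.
loss = dpo_loss(-12.0, -15.0, -13.0, -14.5, beta=0.1)
```

In practice a platform like this would average such a loss over batches of (prompt, chosen, rejected) triples from the uploaded preference dataset; the sketch only shows the per-pair objective.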
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info