MoAA: Mixture-of-Agents Alignment improves Llama-3.1 performance
AI Impact Summary
The Mixture-of-Agents Alignment (MoAA) technique leverages the collective intelligence of open-source LLMs to post-train smaller models, specifically Llama-3.1-8B-Instruct and Gemma-2-9B-it, achieving better results than post-training on data generated by GPT-4o. The approach first fine-tunes on synthetic data produced by a Mixture-of-Agents (MoA) ensemble, then applies Direct Preference Optimization (DPO) with a MoAA-based reward model. The resulting models outperform much larger models such as Llama-3.1-70B-Instruct, demonstrating a cost-effective alternative to relying on proprietary models like GPT-4o.
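The summary describes a two-stage pipeline: an MoA ensemble synthesizes responses used as SFT targets, and the ensemble then acts as a reward model to build preference pairs for DPO. Below is a minimal sketch of that flow under stated assumptions; the `generate` and `moaa_reward` helpers and the model names in `PROPOSERS`/`AGGREGATOR` are illustrative placeholders, not the paper's actual implementation or ensemble.

```python
# Minimal sketch of the two-stage MoAA pipeline (assumed structure, not the
# reference implementation). `generate()` is a placeholder for any call to a
# locally hosted open-source LLM (e.g. via vLLM or transformers).

from typing import List, Tuple

PROPOSERS = ["open-model-a", "open-model-b", "open-model-c"]  # assumed proposer ensemble
AGGREGATOR = "open-model-aggregator"                          # assumed aggregator model


def generate(model: str, prompt: str) -> str:
    """Placeholder: route the prompt to your inference backend."""
    raise NotImplementedError("wire this to a real LLM serving stack")


def moaa_sft_example(prompt: str) -> Tuple[str, str]:
    """Stage 1 (MoAA-SFT): proposers answer independently, then an aggregator
    merges the drafts into one synthetic response used as an SFT target."""
    drafts: List[str] = [generate(m, prompt) for m in PROPOSERS]
    agg_prompt = (
        prompt
        + "\n\nCandidate responses:\n"
        + "\n---\n".join(drafts)
        + "\n\nSynthesize the best possible single response."
    )
    return prompt, generate(AGGREGATOR, agg_prompt)


def moaa_reward(prompt: str, response: str) -> float:
    """Placeholder reward: average a 1-10 rating from the ensemble members."""
    judge_prompt = (
        f"Rate from 1 to 10 how well this response answers the prompt.\n"
        f"Prompt: {prompt}\nResponse: {response}\nScore:"
    )
    scores = [float(generate(m, judge_prompt).strip()) for m in PROPOSERS]
    return sum(scores) / len(scores)


def moaa_dpo_pair(policy_model: str, prompt: str, n: int = 4) -> Tuple[str, str, str]:
    """Stage 2 (MoAA-DPO): sample candidates from the SFT'd policy, score them
    with the ensemble-based reward, and keep best/worst as a chosen/rejected
    preference pair for DPO training."""
    candidates = [generate(policy_model, prompt) for _ in range(n)]
    ranked = sorted(candidates, key=lambda c: moaa_reward(prompt, c))
    return prompt, ranked[-1], ranked[0]  # (prompt, chosen, rejected)
```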
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info