TRL fix: Average log-likelihood loss for IPO aligns with DPO on 7B models

AI Impact Summary

TRL's IPO loss implementation was incorrect: it summed the log-likelihoods of each completion instead of averaging them. The PR fixes this by averaging the log-likelihoods, which restores fidelity to the IPO paper and brings IPO results in line with DPO, while both outperform KTO in paired-preference tests. The reported experiments used OpenHermes-2.5-Mistral-7B and Zephyr-7b-beta-sft, trained on the orca_dpo_pairs and ultrafeedback-binarized datasets and evaluated with MT-Bench. With the fix, IPO performs on par with DPO once hyperparameters such as beta are tuned.
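To make the summed-versus-averaged distinction concrete, here is a minimal pure-Python sketch. It is not TRL's code: the function names, the toy per-token log-probabilities, and the beta value are all hypothetical, and the squared-margin form with target 1/(2*beta) is assumed from the IPO paper's objective.

```python
from typing import Sequence


def sequence_logp(token_logps: Sequence[float], average: bool) -> float:
    """Collapse per-token log-probs into one score per completion."""
    total = sum(token_logps)
    # Averaging removes the dependence on completion length.
    return total / len(token_logps) if average else total


def ipo_loss(
    policy_chosen: Sequence[float],
    policy_rejected: Sequence[float],
    ref_chosen: Sequence[float],
    ref_rejected: Sequence[float],
    beta: float,
    average: bool = True,
) -> float:
    """Illustrative IPO loss for one preference pair (hypothetical helper)."""
    pi_logratio = sequence_logp(policy_chosen, average) - sequence_logp(policy_rejected, average)
    ref_logratio = sequence_logp(ref_chosen, average) - sequence_logp(ref_rejected, average)
    margin = pi_logratio - ref_logratio
    # Squared distance between the margin and the target 1 / (2 * beta).
    return (margin - 1.0 / (2.0 * beta)) ** 2


# Toy per-token log-probs (made up): the chosen completion has 4 tokens,
# the rejected one has 2, so summing and averaging disagree.
policy_chosen = [-0.5, -0.5, -0.5, -0.5]
policy_rejected = [-1.0, -1.0]
ref_chosen = [-1.0, -1.0, -1.0, -1.0]
ref_rejected = [-1.0, -1.0]

averaged = ipo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.5, average=True)
summed = ipo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.5, average=False)
print(averaged, summed)
```

On this toy pair the two variants produce different losses only because the completions differ in length. With summed log-likelihoods the margin grows with token count, so the fixed 1/(2*beta) target means something different for long and short completions.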

Affected Systems

🤗 TRL, OpenHermes-2.5-Mistral-7B

Date: not specified
Change type: capability
Severity: info
