Hugging Face: TRL Preference Optimization for Idefics2-8b VLMs — Quantization & LoRA | SignalBreak