Hugging Face: Fine-tuning 20B LLMs with RLHF on a 24GB GPU using TRL and PEFT | SignalBreak