Hugging Face: Co-located vLLM in TRL enables shared-GPU training and inference for GRPO | SignalBreak