Liger GRPO training with Qwen2.5-0.5B-Instruct encounters shape mismatch
AI Impact Summary
A shape mismatch occurs during Liger GRPO training of Qwen/Qwen2.5-0.5B-Instruct with DeepSpeed ZeRO-3 and bf16 precision. The error is raised in the forward pass inside the `compute_liger_loss` function, where tensor shapes fail to align, which suggests a problem in the data or the model configuration. It may also point to a defect in the `LigerFusedLinearGRPOFunction` implementation. Resolving it requires checking the data preprocessing, the model architecture, and the training loop to find where the incompatible shapes originate.
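As a starting point for that investigation, the sketch below checks the shapes a fused linear loss would typically receive. It is not taken from the Liger or TRL code: the helper name `check_fused_linear_shapes` and the assumed layout (hidden states as `(batch, seq, hidden)`, output projection weight as `(vocab, hidden)`, token ids as `(batch, seq)`) are hypothetical and should be verified against the actual arguments passed to `compute_liger_loss`. The empty-weight case is one possible symptom under ZeRO-3, where a partitioned parameter can appear as an empty placeholder; it is an assumption, not a confirmed diagnosis.

```python
from typing import List, Tuple

Shape = Tuple[int, ...]


def check_fused_linear_shapes(hidden: Shape, weight: Shape, token_ids: Shape) -> List[str]:
    """Return human-readable shape problems; an empty list means the shapes agree.

    Hypothetical helper: the expected layouts are assumptions, not the library's contract.
    """
    problems: List[str] = []

    if len(hidden) != 3:
        problems.append(f"hidden states should be (batch, seq, hidden), got {hidden}")
    if len(weight) != 2:
        # Assumption: a sharded or placeholder weight would show up here, e.g. shape (0,).
        problems.append(f"projection weight should be (vocab, hidden), got {weight}")
    if len(hidden) == 3 and len(weight) == 2 and hidden[-1] != weight[-1]:
        problems.append(f"hidden size {hidden[-1]} != weight hidden size {weight[-1]}")
    if len(hidden) == 3 and tuple(hidden[:2]) != tuple(token_ids):
        problems.append(f"token ids {token_ids} do not match (batch, seq) = {hidden[:2]}")
    return problems


if __name__ == "__main__":
    # Illustrative shapes only; they are not the real model dimensions.
    for issue in check_fused_linear_shapes((2, 5, 8), (0,), (2, 5)):
        print(issue)
```

In practice the tuples would come from `tuple(t.shape)` on the tensors logged just before the loss call, so the first reported problem narrows down which input to trace back.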
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info