Accelerate ND-Parallel: Multi-GPU training with FSDP/TP/DP using Accelerate and Axolotl
AI Impact Summary
The guide demonstrates how to configure multi-GPU training with ND-Parallel in Accelerate by combining data parallelism (DP), tensor parallelism (TP), and fully sharded data parallelism (FSDP), using FullyShardedDataParallelPlugin and ParallelismConfig. This enables training large models (e.g., Hermes-3-Llama-3.1-8B) across multiple GPUs and nodes by sharding weights and distributing data, which reduces per-device memory pressure. Teams can adopt the provided example configs and the Axolotl integration to improve throughput, but must manage the added complexity of coordinating multiple parallelism strategies and device meshes.
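To make the configuration concrete, below is a minimal sketch of how DP, FSDP, and TP can be combined through ParallelismConfig and FullyShardedDataParallelPlugin. It is an illustration under assumptions, not the guide's verbatim code: the field names (dp_replicate_size, dp_shard_size, tp_size, fsdp_version, transformer_cls_names_to_wrap), the model repository id, and the loading details reflect a recent Accelerate/Transformers release and may differ across versions.

```python
# Sketch: ND-Parallel in Accelerate -- DP replication x FSDP sharding x TP.
# Parameter names are assumptions based on a recent Accelerate release.
import torch
from accelerate import Accelerator
from accelerate.parallelism_config import ParallelismConfig
from accelerate.utils import FullyShardedDataParallelPlugin
from transformers import AutoModelForCausalLM

# 2 DP replicas x 2-way FSDP sharding x 2-way TP = 8 GPUs in total.
pc = ParallelismConfig(
    dp_replicate_size=2,  # plain data-parallel replicas (DDP-style)
    dp_shard_size=2,      # FSDP sharding degree within each replica
    tp_size=2,            # tensor parallelism inside each shard group
)

# FSDP2-style plugin: wrap each decoder layer so weights are sharded per layer.
fsdp_plugin = FullyShardedDataParallelPlugin(
    fsdp_version=2,
    auto_wrap_policy="transformer_based_wrap",
    transformer_cls_names_to_wrap=["LlamaDecoderLayer"],
)

accelerator = Accelerator(parallelism_config=pc, fsdp_plugin=fsdp_plugin)

# Example model from the guide's summary; depending on the transformers
# version, tensor parallelism may additionally require passing a TP plan or
# device mesh at load time (assumption, check your installed versions).
model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Hermes-3-Llama-3.1-8B",
    torch_dtype=torch.bfloat16,
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# prepare() builds the device mesh, shards the weights, and places everything
# on the right devices; the training loop afterwards is ordinary Accelerate.
model, optimizer = accelerator.prepare(model, optimizer)
```

Launched with `accelerate launch` (or torchrun), the total number of processes must equal the product of the parallel sizes, here 2 x 2 x 2 = 8.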
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info