PyTorch Transformers acceleration on Sapphire Rapids with IPEX and CCL
AI Impact Summary
Sapphire Rapids introduces Advanced Matrix Extensions (AMX) to accelerate the matrix operations at the core of deep-learning workloads. PyTorch training on CPU can pick up these instructions automatically, without changes to model code, by combining the Intel Extension for PyTorch (IPEX) with the oneAPI Collective Communications Library (CCL), as described for a Hugging Face transformers workflow. The setup uses bare-metal AWS r7iz nodes with a patched Linux kernel to enable AMX, and promises speedups and cost benefits over GPU-centric training, especially on CPU spot instances. To realize this, teams must verify they are on Sapphire Rapids hardware, install IPEX and CCL at versions compatible with their PyTorch build, and enable bf16 or int8 modes; a future post will cover inference performance. A minimal sketch of the setup follows.
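The sketch below is illustrative rather than the post's exact recipe: it checks the AMX CPU flags exposed in /proc/cpuinfo on a sufficiently new Linux kernel, then prepares a model for bf16 training via ipex.optimize. The model name and hyperparameters are placeholder assumptions.

```python
# Minimal sketch, assuming a Sapphire Rapids host with an AMX-enabled
# Linux kernel; model and hyperparameters are illustrative only.
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForSequenceClassification

# Sapphire Rapids exposes AMX via CPU feature flags (amx_tile,
# amx_bf16, amx_int8) once the kernel supports it.
cpuinfo = open("/proc/cpuinfo").read()
assert "amx_bf16" in cpuinfo, "AMX bf16 not available on this host/kernel"

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# ipex.optimize fuses operators and selects bf16 kernels that run on
# AMX; the model code itself stays unchanged.
model.train()
model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)

# Run training steps under CPU autocast so matmuls execute in bf16.
with torch.cpu.amp.autocast(dtype=torch.bfloat16):
    ...  # forward/backward as usual
```

For multi-node training, importing oneccl_bindings_for_pytorch registers a "ccl" backend with torch.distributed, so initializing the process group with backend="ccl" routes gradient communication through oneCCL.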
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info