Habana Gaudi2 delivers ~2x training/inference speed vs Nvidia A100 80GB across BERT, Stable Diffusion, and T5-3B

AI Impact Summary

The article benchmarks Habana Gaudi2 against Nvidia A100 80GB on training and inference workloads, including BERT pre-training, Stable Diffusion inference, and T5-3B fine-tuning. Gaudi2 is reported to be roughly 2x faster than the A100 on these tasks, helped by higher per-device memory and by software support through SynapseAI and 🤗 Optimum Habana. Workflows are the same on Gaudi and Gaudi2 through the Optimum Habana interface, so code written for first-generation Gaudi is reported to run on Gaudi2 without changes. Gaudi2 is available through the Intel Developer Cloud, which gives teams a cloud-first way to evaluate the hardware before factoring it into procurement and cost models.
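To make the "roughly 2x" figure concrete, here is a minimal sketch of how a speedup factor is usually derived from benchmark measurements. The function names and the numbers are hypothetical placeholders chosen for illustration; they are not results from the article.

```python
def speedup_from_throughput(baseline: float, candidate: float) -> float:
    """Speedup of the candidate over the baseline, given throughputs
    (for example samples per second; higher is better)."""
    if baseline <= 0 or candidate <= 0:
        raise ValueError("throughputs must be positive")
    return candidate / baseline


def speedup_from_latency(baseline: float, candidate: float) -> float:
    """Speedup of the candidate over the baseline, given latencies
    (for example seconds per image; lower is better)."""
    if baseline <= 0 or candidate <= 0:
        raise ValueError("latencies must be positive")
    return baseline / candidate


# Placeholder values only, NOT measurements from the article.
a100_samples_per_s = 100.0
gaudi2_samples_per_s = 200.0
print(f"training speedup: {speedup_from_throughput(a100_samples_per_s, gaudi2_samples_per_s):.2f}x")

a100_seconds_per_image = 4.0
gaudi2_seconds_per_image = 2.0
print(f"inference speedup: {speedup_from_latency(a100_seconds_per_image, gaudi2_seconds_per_image):.2f}x")
```

Training results are typically compared by throughput and inference results by latency, which is why the ratio is inverted in the second helper.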

Affected Systems

Gaudi2, NVIDIA A100 80GB

Date: not specified
Change type: capability
Severity: info
