Modular + AMD: Achieve 53% Faster Inference on MI300/MI325 GPUs
AI Impact Summary
Modular has announced a partnership with AMD that brings the Modular Platform to AMD GPUs, specifically the MI300 and MI325 series. Compared to existing open-source AI infrastructure, the partnership delivers up to 53% better throughput on prefill-heavy workloads with Llama 3.1 and Gemma 3, and 32% better throughput on decode-heavy workloads. The Modular Platform, which uses the MAX inference server and the Mojo programming language, is hardware-agnostic: developers can deploy across NVIDIA and AMD GPUs without code modifications.
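To make the percentages concrete, the following is a minimal sketch of how a relative throughput gain is usually computed, and what it implies for the time needed to process a fixed amount of work. The function names and the baseline figure of 1000 tokens/s are hypothetical and chosen only for illustration; they are not measurements from the announcement and not part of any Modular or AMD API.

```python
def throughput_gain_pct(baseline_tps: float, candidate_tps: float) -> float:
    """Relative throughput improvement of candidate over baseline, in percent."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (candidate_tps - baseline_tps) / baseline_tps * 100.0


def time_reduction_pct(gain_pct: float) -> float:
    """Percent less wall-clock time for the same work, given a throughput gain."""
    speedup = 1.0 + gain_pct / 100.0
    return (1.0 - 1.0 / speedup) * 100.0


# Hypothetical numbers: a baseline of 1000 tokens/s and a candidate of 1530 tokens/s.
prefill_gain = throughput_gain_pct(1000.0, 1530.0)
print(f"throughput gain: {prefill_gain:.0f}%")
print(f"time saved on a fixed workload: {time_reduction_pct(prefill_gain):.1f}%")
```

Note that a 53% throughput gain does not mean 53% less time: the same workload finishes in roughly 1/1.53 of the original time, which is about a 35% reduction.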
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info