Modular + AMD: Achieve 53% Faster Inference on MI300/MI325 GPUs

AI Impact Summary

Modular has announced a partnership with AMD that brings the Modular Platform to AMD GPUs, specifically the MI300 and MI325 series. Compared with existing open-source AI infrastructure, the announcement cites up to 53% better throughput on prefill-heavy workloads with Llama 3.1 and Gemma 3, and 32% better throughput on decode-heavy workloads. The Modular Platform, which uses the MAX inference server and the Mojo programming language, takes a hardware-agnostic approach: developers can deploy across NVIDIA and AMD GPUs without code modifications.
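The headline speaks of "faster inference" while the summary quotes throughput, and the two are related but not identical. As a rough illustration, the sketch below converts a relative throughput gain into the share of wall-clock time saved on a fixed amount of work. The function names and the baseline figure of 1000 tokens per second are hypothetical and are not taken from the announcement; only the 53% and 32% ratios come from the summary above.

```python
def throughput_gain(baseline_tps: float, new_tps: float) -> float:
    """Relative throughput improvement, e.g. 0.53 for '53% better'."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (new_tps - baseline_tps) / baseline_tps


def time_reduction(gain: float) -> float:
    """Fraction of wall-clock time saved on a fixed workload for a given gain."""
    return 1.0 - 1.0 / (1.0 + gain)


if __name__ == "__main__":
    # Hypothetical tokens-per-second figures chosen only to reproduce the 53% ratio.
    baseline, improved = 1000.0, 1530.0
    gain = throughput_gain(baseline, improved)
    print(f"prefill-heavy: {gain:.0%} more throughput, "
          f"{time_reduction(gain):.1%} less time for the same work")
    print(f"decode-heavy: 32% more throughput, "
          f"{time_reduction(0.32):.1%} less time for the same work")
```

Under these assumptions, a 53% throughput gain corresponds to roughly 35% less time for the same batch of work, and a 32% gain to roughly 24% less.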

Affected Systems

Modular Platform
MAX inference server

Date: Date not specified
Change type: capability
Severity: info
