Hugging Face Kernel Builder enables ROCm kernel development for AMD MI300X GPUs
AI Impact Summary
The guide shows how to configure and build ROCm kernels for AMD GPUs with Hugging Face's kernel-builder. It walks through a concrete example, the RadeonFlow GEMM kernel (FP8 e4m3fnuz with per-block scaling) for the MI300X, and shows how to organize sources, headers, and build manifests for PyTorch integration. Reproducible environments (flake.nix) and the packaging steps make it easier to share and deploy high-performance kernels across teams, reducing integration friction.
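To make "FP8 with per-block scaling" concrete, here is a minimal NumPy sketch of what such a GEMM computes. It is an illustration under stated assumptions, not the RadeonFlow kernel: the function names are hypothetical, both operands use square tiles with one scale each (the real kernel's scale layout may differ), and values stay in float32, so rounding to 8 bits is not simulated. The only format fact used is that e4m3fnuz has a largest finite value of 240.

```python
import numpy as np

FP8_E4M3FNUZ_MAX = 240.0  # largest finite value representable in e4m3fnuz


def quantize_per_block(x, block):
    """Scale each (block x block) tile of x into the FP8 range.

    Hypothetical helper: returns float32 values (no 8-bit rounding)
    plus one scale per tile.
    """
    rows, cols = x.shape
    assert rows % block == 0 and cols % block == 0
    tiles = x.reshape(rows // block, block, cols // block, block)
    amax = np.abs(tiles).max(axis=(1, 3), keepdims=True)
    # Avoid dividing by zero for all-zero tiles.
    scale = np.where(amax > 0, amax / FP8_E4M3FNUZ_MAX, 1.0).astype(np.float32)
    q = (tiles / scale).reshape(rows, cols).astype(np.float32)
    return q, scale.reshape(rows // block, cols // block)


def dequantize_per_block(q, scale, block):
    """Expand per-tile scales to full size and undo the scaling."""
    full = np.repeat(np.repeat(scale, block, axis=0), block, axis=1)
    return q * full


def scaled_gemm_reference(a_q, a_scale, b_q, b_scale, block):
    """Reference result: dequantize both operands, then multiply."""
    a = dequantize_per_block(a_q, a_scale, block)
    b = dequantize_per_block(b_q, b_scale, block)
    return a @ b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((4, 6)).astype(np.float32)
    b = rng.standard_normal((6, 4)).astype(np.float32)
    a_q, a_s = quantize_per_block(a, block=2)
    b_q, b_s = quantize_per_block(b, block=2)
    c = scaled_gemm_reference(a_q, a_s, b_q, b_s, block=2)
    print(np.allclose(c, a @ b, atol=1e-4))
```

A reference like this is mainly useful for checking a GPU kernel's output on small inputs; an optimized kernel would apply the scales while accumulating rather than materializing dequantized matrices.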
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info