Hugging Face: Custom AMD MI300X kernels accelerate Llama 3.1 405B inference in vLLM | SignalBreak