Hugging Face: BLOOM-176B Inference with DeepSpeed Inference and Accelerate — sub-1ms per-token on 8x80GB A100 | SignalBreak