Hugging Face: ZeRO memory optimizations via DeepSpeed and FairScale in transformers v4.2.0+ enable larger models on single or multi-GPU setups | SignalBreak