Hugging Face: Differential Transformer V2 (DIFF V2) improves inference speed and training stability for production-scale LLMs | SignalBreak