Hugging Face: Hugging Face Transformers adds KV Cache Quantization to extend long-context generation for Llama-2 models | SignalBreak