MLX: Laguna Model Support & FP8 Quantization Updates (v0.22.1-rc0)
AI Impact Summary
Version 0.22.1-rc0 introduces support for the Laguna model, along with quantization improvements focused on FP8 safetensors. The core change fixes a bug in which logprobs were dropped from generation requests when builtin parsers were used; logprobs are now preserved so that responses carry accurate per-token probabilities. This update affects the model-serving infrastructure and quantization pipelines, so model performance and logprob accuracy should be monitored carefully after upgrading.
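FP8 safetensors checkpoints commonly store weights in the E4M3FN encoding (1 sign bit, 4 exponent bits, 3 mantissa bits, exponent bias 7, no infinities). As a rough illustrative sketch of what loading such weights involves, and not the library's actual implementation (the function name here is hypothetical), a single E4M3FN byte decodes like this:

```python
def decode_fp8_e4m3(byte: int) -> float:
    """Decode one FP8 E4M3FN byte: 1 sign, 4 exponent, 3 mantissa bits, bias 7.

    E4M3FN has no infinities; the all-ones pattern (exp=0b1111, man=0b111)
    is NaN, giving a maximum finite value of 448.0.
    """
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        return float("nan")
    if exp == 0:
        # Subnormal: no implicit leading 1, fixed exponent of -6.
        return sign * (man / 8.0) * 2.0 ** -6
    # Normal: implicit leading 1, biased exponent.
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)
```

For example, `decode_fp8_e4m3(0x38)` yields `1.0` and `decode_fp8_e4m3(0x7E)` yields the format's maximum finite value, `448.0`. In practice a quantization pipeline would apply a per-tensor or per-block scale factor on top of these raw values.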
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info