OpenVINO NNCF-based optimization of Stable Diffusion on Intel CPUs with ToMe and Diffusers
AI Impact Summary
The article describes a CPU-centric optimization workflow for Stable Diffusion using OpenVINO, NNCF, Diffusers, and Token Merging (ToMe). It explains that the UNet denoiser is the inference bottleneck and that conventional post-training 8-bit quantization is insufficient on its own, necessitating Quantization-Aware Training with knowledge distillation and an exponential moving average (EMA) of weights to preserve accuracy. Reported results show substantial CPU inference speedups (up to 5.1x with ToMe plus 8-bit quantization) and a 4x reduction in model footprint versus the PyTorch baseline, making edge and CPU-only deployments feasible on Intel Xeon processors with Deep Learning Boost. This signals a viable migration path for CPU-only inference pipelines, but it requires a careful training and tuning workflow to maintain image quality across the target prompts and step counts.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info