aMUSEd: Efficient non-diffusion text-to-image model released with diffusers integration
AI Impact Summary
aMUSEd introduces an efficient non-diffusion text-to-image model based on Masked Image Modeling, offering faster inference and improved interpretability compared with latent diffusion. It integrates with diffusers through AmusedPipeline, and its ~800M parameter size and LoRA-enabled fine-tuning make it suitable for on-device workloads; inpainting is available through AmusedInpaintPipeline. The model is released as a research preview under an OpenRAIL license, which permits commercial adaptation. Adopters should still check license compliance and evaluate image quality against established baselines such as SDXL.
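The diffusers integration described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the checkpoint id `amused/amused-256`, the step count, the guidance scale, and the helper names `AmusedConfig` and `generate` are assumptions introduced for the example and do not come from this summary. The heavy imports sit inside the function so the module can be loaded without torch or diffusers installed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AmusedConfig:
    """Hypothetical settings holder; every default here is an assumption."""
    model_id: str = "amused/amused-256"  # assumed Hugging Face Hub checkpoint id
    num_inference_steps: int = 12        # assumed small step count for masked decoding
    guidance_scale: float = 10.0         # assumed classifier-free guidance strength
    seed: int = 0


def generate(prompt: str, cfg: Optional[AmusedConfig] = None):
    """Generate one image for `prompt` and return it as a PIL image."""
    # Imported lazily: these are large dependencies and loading the
    # pipeline downloads model weights on first use.
    import torch
    from diffusers import AmusedPipeline

    cfg = cfg or AmusedConfig()

    # Use half precision on GPU, full precision on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    pipe = AmusedPipeline.from_pretrained(cfg.model_id, torch_dtype=dtype)
    pipe = pipe.to(device)

    # A seeded generator makes the sampled image reproducible.
    generator = torch.Generator(device=device).manual_seed(cfg.seed)

    result = pipe(
        prompt,
        num_inference_steps=cfg.num_inference_steps,
        guidance_scale=cfg.guidance_scale,
        generator=generator,
    )
    return result.images[0]


if __name__ == "__main__":
    image = generate("a watercolor painting of a lighthouse at dusk")
    image.save("amused_sample.png")
```

When comparing against a baseline such as SDXL, keeping the prompt and seed fixed across runs makes side-by-side quality checks easier to interpret.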
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info