Bias in Text-to-Image Models on Hugging Face Hub — sources, tools, and red-teaming
AI Impact Summary
The newsletter analyzes bias across Text-to-Image (TTI) systems, tracing bias sources from training data (LAION-5B, MS-COCO, VQA v2.0) through pre-training filters, the latent space, and post-hoc safety filters, with concrete examples across Stable Diffusion, DALL·E 2, and CLIP. It also presents practical auditing tools (Average Diffusion Faces, Face Clustering, Colorfulness Profession Explorer) and notes red-teaming challenges and resource gaps, indicating that bias evaluation remains ad hoc and non-standardized. For engineering and product teams, this underscores the need to embed bias evaluation into model QA, maintain diverse prompt sets, and implement governance and guardrails to prevent biased outputs in production deployments.
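One concrete way to act on the "maintain diverse prompts" recommendation is to generate a demographically neutral audit prompt set, so that any skew in the resulting images can be attributed to the model rather than the prompt. The sketch below is a minimal illustration, not a tool from the newsletter; the profession list, templates, and `build_audit_prompts` helper are all assumptions:

```python
from itertools import product

# Hypothetical audit setup (not from the newsletter): professions and
# templates chosen for illustration only.
PROFESSIONS = ["doctor", "nurse", "CEO", "teacher", "engineer"]
TEMPLATES = [
    "a photo of a {p}",
    "a portrait of a {p} at work",
]

def build_audit_prompts(professions=PROFESSIONS, templates=TEMPLATES):
    """Return one prompt per (template, profession) pair.

    Prompts deliberately omit gender, age, and ethnicity terms so that
    any demographic skew in the generated images reflects the model's
    learned associations, not the wording of the prompt.
    """
    return [t.format(p=p) for t, p in product(templates, professions)]

prompts = build_audit_prompts()
print(len(prompts))  # 2 templates x 5 professions = 10
```

Each prompt would then be run through the TTI model under test (e.g., a Stable Diffusion pipeline) multiple times per prompt, and the outputs compared across professions, in the spirit of the newsletter's face-clustering approach.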
Affected Systems
- Date: Not specified
- Change type: Capability
- Severity: Info