Hugging Face Hub migrates repositories from LFS to Xet storage with CDC deduplication
AI Impact Summary
Hugging Face is rolling out Xet storage in production on the Hub, replacing Git LFS for large repository files. Xet uses content-defined chunking (CDC) with chunks of about 64 KB to deduplicate data, so a change to a file uploads only the modified chunks rather than the whole file. Early results are material: 4.5 TB migrated, ~6% of download traffic served through Xet, and ~35% lower GET latency after block-format optimizations. The system combines a Xet-aware client, a content-addressed store (CAS), the LFS Bridge, and S3. Ongoing work focuses on optimizing range-based downloads (hf_transfer) and balancing load across CAS pods. The result is faster large-file transfers and better scalability, though the broader rollout will require continued client updates and monitoring.
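To illustrate why content-defined chunking limits an upload to the modified chunks, here is a minimal Python sketch. It is not Xet's implementation: the gear-style rolling hash, the `GEAR` table, the min/max chunk bounds, and the function names `cdc_chunks` and `new_chunks` are all assumptions chosen for illustration. Only the 64 KB target comes from the summary above.

```python
import hashlib
import random

# Hypothetical table of random 64-bit values driving the rolling "gear" hash.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def cdc_chunks(data: bytes, target: int = 64 * 1024) -> list:
    """Split data at content-defined boundaries, roughly `target` bytes apart."""
    bits = target.bit_length() - 1            # assumes target is a power of two
    mask = ((1 << bits) - 1) << (64 - bits)   # test the top `bits` bits of the hash
    min_size, max_size = target // 4, target * 4  # illustrative bounds
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # The hash depends only on the last 64 bytes, so boundaries follow content.
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])           # trailing partial chunk
    return chunks


def new_chunks(old: bytes, new: bytes, target: int = 64 * 1024) -> list:
    """Return the chunks of `new` whose hashes are not already present in `old`."""
    known = {hashlib.sha256(c).digest() for c in cdc_chunks(old, target)}
    return [c for c in cdc_chunks(new, target)
            if hashlib.sha256(c).digest() not in known]
```

Because boundaries are derived from the bytes themselves rather than from fixed offsets, inserting a few bytes in the middle of a file shifts only the chunk or chunks around the edit; later boundaries realign and their hashes still match what the store already holds. With fixed-size blocks, the same insertion would change every block after the edit.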
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info