Hugging Face Hub adopts content-defined chunking (CDC) with Xet-backed storage to replace Git LFS
AI Impact Summary
Hugging Face is moving the Hub from Git LFS storage to content-defined chunking (CDC): files are split into chunks whose boundaries are determined by their content, and the chunks are kept in a content-addressed store. Because identical chunks are deduplicated and only modified chunks are uploaded, this reduces both storage footprint and transfer time, enabling faster iteration on large assets such as safetensors files (~1 GB) and GGUF files (multi-GB). The program includes a proof of concept (POC) for Xet-backed storage with compression. Rollout is planned for early 2025, with ongoing work to scale CDC across globally distributed repositories while addressing privacy boundaries.
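The chunk-and-deduplicate idea described above can be illustrated with a minimal sketch. This is not the Hub's or Xet's actual implementation: the gear-style rolling hash, the chunk-size parameters (MASK, MIN_CHUNK, MAX_CHUNK), the in-memory dict standing in for the content-addressed store, and the function names chunk and upload are all assumptions made for illustration.

```python
import hashlib
import random

# Hypothetical parameters for illustration only; not the Hub's real settings.
MASK = (1 << 12) - 1      # a boundary fires roughly every 4 KiB on average
MIN_CHUNK = 1024          # never cut a chunk shorter than this
MAX_CHUNK = 16384         # always cut once a chunk reaches this length

# Fixed pseudo-random table mapping each byte value to a 32-bit number.
_table_rng = random.Random(0)
GEAR = [_table_rng.getrandbits(32) for _ in range(256)]


def chunk(data: bytes) -> list[bytes]:
    """Split data at content-defined boundaries using a gear-style rolling hash."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # The hash depends only on recent bytes, so boundaries follow content,
        # not absolute offsets.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def upload(data: bytes, store: dict[str, bytes]) -> int:
    """Write unseen chunks to a content-addressed store; return bytes newly written."""
    new_bytes = 0
    for piece in chunk(data):
        key = hashlib.sha256(piece).hexdigest()  # the address is the content hash
        if key not in store:
            store[key] = piece
            new_bytes += len(piece)
    return new_bytes


# Demo: upload a file, then a revision with a small insertion in the middle.
demo_rng = random.Random(1)
v1 = bytes(demo_rng.getrandbits(8) for _ in range(100_000))
v2 = v1[:50_000] + b"a small edit" + v1[50_000:]

store: dict[str, bytes] = {}
print("first upload wrote", upload(v1, store), "bytes")
print("revised upload wrote", upload(v2, store), "of", len(v2), "bytes")
```

In this sketch, every chunk that ends before the edited region is byte-for-byte identical in both versions, so the second upload skips those chunks. Chunks after the edit usually realign as well, because boundaries are derived from the bytes themselves rather than from fixed offsets.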
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info