Hugging Face Dataset Hub adds modality, size, format, and library-based Dataset Search features
AI Impact Summary
Hugging Face is expanding the Dataset Hub with four new Dataset Search filters (modality, size, format, and library) and an integrated Dataset Viewer, improving how teams discover and evaluate open datasets. Modality is detected automatically and can be filtered across Text, Image, Audio, Tabular, Time-Series, 3D, Video, and Geospatial. Size search constrains results by minimum and maximum row count; for very large datasets (e.g., more than 10B rows), the total is estimated from a 5 GB sample. The new filters, together with compatibility indicators for Pandas, Dask, and the 🤗 Datasets library, can be combined with the existing Language, Tasks, and Licenses filters to speed up dataset selection for training, evaluation, and benchmarking.
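The sample-based size estimate and the min/max row filter described above can be illustrated with a minimal sketch. The summary does not say how the Hub extrapolates from its sample, so the linear scaling below is an assumption, and the function names `estimate_total_rows` and `matches_size_filter` are hypothetical, not part of any Hugging Face API.

```python
from typing import Optional


def estimate_total_rows(sample_rows: int, sample_bytes: int, total_bytes: int) -> int:
    """Extrapolate a total row count from a sampled prefix of a dataset.

    Assumption (not confirmed by the source): rows are spread roughly evenly
    across bytes, so the count scales linearly with size.
    """
    if sample_rows < 0 or sample_bytes <= 0 or total_bytes < 0:
        raise ValueError("row counts must be >= 0 and sample_bytes must be > 0")
    if total_bytes <= sample_bytes:
        # The sample already covers the whole dataset, so the count is exact.
        return sample_rows
    return round(sample_rows * total_bytes / sample_bytes)


def matches_size_filter(
    rows: int, min_rows: Optional[int] = None, max_rows: Optional[int] = None
) -> bool:
    """Return True when `rows` lies within the optional inclusive bounds."""
    if min_rows is not None and rows < min_rows:
        return False
    if max_rows is not None and rows > max_rows:
        return False
    return True


# Hypothetical example: 1M rows counted in a 5 GB sample of a 50 GB dataset.
estimated = estimate_total_rows(1_000_000, 5 * 10**9, 50 * 10**9)
in_range = matches_size_filter(estimated, min_rows=1_000_000, max_rows=100_000_000)
```

An estimate of this kind is only as good as the even-distribution assumption; datasets whose row sizes vary widely across files would need a more careful approach.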
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info