MTEB Benchmark: Massive Text Embedding Benchmark with 56 datasets and multilingual leaderboard
AI Impact Summary
MTEB (Massive Text Embedding Benchmark) provides a comprehensive, multilingual benchmark for evaluating text embedding models, aggregating 56 datasets across 8 task types behind a public leaderboard. The mteb library and GitHub repository let teams benchmark their own embeddings (e.g., SentenceTransformer models) and submit results alongside state-of-the-art models such as all-mpnet-base-v2, all-MiniLM-L6-v2, ST5-XXL, GTR-XXL, and SGPT-5.8B-msmarco, making apples-to-apples comparisons possible. Because downstream NLP tasks depend on embedding quality, this standardized evaluation is likely to shape model selection, including trade-offs between quality and embedding size/storage cost (e.g., 4096-dimensional embeddings).
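As a minimal sketch of what such a benchmark run looks like with the mteb library, assuming the mteb and sentence-transformers packages are installed (the specific task name and output folder here are illustrative choices, not prescribed by MTEB):

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Load any SentenceTransformer checkpoint; all-MiniLM-L6-v2 is one of the
# leaderboard models mentioned above.
model_name = "all-MiniLM-L6-v2"
model = SentenceTransformer(f"sentence-transformers/{model_name}")

# Select one or more MTEB tasks; Banking77Classification is a small
# classification task that serves as a quick smoke test (assumption: any
# subset of the 56 datasets can be selected this way).
evaluation = MTEB(tasks=["Banking77Classification"])

# Run the evaluation; scores are written as JSON under the output folder
# and can then be submitted for inclusion on the public leaderboard.
results = evaluation.run(model, output_folder=f"results/{model_name}")
```

Because every model is scored on the same task suite, the resulting JSON files are directly comparable across models, which is what enables the apples-to-apples comparisons described above.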
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info