VDR-2b-multi-v1 enables multilingual visual document retrieval without OCR
AI Impact Summary
The vdr-2b-multi-v1 release introduces a multilingual embedding model for visual document retrieval. It covers Italian, Spanish, English, French, and German, and enables OCR-free search across visual pages in those languages. The release claims 3x faster inference and lower VRAM usage than larger base models, which it attributes to a smaller image-patch configuration evaluated on the ViDoRe benchmarks. It also supports cross-lingual retrieval (e.g., searching German documents with Italian queries). Integration assets include the model path llamaindex/vdr-2b-multi-v1 for HuggingFaceEmbedding and SentenceTransformer, the vdr-multilingual-train dataset, and the vdr-multilingual-test evaluation set on Hugging Face. Downstream search pipelines that adopt the model should plan to consume single-vector page embeddings in place of OCR pipelines.
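To illustrate what consuming single-vector embeddings could look like downstream, the following is a minimal sketch of ranking pages by cosine similarity against a query vector. The vectors, page identifiers, and the helper names `cosine` and `rank_pages` are hypothetical placeholders and not part of the release; in a real pipeline the vectors would come from the model (for example, loaded via SentenceTransformer from llamaindex/vdr-2b-multi-v1), and the exact encoding calls should be taken from the model card.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_pages(query_vec, page_vecs):
    """Return (page_id, score) pairs sorted from most to least similar."""
    scored = [(page_id, cosine(query_vec, vec)) for page_id, vec in page_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy 3-dimensional placeholders; real embeddings are much higher-dimensional
# and would be produced by the model from page images, with no OCR step.
page_vecs = {
    "german_invoice_p1": [0.9, 0.1, 0.0],
    "german_manual_p4": [0.1, 0.8, 0.3],
    "french_report_p2": [0.0, 0.2, 0.9],
}

# Stands in for the embedding of an Italian text query (assumption for illustration).
query_vec = [0.8, 0.2, 0.1]

ranking = rank_pages(query_vec, page_vecs)
for page_id, score in ranking:
    print(f"{page_id}: {score:.3f}")
```

Because each page and each query map to one vector, the same ranking step works regardless of the language of the query or the page, which is what makes the cross-lingual case possible without a separate OCR or translation stage.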
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info