
VDR-2b-multi-v1 enables multilingual visual document retrieval without OCR

AI Impact Summary

The vdr-2b-multi-v1 release introduces a multilingual embedding model for visual document retrieval. It covers Italian, Spanish, English, French, and German, and enables OCR-free search across multilingual visual pages. The release claims 3x faster inference and lower VRAM usage compared to larger base models, attributed to a smaller image-patch configuration, with results reported on ViDoRe benchmarks. It also supports cross-lingual retrieval (e.g., searching German documents with Italian queries). Integration assets include the model path llamaindex/vdr-2b-multi-v1 for HuggingFaceEmbedding and SentenceTransformer, the vdr-multilingual-train dataset, and the vdr-multilingual-test evaluation set on Hugging Face. Teams adopting the model should plan to update downstream search pipelines to consume single-vector page embeddings in place of OCR pipelines.
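As a rough illustration of what consuming single-vector embeddings could look like downstream, the sketch below ranks pages by cosine similarity against a query vector. The function names (cosine, rank_pages, load_model) and the toy vectors are hypothetical and not part of the release. The load_model helper assumes the sentence-transformers package is installed and that the checkpoint loads through the standard SentenceTransformer constructor; whether trust_remote_code or other arguments are required should be checked against the model card.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_pages(query_vec: Sequence[float],
               page_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return page indices ordered from most to least similar to the query."""
    scores = [cosine(query_vec, p) for p in page_vecs]
    return sorted(range(len(page_vecs)), key=lambda i: scores[i], reverse=True)


def load_model():
    """Hypothetical helper: load the embedding model by its Hugging Face path.

    Assumption: the checkpoint works with the standard constructor, and
    trust_remote_code=True is acceptable in your environment.
    """
    from sentence_transformers import SentenceTransformer  # lazy import
    return SentenceTransformer("llamaindex/vdr-2b-multi-v1",
                               trust_remote_code=True)


# Toy stand-ins for one query embedding and three page embeddings.
# Real vectors would come from the model's encode() call.
query = [0.9, 0.1, 0.0]
pages = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.5, 0.5, 0.0]]
order = rank_pages(query, pages)
print(order)
```

Because each page is represented by one vector, the ranking step is independent of the query and document languages; any cross-lingual matching is done by the embedding model itself, not by this code.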

Affected Systems

vdr-2b-multi-v1, vdr-2b-v1

Date: not specified
Change type: capability
Severity: info
