Modular: Semantic Search with MAX Engine — BGE-base-en-v1.5 & Batch Inference
AI Impact Summary
This document details the implementation of a semantic search engine built on the MAX Engine, using the BGE-base-en-v1.5 embedding model and the Amazon Multilingual Counterfactual Dataset. The core technique is batched inference with the MAX Engine, which delivers performance improvements of up to 2.8x over PyTorch and ONNX Runtime, particularly at smaller batch sizes on CPUs. This represents a significant optimization opportunity when deploying this semantic search solution.
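To make the retrieval step concrete, here is a minimal sketch of the core of such a semantic search engine: scoring a query embedding against document embeddings in batches and returning the best matches. This is an illustration using NumPy, not the MAX Engine API; the function name `semantic_search` and the batch size are hypothetical, and it assumes embeddings (e.g. from BGE-base-en-v1.5) are already L2-normalized, so cosine similarity reduces to a dot product.

```python
import numpy as np

def semantic_search(query_emb, doc_embs, top_k=3, batch_size=1024):
    """Return (indices, scores) of the top_k most similar documents.

    Assumes both query_emb (shape [d]) and doc_embs (shape [n, d])
    are L2-normalized, as BGE-style sentence embeddings typically
    are, so the dot product equals cosine similarity. Documents are
    scored in batches, mirroring batched inference over the corpus.
    """
    scores = np.empty(len(doc_embs), dtype=np.float32)
    for start in range(0, len(doc_embs), batch_size):
        batch = doc_embs[start:start + batch_size]
        scores[start:start + len(batch)] = batch @ query_emb
    top = np.argsort(scores)[::-1][:top_k]
    return top, scores[top]
```

In a full pipeline, `doc_embs` would be produced by running the corpus through the embedding model in batches (where the batched-inference speedup applies), and `query_emb` by embedding the search query at request time.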
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info