Accelerating 130,000+ Hugging Face Models with ONNX Runtime

AI Impact Summary

ONNX Runtime accelerates inference for more than 130,000 Hugging Face models that have ONNX support. These include popular transformer models such as GPT-2 and BERT. The latency improvement can be substantial: whisper-tiny is reported to achieve a 74.30% gain over PyTorch. Tight integration with Hugging Face means support continues to extend to a growing number of model architectures, which makes ONNX Runtime a key optimization path for deploying these models at scale.
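The summary quotes a latency gain without saying how it was computed. The sketch below shows one common way to measure and express such a gain. It assumes the gain is the relative latency reduction, (baseline - candidate) / baseline * 100, which the source does not confirm. The helper names `median_latency` and `latency_gain_percent` are hypothetical, and the callables passed in would in practice wrap a PyTorch forward pass and an ONNX Runtime session run, neither of which is shown here.

```python
import statistics
import time
from typing import Callable


def median_latency(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> float:
    """Return the median wall-clock latency of fn, in seconds."""
    # Warm-up calls are discarded so one-time setup costs do not skew the result.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def latency_gain_percent(baseline_s: float, candidate_s: float) -> float:
    """Relative latency reduction of the candidate versus the baseline, in percent.

    Assumed definition (not taken from the source):
    (baseline - candidate) / baseline * 100.
    """
    if baseline_s <= 0:
        raise ValueError("baseline latency must be positive")
    return (baseline_s - candidate_s) / baseline_s * 100.0


if __name__ == "__main__":
    # Illustrative numbers only, not measurements of any real model.
    print(latency_gain_percent(1.0, 0.25))  # 75.0
```

Using the median instead of the mean keeps a single slow run from dominating the comparison. Under this definition, a 74.30% gain would mean the accelerated model takes roughly a quarter of the baseline time.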

Affected Systems

Hugging Face, ONNX Runtime

Date: not specified
Change type: capability
Severity: info
