Fetch Cuts ML Processing Latency by 50% Using SageMaker & Hugging Face
AI Impact Summary
Fetch migrated and optimized its ML pipeline on Amazon SageMaker and Hugging Face, using multi-GPU inference and automated model tuning to cut worst-case scan latency by 50%. The deployment used SageMaker features including Processing, Model Training, and Inference Recommender, plus Hugging Face tooling delivered via AWS Deep Learning Containers, enabling scalable, near real-time receipt extraction and data structuring. This modernization supports processing roughly 80 million receipts per week and lays the groundwork for expanding ML use cases (e.g., fraud prevention) while improving accuracy and partner confidence.
Affected Systems
- Date: Not specified
- Change type: Capability
- Severity: Info