Rocket Money scales a 4,000+-class BERT model in production with the Hugging Face Inference API: outages and scaling lessons
AI Impact Summary
Rocket Money migrated from regex-based normalizers to a BERT-based text classifier with more than 4,000 classes, hosted on the Hugging Face Inference API, to scale production enrichment of merchant strings. The team used GCP for data storage and Google Vertex Pipelines for training, and ran an extended evaluation before routing a growing share of transaction traffic to the hosted models. Production issues included outages when adding classes and caching problems during model switchover, which highlights the need for robust telemetry, cache coordination, and SLA-focused operations as traffic grows.
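The summary does not say what caused the caching problems or how they were fixed. As an illustration only, one common way to coordinate a prediction cache across a model switchover is to include the model version in every cache key, so labels produced by the previous model are never served once the new one is live. The sketch below assumes an in-memory dictionary as the cache; the names `cache_key`, `VersionedPredictionCache`, and `switch_model` are hypothetical and are not taken from Rocket Money's system or the Hugging Face API.

```python
import hashlib
from typing import Callable, Dict


def cache_key(model_version: str, merchant_string: str) -> str:
    """Build a cache key scoped to one model version (hypothetical helper)."""
    # Lowercase and collapse whitespace so trivially different strings share a key.
    normalized = " ".join(merchant_string.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"{model_version}:{digest}"


class VersionedPredictionCache:
    """In-memory cache whose entries are only valid for one model version."""

    def __init__(self, predict: Callable[[str], str], model_version: str) -> None:
        self._predict = predict              # stand-in for a call to the hosted model
        self._model_version = model_version
        self._store: Dict[str, str] = {}

    def switch_model(self, predict: Callable[[str], str], model_version: str) -> None:
        # After a switchover, keys are built from the new version, so entries
        # written by the old model can no longer be hit.
        self._predict = predict
        self._model_version = model_version

    def classify(self, merchant_string: str) -> str:
        key = cache_key(self._model_version, merchant_string)
        if key not in self._store:
            self._store[key] = self._predict(merchant_string)
        return self._store[key]
```

With this layout, old entries can be evicted lazily after the switchover instead of requiring a synchronized cache flush at the moment the new model goes live.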
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info