Tomofun migrates to AWS Inferentia2 for cost-effective pet behavior detection
AI Impact Summary
Tomofun is migrating its pet behavior detection inference workload from GPU-based EC2 instances to EC2 Inf2 instances powered by AWS Inferentia2 to significantly reduce costs. This shift leverages the cost-effectiveness of Inferentia2 for always-on, real-time inference, particularly for models like BLIP, which was originally hosted on GPUs. The architecture takes a flexible approach: the API can dynamically route inference requests to either GPU or Inferentia2 instances without extensive code changes, preserving high availability and scalability.
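The dynamic routing described above can be sketched as a small dispatcher that prefers the cheaper Inf2 pool and falls back to GPU endpoints when the preferred pool is unavailable. This is a minimal illustration, not Tomofun's actual implementation; the endpoint names and pool structure are hypothetical.

```python
import random

# Hypothetical endpoint pools keyed by hardware backend.
# Names are illustrative only, not Tomofun's real configuration.
ENDPOINTS = {
    "inf2": ["inf2-endpoint-1", "inf2-endpoint-2"],
    "gpu": ["gpu-endpoint-1"],
}

def route_request(prefer: str = "inf2") -> str:
    """Pick an inference endpoint, preferring the Inf2 pool for cost,
    and falling back to the GPU pool if the preferred pool is empty
    or unknown."""
    pool = ENDPOINTS.get(prefer) or ENDPOINTS["gpu"]
    return random.choice(pool)
```

Because the selection happens at the API layer, the model-serving code behind each endpoint stays unchanged, which matches the article's claim that the migration required no extensive code changes.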
Affected Systems
- Date: not specified
- Change type: capability
- Severity: medium