AWS OpenSearch Upgrades and Google Vertex AI Expansions: Week of 24 March 2025
AWS is making significant moves in the search space this week, introducing new instance types that promise substantial performance gains whilst updating IAM policies that could catch teams off guard. Meanwhile, Google continues its aggressive Vertex AI expansion with cost-saving options and new model availability.
The Big Moves
AWS OpenSearch Gets a Performance Boost with OR2 and OM2 Instances
Amazon has introduced OR2 and OM2 instance types for OpenSearch Service, delivering performance improvements that should have search teams taking notice. The OR2 instances show up to 26% better indexing throughput compared to their OR1 predecessors and a whopping 70% improvement over R7 instances. OM2 instances aren't slouching either, offering 15% better performance than OM1 and 66% over M7g.
These aren't just incremental updates. For organisations running vector search workloads, particularly those leveraging GPU acceleration, the performance gains could translate to meaningful cost savings and improved user experience. The timing is strategic too, as vector search adoption continues to accelerate across enterprise applications.
What makes this particularly interesting is how it positions AWS against competitors in the search infrastructure space. Whilst Elastic continues to push its cloud offerings, AWS is clearly betting that raw performance improvements will keep customers locked into their ecosystem. For teams currently experiencing bottlenecks in their search infrastructure, the migration path to these new instances should be straightforward, but you'll want to benchmark your specific workloads to quantify the benefits.
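When benchmarking, the metric that matters for these instances is indexing throughput on your own documents, not AWS's headline figures. A minimal harness like the sketch below can produce comparable docs/sec numbers on your current domain and on an OR2/OM2 test domain. The `index_batch` stub is an assumption, a placeholder for whatever bulk-index call you actually use (e.g. a wrapper around opensearch-py's bulk helper):

```python
import time

def benchmark_indexing(index_batch, batches, docs_per_batch):
    """Measure indexing throughput (docs/sec) for a given index function.

    `index_batch` is a placeholder for your real bulk-index call
    against the target OpenSearch domain.
    """
    start = time.perf_counter()
    for batch in batches:
        index_batch(batch)
    elapsed = time.perf_counter() - start
    total_docs = len(batches) * docs_per_batch
    return total_docs / elapsed

# Illustrative run against a no-op stub; swap in a real client call
# pointed at an OR2/OM2 test domain to compare against your current numbers.
batches = [[{"id": i} for i in range(100)] for _ in range(10)]
throughput = benchmark_indexing(lambda batch: None, batches, docs_per_batch=100)
print(f"{throughput:.0f} docs/sec")
```

Run the same batches, same mapping, and same refresh settings on both domains; otherwise the comparison measures your configuration, not the hardware.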
AWS IAM Policy Updates Could Impact OpenSearch Access Controls
Less visible but potentially more impactful is AWS's update to the AmazonOpenSearchServiceRolePolicy managed policy, effective 28 March. This change enables OpenSearch to update the access scope of AWS IAM Identity Center applications, particularly those managed solely by OpenSearch.
The devil is in the details here. Whilst this update is positioned as an enhancement to security and operational capabilities, it represents a significant change in how access controls function. Teams that have built custom access control workflows around OpenSearch may find their existing configurations need review and potential adjustment.
This is the type of change that can slip through the cracks during routine updates but cause headaches down the line. If you're running OpenSearch with IAM Identity Center integration, now would be a good time to audit your access controls and ensure they align with the new policy capabilities. The last thing you want is to discover access control gaps during a security audit or, worse, a security incident.
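One concrete audit step is to pull the current policy document and flag anything touching IAM Identity Center, whose API actions use the `sso:` and `sso-directory:` prefixes. The sketch below scans a policy document for those actions; the sample document is hypothetical and illustrative, not the actual AmazonOpenSearchServiceRolePolicy, which you would fetch yourself (e.g. with `aws iam get-policy-version`) and feed in the same way:

```python
import json

def identity_center_actions(policy_doc):
    """Return actions in an IAM policy document that touch IAM Identity Center.

    Identity Center API actions use the `sso:` and `sso-directory:`
    prefixes; any match is worth a closer look during an audit.
    """
    found = []
    statements = policy_doc.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    for stmt in statements:
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        for action in actions:
            if action.split(":")[0] in ("sso", "sso-directory"):
                found.append(action)
    return found

# Hypothetical policy fragment for illustration -- NOT the real
# AmazonOpenSearchServiceRolePolicy document.
sample = json.loads("""{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow", "Action": ["es:Describe*"], "Resource": "*"},
    {"Effect": "Allow", "Action": ["sso:UpdateApplication"], "Resource": "*"}
  ]
}""")
print(identity_center_actions(sample))  # -> ['sso:UpdateApplication']
```

Comparing the output before and after 28 March gives you a concrete diff of what the policy update actually changed for your environment.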
Google Vertex AI Embraces Spot VMs for Cost Optimisation
Google has enabled Spot VM support for Vertex AI training and prediction jobs, offering a compelling cost reduction opportunity for teams willing to accept some operational complexity. Spot VMs leverage excess Compute Engine capacity at reduced prices, but Google can reclaim this capacity at any time.
For training workloads that can tolerate interruptions, this could represent significant cost savings. However, the key consideration is workload design. Teams will need to implement proper checkpointing and restart mechanisms to handle the inevitable interruptions. For prediction jobs, the calculus is different. Unless you have robust failover mechanisms, using Spot VMs for production inference could introduce unacceptable availability risks.
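The checkpointing pattern itself is simple: persist progress periodically, write atomically so a preemption mid-write cannot corrupt the file, and resume from the last saved state on restart. A minimal sketch, with the file path and the trivial "training step" as illustrative stand-ins (in practice you would checkpoint model weights to durable storage such as GCS):

```python
import json
import os

CHECKPOINT_PATH = "checkpoint.json"  # illustrative; use durable storage in practice

def load_checkpoint():
    """Resume from the last saved step, or start fresh."""
    if os.path.exists(CHECKPOINT_PATH):
        with open(CHECKPOINT_PATH) as f:
            return json.load(f)
    return {"step": 0}

def save_checkpoint(state):
    """Write to a temp file then rename, so a preemption mid-write
    leaves the previous checkpoint intact."""
    tmp = CHECKPOINT_PATH + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, CHECKPOINT_PATH)

def train(total_steps=100, checkpoint_every=10):
    state = load_checkpoint()  # picks up where a preempted run left off
    for step in range(state["step"], total_steps):
        state["step"] = step + 1  # stand-in for a real training step
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(state)
    save_checkpoint(state)
    return state
```

If the Spot VM is reclaimed, the restarted job calls `load_checkpoint()` and loses at most `checkpoint_every` steps of work, which is the trade-off you tune against checkpoint I/O cost.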
The broader strategic play here is Google's continued push to make Vertex AI more cost-competitive against alternatives like AWS SageMaker and Azure Machine Learning. By offering multiple pricing tiers, Google is trying to capture both cost-conscious teams and those requiring guaranteed availability.
Worth Watching
Google Expands Vertex AI Model Garden with New Options
Vertex AI Model Garden has gained three new models: DeepSeek-V3-0324, TxGemma, and Sesame CSM. More importantly, existing DeepSeek models now support deployment on H200 GPUs with improved vLLM support. This expansion gives developers more flexibility in model selection whilst leveraging Google's infrastructure. For teams evaluating model options, the H200 GPU support could provide meaningful performance improvements for inference workloads.
Vertex AI Adds GPU Reservations and Workbench Backup
Google has introduced GPU reservation support and Workbench backup/restore functionality. GPU reservations address a common pain point in ML workflows: resource availability bottlenecks that can derail training schedules. The backup/restore capability for Workbench instances provides essential data protection that was previously missing. These aren't flashy features, but they address real operational challenges that ML teams face daily.
Amazon Bedrock Integrates with SageMaker Unified Studio
Amazon has rebranded Bedrock Studio and integrated it into SageMaker Unified Studio. Whilst functionality remains unchanged, users will need to adjust their workflows to access Bedrock through the new interface. This consolidation reflects Amazon's broader strategy of unifying its ML tools under the SageMaker umbrella, but it does mean workflow changes for existing users.
Bedrock Knowledge Bases Supports OpenSearch Managed Clusters
Amazon Bedrock now supports OpenSearch Managed Clusters as a vector store option for Knowledge Bases. This provides a managed alternative for teams that want to leverage OpenSearch's vector capabilities without the operational overhead. The integration should be straightforward for teams already using OpenSearch, offering a clear upgrade path.
Replicate Ships Cog 0.14 with Async Support
Replicate has released Cog 0.14 with async/await concurrent prediction support. This enables better performance for concurrent model inference workloads, allowing users to handle more requests simultaneously with potentially reduced latency. For teams running high-throughput inference workloads on Replicate, this could provide meaningful performance improvements.
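The win from async predictors is that I/O-bound model calls can overlap instead of queueing behind one another. The sketch below illustrates that pattern with plain asyncio rather than Cog's actual predictor API, using a sleep as a stand-in for model latency:

```python
import asyncio
import time

async def predict(prompt):
    """Stand-in for an async model call; the sleep mimics I/O-bound latency."""
    await asyncio.sleep(0.1)
    return f"output for {prompt!r}"

async def main():
    start = time.perf_counter()
    # Ten concurrent predictions overlap their waits instead of running
    # sequentially, so wall time stays near a single request's latency.
    results = await asyncio.gather(*(predict(f"req-{i}") for i in range(10)))
    elapsed = time.perf_counter() - start
    print(f"{len(results)} predictions in {elapsed:.2f}s")
    return elapsed

elapsed = asyncio.run(main())
```

Sequentially these ten calls would take roughly a second; gathered, they complete in about the latency of one, which is the same effect an async Cog predictor aims for under concurrent requests.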
Quick Hits
• Elastic releases Elasticsearch 9.0.0-rc1: The release candidate is available for testing ahead of the major version release
• Elasticsearch maintenance updates: Versions 8.16.6 and 8.17.4 provide bug fixes and performance improvements
• Amazon Bedrock Flows templates: New flow templates simplify setup for common use cases
The Week Ahead
Keep an eye on the AWS IAM policy changes taking effect on 28 March. If you're running OpenSearch with IAM Identity Center, use the weekend to review your access controls. The Elasticsearch 9.0 final release should be imminent given the release candidate availability, so start planning your migration testing if you haven't already.
Google's continued Vertex AI expansion suggests more announcements are likely as they push to capture market share in the enterprise AI space. With the performance improvements from AWS and Google's cost optimisation options, the competitive pressure on other providers is only going to intensify.