Azure OpenAI Launches Sora Video Generation: Week of 28 April 2025
Microsoft has fired the opening shot in the enterprise video generation race with Azure OpenAI's launch of Sora on 1 May. This isn't just another model update; it's a fundamental shift that puts enterprise-grade video creation directly into the hands of Azure customers. Meanwhile, Google's forcing migrations across Vertex AI whilst Perplexity has quietly broken existing integrations with a major API restructure.
The Big Moves
Azure OpenAI Brings Video Generation to Enterprise
Sora's arrival on Azure OpenAI represents Microsoft's most aggressive move yet into creative AI tooling. The service offers text-to-video, video-to-video, and image-to-video generation capabilities, positioning Azure as the first major cloud provider to offer enterprise-grade video generation at scale.
What makes this particularly significant is the timing. Whilst competitors like Runway and Pika Labs have dominated the creative space, Microsoft is betting that enterprises want video generation within their existing cloud infrastructure rather than through standalone creative tools. The integration with existing GPT-4o improvements, including enhanced instruction following and reduced latency, suggests Microsoft is building a comprehensive creative suite rather than bolting on video as an afterthought.
For organisations already invested in the Azure ecosystem, this eliminates the compliance headaches of working with external video generation services. Marketing teams, training departments, and content creators can now generate video content without data leaving their Azure environment. The question isn't whether this will see adoption; it's how quickly competitors will scramble to match it.
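For teams evaluating the service, a submission might look roughly like the sketch below. The endpoint path, payload field names, and resource URL are all assumptions for illustration; confirm them against the published Azure OpenAI video generation documentation before use.

```python
import json
import urllib.request

# Placeholder resource URL -- substitute your own Azure OpenAI resource.
AZURE_ENDPOINT = "https://my-resource.openai.azure.com"

def build_video_job(prompt: str, duration_seconds: int = 5,
                    size: str = "1080x1080") -> dict:
    """Assemble a text-to-video job body. Field names ("n_seconds",
    "size") are assumptions; verify against the published schema."""
    return {
        "model": "sora",
        "prompt": prompt,
        "n_seconds": duration_seconds,
        "size": size,
    }

def prepare_submit_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the HTTP request for job submission.
    The jobs-style route below is an assumption."""
    url = f"{AZURE_ENDPOINT}/openai/v1/video/generations/jobs"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Separating payload construction from submission keeps the assumed field names in one place, which makes it easy to adjust once the real schema is confirmed.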
Google Forces Vertex AI Endpoint Migration with June 2026 Deadline
Google's making Vertex AI's global endpoint generally available on 2 May, but the real story is the forced migration lurking beneath. Older image and video generation endpoints face deprecation with a hard sunset date of 30 June 2026. This gives organisations just over a year to migrate, but the timeline is tighter than it appears.
The deprecation affects any applications built on Vertex AI's earlier generative models for image and video processing. Google's positioning this as an upgrade to more reliable, production-ready endpoints, but the reality is that applications will simply stop working if teams don't migrate. The 14-month notice period might seem generous, but enterprise migration cycles often stretch longer, particularly when dealing with embedded AI capabilities across multiple systems.
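Teams planning the migration can start by inventorying where the old endpoints appear in their codebases. A rough sketch, assuming you grep source files for known deprecated identifiers; the patterns below are placeholders, not Google's actual deprecation list:

```python
import re
from pathlib import Path

# Placeholder patterns -- substitute the actual deprecated Vertex AI
# endpoint/model identifiers from Google's migration notes.
DEPRECATED_PATTERNS = [
    r"imagegeneration@00[25]",                   # example older model names
    r"us-central1-aiplatform\.googleapis\.com",  # regional (non-global) host
]

def find_deprecated_usage(root: str) -> list[tuple[str, int, str]]:
    """Walk a source tree and report (file, line number, line) for every
    line matching a deprecated pattern."""
    hits = []
    for path in Path(root).rglob("*.py"):
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), 1):
            for pattern in DEPRECATED_PATTERNS:
                if re.search(pattern, line):
                    hits.append((str(path), lineno, line.strip()))
    return hits
```

Running a scan like this early gives a concrete scope for the migration work, which matters given how easily a 14-month window evaporates across multiple systems.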
Google's also expanded the Model Garden with Llama 4, HiDream-I1, and Qwen3 models, clearly trying to retain developers by offering more choice. However, the simultaneous restriction of Gemini 1.5 access for new Colab Enterprise projects sends mixed signals about Google's commitment to developer accessibility versus enterprise focus.
Perplexity Breaks Existing Integrations with API Restructure
Perplexity's decision to deprecate the citations field in favour of search_results on 1 May represents the kind of breaking change that keeps integration teams awake at night. Any application relying on the citations field will simply stop working without immediate code updates.
The new search_results field provides more granular data including titles, URLs, and publication dates, which is genuinely useful. However, the lack of a transition period means developers face an immediate fix-or-break scenario. This is particularly problematic for applications that have built citation formatting or reference management around Perplexity's original structure.
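One way to absorb the change without rewriting downstream code is a small compatibility shim that rebuilds the old flat citations list from the new objects. The field names below follow the announced search_results shape (title, URL, date) and should be checked against actual responses:

```python
def citations_from_search_results(response: dict) -> list[str]:
    """Rebuild the old flat list of citation URLs from the new
    search_results objects, so existing citation-formatting code
    keeps working during the transition."""
    return [item["url"] for item in response.get("search_results", [])]
```

Funnelling all citation access through one shim also gives you a single place to adopt the richer titles and publication dates later.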
Perplexity has also introduced a reasoning_effort parameter for Sonar Deep Research, allowing users to control computational resources and token consumption. Whilst this provides better cost control, it adds complexity to implementations that previously relied on consistent response patterns.
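In practice the parameter is just an extra field on the request body. A minimal sketch, assuming the accepted values are "low", "medium", and "high" (check Perplexity's documentation for the actual enum):

```python
def build_deep_research_request(query: str, effort: str = "medium") -> dict:
    """Request body for Sonar Deep Research with an explicit
    reasoning_effort. The value set below is an assumption."""
    if effort not in {"low", "medium", "high"}:
        raise ValueError(f"unsupported reasoning_effort: {effort}")
    return {
        "model": "sonar-deep-research",
        "messages": [{"role": "user", "content": query}],
        "reasoning_effort": effort,
    }
```

Validating the value client-side turns a silent cost-control mistake into an immediate error, which is worth having once responses vary by effort level.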
Worth Watching
Azure AI Foundry's Model Router Automates Chat Model Selection
Azure's new Model Router automatically selects an optimal chat model for each prompt from a pool of underlying deployments. This represents a shift towards AI systems that optimise themselves rather than requiring manual model selection. For developers managing multiple chat applications, this could significantly reduce the overhead of model management whilst improving response quality.
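From the caller's perspective, the router behaves like a single deployment: you address it by name and the service, not your code, decides which underlying model answers. A minimal sketch, where the deployment name "model-router" and the body shape are assumptions:

```python
def build_router_request(messages: list[dict]) -> dict:
    """Chat completion body addressed to a Model Router deployment.
    The deployment name "model-router" is an assumption; the router,
    not the caller, chooses the underlying model per prompt."""
    return {"model": "model-router", "messages": messages}
```

The appeal is that existing chat code needs no per-model branching: the only change is the deployment name in the request.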
Amazon OpenSearch Service Adds GPU Acceleration
Amazon's OpenSearch 3.5 support brings GPU acceleration for vector indexes and auto-optimisation features. The addition of Graviton4-based instances and semantic highlighting suggests Amazon is positioning OpenSearch as a more capable alternative to traditional search solutions. The cross-account ingestion capability particularly addresses enterprise scenarios where data lives across multiple AWS accounts.
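The GPU-accelerated indexes still build on OpenSearch's standard k-NN mapping. A sketch of a vector index body, assuming HNSW with the faiss engine; any GPU-specific tuning knobs are omitted here because their names depend on the release notes:

```python
def knn_index_body(dimension: int) -> dict:
    """Standard OpenSearch k-NN index definition for a vector field.
    Engine and method choices (faiss, hnsw, l2) are common defaults,
    not the only options."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "embedding": {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": "faiss",
                        "space_type": "l2",
                    },
                }
            }
        },
    }
```

Keeping the index definition in code like this makes it straightforward to adopt new acceleration settings as a one-line change when they land.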
Perplexity Introduces Asynchronous API for Deep Research
The new asynchronous API for Sonar Deep Research includes a 7-day TTL on results, creating operational considerations around data freshness and cleanup processes. This capability enables more complex research queries but requires applications to handle asynchronous workflows and result expiration.
Quick Hits
Azure OpenAI launched Spotlighting for enhanced prompt shield protection against indirect attacks from embedded documents. Replicate added LLM-friendly code snippets with direct ChatGPT and Claude integration links. Hugging Face welcomed Llama Guard 4 and introduced Intel's AutoRound quantisation for LLMs and VLMs.
The Week Ahead
Watch for early adoption metrics on Azure OpenAI's Sora implementation as enterprises begin testing video generation capabilities. Google's Vertex AI global endpoint rollout should provide clarity on migration tooling and support resources. The Perplexity API changes will likely surface integration issues across the developer community, potentially forcing additional compatibility updates.
Key dates to monitor: Google's Vertex AI endpoint deprecation timeline becomes clearer as migration documentation emerges, whilst Azure's video generation capabilities face their first enterprise stress tests. Any performance issues or pricing announcements from Microsoft could significantly impact adoption trajectories.
The broader trend is clear: major providers are forcing architectural decisions through capability launches and deprecations simultaneously. Teams need to balance the excitement of new features like video generation against the operational reality of mandatory migrations and breaking API changes.