Azure OpenAI Unleashes GPT-4.5 and Video Generation: Week of 27 January 2025
Microsoft has unleashed the biggest Azure OpenAI capability drop we've seen in months, rolling out GPT-4.5 preview, Sora video generation, and the o3-mini model all within days. Meanwhile, other providers are forcing migrations that could break existing integrations.
The Big Moves
Microsoft Goes All-In on Multimodal AI
Azure OpenAI made the week's strongest enterprise AI play with a coordinated release spanning text, image, audio, and video capabilities. The headline addition is GPT-4.5 preview, OpenAI's next major model iteration, now available through Azure's enterprise-grade infrastructure.
More immediately impactful is the rollout of Sora video generation capabilities, including both image-to-video and video-to-video processing. This puts professional-grade AI video creation directly into enterprise hands through Azure's compliance and security framework. Early access customers are already testing these capabilities, with broader availability expected in the coming weeks.
The technical implementation is particularly noteworthy. Microsoft has integrated these new capabilities into existing Azure OpenAI endpoints, minimising integration overhead for existing customers. The o3-mini model adds reasoning capabilities optimised for cost-conscious deployments, whilst improvements to the gpt-4o-mini-transcribe model deliver enhanced multilingual accuracy that addresses a key enterprise pain point.
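Because the new models slot into the existing Azure OpenAI surface, switching to o3-mini is essentially a deployment-name change rather than an integration rewrite. The sketch below illustrates that using Azure's documented URL pattern; the resource name, deployment name, and API version are illustrative assumptions, not values from this announcement.

```python
# Minimal sketch: Azure OpenAI routes chat requests by deployment name,
# so moving from an existing model to o3-mini changes the URL, not the body.
import json

API_VERSION = "2024-12-01-preview"  # assumed; pin to your resource's supported version

def chat_completions_url(resource: str, deployment: str) -> str:
    """Build the standard Azure OpenAI chat completions URL for a deployment."""
    return (
        f"https://{resource}.openai.azure.com/openai/deployments/"
        f"{deployment}/chat/completions?api-version={API_VERSION}"
    )

def chat_payload(prompt: str) -> str:
    """The request body stays the same across model deployments."""
    return json.dumps({"messages": [{"role": "user", "content": prompt}]})

# Hypothetical resource and deployment names for illustration.
url = chat_completions_url("contoso-openai", "o3-mini")
body = chat_payload("Summarise this week's releases.")
```

The same payload posted to a different deployment URL exercises a different model, which is why Microsoft can ship new models without forcing infrastructure changes on existing customers.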
For organisations already invested in Azure OpenAI, this represents a significant capability expansion without infrastructure changes. However, the rapid pace of releases suggests Microsoft is pushing hard to maintain competitive advantage, which could mean more frequent updates and potential compatibility considerations ahead.
Cohere Forces Classification Migration by 31 January
Cohere is deprecating its default Classify endpoint using Embed models, effective 31 January 2025. This isn't a gentle sunset: it's a hard cutoff that will break existing integrations unless organisations migrate to fine-tuned Embed models.
The business logic is sound: Cohere wants to push users towards more performant, customised classification approaches rather than supporting generic endpoints. Fine-tuned models deliver better accuracy and align with Cohere's strategy of retiring older, less optimised offerings.
However, the timeline is aggressive. Organisations using the default Classify endpoint have just days to implement alternative approaches. The migration path involves either fine-tuning an Embed model for classification tasks or switching to Cohere's newer classification offerings. This requires both technical implementation and model retraining, which many teams won't complete by the deadline.
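For teams scoping the migration, the request shape itself is unlikely to change much; the key difference is that the model field must reference a fine-tuned model rather than a default one. The sketch below illustrates that delta with stdlib code only; both model identifiers are illustrative assumptions, so verify the exact fields against Cohere's current Classify reference.

```python
# Sketch of the migration delta: same Classify request shape, different
# model reference. Model IDs here are hypothetical placeholders.
import json

def classify_request(inputs: list[str], model: str) -> str:
    """Assemble a Classify-style request body for the given model."""
    return json.dumps({"model": model, "inputs": inputs})

# Before: relying on a default Embed-backed model (deprecated 31 January).
legacy = classify_request(["refund please"], "embed-english-v3.0")

# After: a fine-tuned Embed model ID (hypothetical) trained on your labels.
migrated = classify_request(["refund please"], "my-support-classifier-ft")
```

The code change is small; the deadline pressure comes from the fine-tuning step itself, which requires labelled training data and evaluation before the new model ID can be dropped in.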
The timing suggests Cohere is prioritising innovation velocity over backward compatibility, a pattern we're seeing across multiple providers as the AI market intensifies. Organisations should expect similar forced migrations from other providers in 2025.
Google Clears House in Vertex AI Model Garden
Google is deprecating Mistral Large and Codestral models in Vertex AI Model Garden, effective 30 January 2025. This continues Google's pattern of aggressively curating its model offerings, removing third-party models that don't meet usage thresholds or strategic priorities.
The deprecation impacts developers who've built applications around these specific Mistral offerings through Vertex AI. Unlike direct Mistral API usage, Model Garden provided integrated billing, monitoring, and compliance features that many enterprise teams rely on. The removal forces a choice: migrate to alternative models within Vertex AI or move to direct Mistral API integration.
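The two migration targets have quite different request surfaces, which is worth seeing side by side. The sketch below contrasts Vertex AI's publisher-model URL pattern with Mistral's direct API; the project, region, and model values are illustrative assumptions, and you should confirm the exact Vertex method name for your model before relying on it.

```python
# Sketch of the two migration targets. Vertex AI addresses partner models
# through a publisher path; direct Mistral uses a single chat endpoint.
# Project, region, and model values below are hypothetical.
def vertex_mistral_url(project: str, region: str, model: str) -> str:
    """Vertex AI publisher-model URL pattern for Mistral models."""
    return (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{region}/publishers/mistralai/models/{model}:rawPredict"
    )

# Direct Mistral API: one endpoint, model selected in the request body.
DIRECT_MISTRAL_URL = "https://api.mistral.ai/v1/chat/completions"

url = vertex_mistral_url("my-project", "europe-west4", "mistral-large-2411")
```

The practical trade-off: the Vertex path keeps Google's billing, IAM, and monitoring in the loop, while the direct endpoint means new credentials, new invoicing, and a separate compliance review.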
Google's simultaneous release of Imagen 3 with prompt enhancement capabilities suggests the company is focusing resources on its own model development rather than maintaining broad third-party model access. The new Imagen 3 model includes LLM-based prompt rewriting, potentially reducing the manual prompt engineering that's been a barrier to image generation adoption.
This pattern of model garden curation will likely continue, with Google prioritising its own models and high-performing third-party offerings whilst removing niche or underperforming options.
Worth Watching
Azure OpenAI Expands Audio Capabilities
Microsoft released two new GPT-4o mini audio models optimised for different use cases: one for audio completions and another for real-time interactions. This represents a more nuanced approach to audio AI, acknowledging that batch processing and real-time applications have different performance requirements. The stored completions feature for chat history also enables better model evaluation and fine-tuning workflows.
Replicate Introduces Official Models Programme
Replicate's new 'Official Models' category promises predictable pricing and simplified API calls for high-quality models maintained in partnership with their creators. This addresses a key enterprise concern around model reliability and pricing transparency. The programme starts with models like flux-1.1-pro and could signal Replicate's push towards more enterprise-focused offerings.
Mistral AI Releases Structured Outputs
Mistral AI's custom structured outputs feature across all models reduces integration complexity by eliminating custom parsing logic. This capability enhancement makes Mistral models more attractive for production applications where consistent data formatting is critical. Combined with the release of Mistral Small 3, it shows continued investment in developer experience.
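The appeal of structured outputs is that the request declares a schema up front instead of the client parsing free text after the fact. The sketch below shows the idea; the response_format shape mirrors the json_schema convention used across providers and is an assumption here, so check Mistral's API reference for the current field names.

```python
# Sketch: declaring a JSON Schema in the request so the model's output is
# constrained to it, removing custom parsing logic on the client side.
# The response_format field names are an assumption, not confirmed API.
import json

INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["vendor", "total"],
}

def structured_request(prompt: str, schema: dict) -> str:
    """Build a chat request that constrains output to the given schema."""
    return json.dumps({
        "model": "mistral-small-latest",
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "invoice", "schema": schema},
        },
    })

body = structured_request("Extract the invoice fields.", INVOICE_SCHEMA)
```

With the schema enforced server-side, the client can deserialise the response directly into typed objects rather than maintaining regexes or retry-on-parse-failure loops.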
Quick Hits
- Qdrant improved GPU compatibility with half-float fallback and fixed blob store stability issues
- LM Studio added support for DeepSeek R1 open source reasoning model
- Replicate enhanced platform experience with predictions on deployment pages and better logged-out user flows
- Google launched new monitoring dashboard for Vertex AI foundation models with GA dedicated endpoints
The Week Ahead
The 31 January deadline for Cohere's Classify endpoint deprecation will likely cause disruption for unprepared organisations. Watch for emergency migrations and potential service outages as teams scramble to implement alternatives.
Microsoft's aggressive Azure OpenAI expansion suggests more capability announcements are coming, particularly around enterprise features and compliance certifications. The rapid release pace indicates Microsoft is responding to competitive pressure, likely from Google's Vertex AI improvements and Anthropic's enterprise push.
Google's model garden curation will continue, with more third-party model removals expected as they focus resources on strategic partnerships and their own model development. Organisations using Vertex AI Model Garden should audit their dependencies and prepare migration plans for non-strategic models.
The pattern emerging across providers is clear: faster innovation cycles, more aggressive deprecations, and stronger focus on enterprise-grade capabilities. 2025 is shaping up to be the year of forced modernisation in AI infrastructure.