# Together AI's 23-Minute Outage Exposes Infrastructure Fragility: Week of 1 December
Together AI's 23-minute website outage on 5 December wasn't just another blip in the AI provider landscape. It was a stark reminder that even the most promising platforms can stumble when infrastructure meets reality. Whilst the outage itself was brief, the cascade of duplicate incident reports suggests monitoring systems weren't quite up to scratch either.
## What happened during Together AI's December outage?
Together AI's website went dark for 23 minutes on 5 December, creating a perfect storm of user frustration and operational chaos. The incident generated an unprecedented 35 separate critical alerts in our monitoring system, each flagging the same core issue from slightly different angles. This alert storm reveals as much about Together AI's incident response maturity as the outage itself.
The timing couldn't have been worse for a platform positioning itself as enterprise-ready. Website outages don't just block marketing pages; they often signal deeper infrastructure problems that could affect API reliability. Users attempting to access documentation, manage API keys, or troubleshoot existing integrations found themselves locked out entirely. For organisations evaluating Together AI against competitors like OpenAI or Anthropic, this kind of reliability gap raises serious questions about production readiness.
What's particularly concerning is the lack of proactive communication during the incident. Modern AI providers typically push status updates within minutes of detecting issues, but the multiple duplicate signals suggest Together AI's monitoring and alerting systems may need fundamental improvements. This isn't just about website uptime; it's about operational transparency that enterprise customers demand.
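An alert storm like the one described above is exactly what fingerprint-based deduplication is meant to suppress. The following is a minimal sketch of the general technique, not Together AI's actual pipeline; the fingerprint fields and the five-minute window are illustrative assumptions:

```python
import time


class AlertDeduplicator:
    """Collapse repeated alerts that share a fingerprint within a time window."""

    def __init__(self, window_seconds=300.0):
        self.window_seconds = window_seconds  # suppress repeats for 5 minutes (assumed)
        self._last_seen = {}

    def fingerprint(self, alert):
        # Group on service + check type, ignoring per-probe detail (an assumption
        # about which fields identify "the same incident").
        return (alert.get("service"), alert.get("check"))

    def should_page(self, alert, now=None):
        """Return True only for the first alert of a group within the window."""
        now = time.time() if now is None else now
        key = self.fingerprint(alert)
        last = self._last_seen.get(key)
        self._last_seen[key] = now
        return last is None or (now - last) > self.window_seconds
```

Fed 35 near-identical website-down alerts over a few minutes, a window like this pages once and swallows the repeats, which is the difference between one actionable incident and the duplicate flood described above.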
## Why Qdrant's data corruption fixes matter more than you think
Qdrant's v1.16.2 release on 4 December addressed some genuinely terrifying bugs that could corrupt user data without warning. We're talking about WAL corruption, consensus snapshot crashes, and storage corruption issues that particularly affected Windows deployments. For vector database users, data corruption isn't just an inconvenience; it's a potential business catastrophe.
The severity of these fixes highlights how quickly vector databases have moved from experimental tools to mission-critical infrastructure. Organisations using Qdrant for production search, recommendation systems, or RAG implementations need to prioritise this update immediately. The consensus crashes alone could cause distributed deployments to fail silently, corrupting data whilst appearing to function normally.
What makes this release particularly significant is its scope across operating systems and storage configurations. The Windows-specific storage corruption issues suggest Qdrant's testing coverage may have gaps, whilst the consensus problems indicate challenges with distributed system reliability. For teams running Qdrant in production, this update represents a critical security and stability milestone that shouldn't be delayed.
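For teams auditing their deployments, the first step is simply checking whether the running version predates the fixes. A small sketch of that comparison (the running version can typically be read from the `version` field returned by Qdrant's root HTTP endpoint, though confirm against your own deployment):

```python
def parse_version(v):
    """Parse a 'major.minor.patch' string into a comparable tuple of ints."""
    return tuple(int(part) for part in v.strip().lstrip("v").split("."))


def needs_upgrade(running, fixed_in="1.16.2"):
    """True if the running Qdrant version predates the corruption fixes."""
    return parse_version(running) < parse_version(fixed_in)
```

Anything that returns `True` here is running with the WAL and storage corruption bugs unpatched and should be scheduled for upgrade immediately.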
## Perplexity forces sonar-reasoning migration with 15 December deadline
Perplexity's deprecation of the sonar-reasoning model, effective 1 December with a hard cutoff on 15 December, represents one of the more aggressive migration timelines we've seen this year. Users have exactly two weeks to migrate to sonar-reasoning-pro or face complete service disruption. This isn't a gentle sunset; it's a forced march.
The migration path appears straightforward on paper, but the tight timeline creates operational pressure for teams with complex integrations. Applications using sonar-reasoning for production workflows need immediate attention, particularly those with automated systems that might not gracefully handle model unavailability. The new sonar-reasoning-pro model promises enhanced capabilities, but any performance or output differences could require additional testing and validation.
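One pragmatic way to survive a hard cutoff like this is a migration shim that rewrites deprecated model names at a single choke point before requests leave the application, so scattered call sites can be migrated incrementally. A hypothetical sketch (the function and mapping are illustrative, not part of Perplexity's SDK):

```python
# Hypothetical migration shim: rewrite deprecated model names centrally so
# individual call sites don't break at the provider's cutoff date.
DEPRECATED_MODELS = {
    "sonar-reasoning": "sonar-reasoning-pro",  # hard cutoff 15 December
}


def migrate_payload(payload):
    """Return a copy of a chat-completion payload with deprecated models swapped."""
    model = payload.get("model")
    if model in DEPRECATED_MODELS:
        payload = {**payload, "model": DEPRECATED_MODELS[model]}
    return payload
```

A shim only buys time, though; since sonar-reasoning-pro may differ in output and latency, teams still need to validate behaviour before the deadline rather than silently swapping models in production.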
Perplexity's approach here signals growing confidence in their premium model tier, but it also demonstrates how quickly AI providers are willing to discontinue older models. For organisations building on Perplexity's platform, this deprecation cycle offers a preview of future model lifecycle management. The accompanying Search API enhancements, including new token controls and filtering options, suggest Perplexity is focusing on more sophisticated enterprise use cases.
## Worth Watching
Hugging Face's new Swift client for the Hugging Face Hub addresses a genuine pain point for iOS developers working with open-source models. The swift-huggingface client promises faster downloads, resume support, and OAuth 2.0 authentication. For Swift developers who've struggled with Python-based model access, this could significantly streamline development workflows. The Python-compatible cache is particularly clever, allowing seamless integration with existing toolchains.
## Quick Hits
- Qdrant's telemetry performance improvements should reduce monitoring overhead for large deployments
- Together AI's incident generated 35 duplicate critical alerts, suggesting its alerting pipeline needs deduplication
- Perplexity's Search API token controls provide more granular usage management for enterprise customers
- Swift developers can now access Hugging Face models without Python dependencies
## What's coming up in the week ahead?
15 December marks Perplexity's hard deadline for sonar-reasoning migration. Applications still using the deprecated model will stop functioning entirely after this date. Teams should complete migration testing this week to avoid last-minute scrambles.
Mid-December typically sees AI providers pushing final updates before holiday freezes. Expect maintenance windows and potential service disruptions as platforms prepare for reduced staffing periods. Monitor status pages closely for scheduled maintenance announcements.
Year-end planning should include reviewing all AI provider deprecation notices and model lifecycle policies. The aggressive timeline on Perplexity's sonar-reasoning deprecation suggests 2025 may bring more rapid model retirement cycles across the industry.
The week's events underscore a maturing AI provider landscape where infrastructure reliability and operational excellence matter as much as model capabilities. Together AI's outage, Qdrant's critical fixes, and Perplexity's forced migration all highlight the operational complexities of building on AI platforms. As the industry moves beyond the experimental phase, providers who can't deliver enterprise-grade reliability will find themselves increasingly marginalised.