Extended Prompt Caching defaults to 24h for gpt-5.5 and future models; ZDR considerations
AI Impact Summary
OpenAI has updated the Prompt Caching 201 retention guidance to enable Extended Prompt Caching with retention of up to 24 hours by offloading KV tensors to GPU-local storage. For gpt-5.5, gpt-5.5-pro, and all future models, the default retention is now 24h and in_memory is not supported; this improves cache capacity and can increase hit rates. Extended caching also affects Zero Data Retention eligibility: cached prompts are not logged or persisted, but because KV tensors are stored on GPU-local storage, ZDR applicability may be limited unless explicitly requested. Technical teams should assess the compliance, data residency, and cost implications, and align with the Realtime API, where similar caching applies.
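As a rough illustration of the new default, a request builder would only need to set a retention parameter when targeting models that predate the 24h default. This is a minimal sketch: the `prompt_cache_retention` parameter name and the `"24h"` value are assumptions based on this note, not confirmed API surface, so verify them against the current API reference before relying on them.

```python
def build_request(model: str, messages: list[dict]) -> dict:
    """Build a chat request payload.

    Assumption: gpt-5.5 and later models default to 24h extended prompt
    caching with no in_memory option, so no retention field is needed;
    earlier models may require requesting the retention window explicitly
    via a hypothetical `prompt_cache_retention` parameter.
    """
    payload = {"model": model, "messages": messages}
    if not model.startswith("gpt-5.5"):
        # Older models: opt in to the extended retention window explicitly.
        payload["prompt_cache_retention"] = "24h"
    return payload

new_model_req = build_request("gpt-5.5", [{"role": "user", "content": "Hi"}])
old_model_req = build_request("gpt-4o", [{"role": "user", "content": "Hi"}])
```

Teams with ZDR requirements would invert this logic, treating the extended window as the behavior to audit rather than the behavior to request.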
Affected Systems
- Date: not specified
- Change type: capability
- Severity: medium