Extended Prompt Caching defaults to 24h for gpt-5.5 and future models; ZDR considerations
AI Impact Summary
OpenAI has updated the Prompt Caching 201 retention guidance to enable Extended Prompt Caching with retention of up to 24 hours by offloading KV tensors to GPU-local storage. For gpt-5.5, gpt-5.5-pro, and all future models, the default retention is now 24h and in_memory is not supported; this improves cache capacity and can increase hit rates. Extended caching also affects Zero Data Retention eligibility: cached prompts are not logged or persisted, but because KV tensors are stored on GPU-local storage, ZDR applicability may be limited unless explicitly requested. Technical teams should assess the compliance, data residency, and cost implications, and align with the Realtime API, where similar caching applies.
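As a rough illustration of the new default, a request builder would only need to set a retention parameter when targeting models that predate the 24h default. This is a minimal sketch: the `prompt_cache_retention` parameter name and the `"24h"` value are assumptions based on this note, not confirmed API surface, so verify them against the current API reference before relying on them.

```python
def build_request(model: str, messages: list[dict]) -> dict:
    """Build a chat request payload.

    Assumption: gpt-5.5 and later models default to 24h extended prompt
    caching with no in_memory option, so no retention field is needed;
    earlier models may require requesting the retention window explicitly
    via a hypothetical `prompt_cache_retention` parameter.
    """
    payload = {"model": model, "messages": messages}
    if not model.startswith("gpt-5.5"):
        # Older models: opt in to the extended retention window explicitly.
        payload["prompt_cache_retention"] = "24h"
    return payload

new_model_req = build_request("gpt-5.5", [{"role": "user", "content": "Hi"}])
old_model_req = build_request("gpt-4o", [{"role": "user", "content": "Hi"}])
```

Teams with ZDR requirements would invert this logic, treating the extended window as the behavior to audit rather than the behavior to request.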
Affected Systems
- Date: not specified
- Change type: capability
- Severity: medium