ChatGPT adds vision, speech input, and audio output capabilities
AI Impact Summary
ChatGPT now supports multimodal input and output: vision (image analysis), audio input (speech-to-text), and audio output (text-to-speech). This expands the API surface beyond text-only interactions and enables voice-first applications and visual document processing. Teams with ChatGPT integrations should evaluate two things: whether handling audio and image data introduces new security boundaries, and whether these modalities unlock product capabilities that competitors now offer.
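One way to make the security-boundary question concrete is a per-modality gate that runs before any payload leaves your systems. The sketch below is illustrative only: the `Modality` enum, `ModalityPolicy`, the `POLICY` table, the size limits, and `check_payload` are all hypothetical names and values, not part of any ChatGPT or OpenAI API.

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    TEXT = "text"
    IMAGE = "image"
    AUDIO = "audio"


@dataclass(frozen=True)
class ModalityPolicy:
    allowed: bool   # may this modality be sent to an external service?
    max_bytes: int  # largest payload accepted for this modality


# Hypothetical policy table; the limits are placeholders, not vendor limits.
POLICY = {
    Modality.TEXT: ModalityPolicy(allowed=True, max_bytes=100_000),
    Modality.IMAGE: ModalityPolicy(allowed=True, max_bytes=5_000_000),
    Modality.AUDIO: ModalityPolicy(allowed=False, max_bytes=0),
}


def check_payload(modality: Modality, payload: bytes) -> tuple[bool, str]:
    """Return (ok, reason) for a payload under the local policy table."""
    policy = POLICY[modality]
    if not policy.allowed:
        return False, f"{modality.value} input is not approved for external processing"
    if len(payload) > policy.max_bytes:
        return False, f"{modality.value} payload exceeds {policy.max_bytes} bytes"
    return True, "ok"
```

A team might start with audio disabled, as assumed here, and enable it only after reviewing how recordings are stored and retained.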
Affected Systems
- Date: not specified
- Change type: capability
- Severity: medium