ServiceNow AprielGuard: New LLM Safety and Adversarial Robustness Model
Action Required
Organizations deploying agentic LLM systems should evaluate AprielGuard as a guardrail layer: it targets a broader range of security threats than typical content filters, reducing the risk of compromised systems and harmful outputs.
AI Impact Summary
ServiceNow is introducing AprielGuard, a new 8B-parameter model designed to protect LLM systems from a wide range of safety and adversarial risks, including jailbreaks, prompt injections, and memory manipulation. This capability is crucial as modern LLMs increasingly operate as agentic systems with complex reasoning and tool usage, making them vulnerable to sophisticated attacks. The model's dual-mode operation (reasoning and fast classification) and its training on a diverse synthetic dataset, including long-context use cases, demonstrate a proactive approach to securing agentic workflows.
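The dual-mode design described above can be sketched as a two-tier gate: a cheap fast pass screens every input, and only uncertain cases escalate to the slower reasoning pass. The sketch below is purely illustrative, using toy heuristics in place of the actual model; the names (`Verdict`, `guard_check`, the patterns, and the confidence threshold) are assumptions, not AprielGuard's real API.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns standing in for the fast classifier's signal.
INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"reveal .*system prompt",
    r"disregard .*guidelines",
]

@dataclass
class Verdict:
    unsafe: bool
    confidence: float
    mode: str  # "fast" or "reasoning"

def fast_classify(text: str) -> Verdict:
    """Cheap pattern-based screen, standing in for the fast classification mode."""
    hits = sum(bool(re.search(p, text, re.I)) for p in INJECTION_PATTERNS)
    if hits:
        return Verdict(True, min(1.0, 0.5 + 0.25 * hits), "fast")
    return Verdict(False, 0.6, "fast")

def reasoning_classify(text: str) -> Verdict:
    """Placeholder for the slower reasoning mode: a deeper (here, toy) analysis."""
    suspicious = any(tok in text.lower()
                     for tok in ("override", "jailbreak", "exfiltrate"))
    return Verdict(suspicious, 0.9, "reasoning")

def guard_check(text: str, threshold: float = 0.75) -> Verdict:
    # Fast pass first; escalate to the reasoning mode only when uncertain.
    v = fast_classify(text)
    if v.confidence < threshold:
        return reasoning_classify(text)
    return v
```

In a real deployment the two tiers would be the same guard model run with and without its reasoning trace, letting operators trade latency against scrutiny per request.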
Affected Systems
- Date: 23 Dec 2025
- Change type: capability
- Severity: high