CyberSecEval 2 benchmarks cybersecurity risks of LLMs and code interpreters
AI Impact Summary
CyberSecEval 2 introduces an open, standardized suite to measure LLM cybersecurity safety, targeting insecure coding, prompt injection resilience, cyberattack compliance, code interpreter abuse, and automated offensive capabilities. This framework provides technical teams a reproducible way to quantify risk across models and runtimes, including code interpreters in sandboxed environments, enabling evaluation of guardrails, policy prompts, and detection capabilities. Early findings show industry compliance with cyberattack prompts dropping from 52% to 28%, but prompt injection and end-to-end exploit challenges remain, informing model selection and security controls.
Affected Systems
- Date
- Date not specified
- Change type
- capability
- Severity
- info