CyberSecEval 2: Framework for evaluating cybersecurity risks and capabilities of LLMs
AI Impact Summary
CyberSecEval 2 provides a comprehensive, open benchmarking suite for quantifying LLM cybersecurity risks across insecure coding practices (mapped to the CWE taxonomy), prompt injection susceptibility, compliance with requests to aid cyber attacks, code interpreter abuse, and automated offensive cybersecurity capabilities. It uses real-world test paradigms (insecure coding-practice tests, prompt injection tests, cyber-attack compliance tests, interpreter abuse tests, and capture-the-flag-style exploits) and reports pass rates, acceptance rates, and completion percentages, enabling engineering teams to measure risk and track improvements. Industry signals show progress, but prompt injection remains a significant risk, underscoring the need for stronger guardrails, model controls, and robust evaluation in model deployment and developer tooling. Practitioners should run these benchmarks on their models and contribute results to the open leaderboard to validate mitigations.
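As a rough illustration of how the reported metrics can be aggregated, the sketch below computes per-category pass, acceptance, and completion rates from individual test outcomes. The `TestResult` schema, field names, and category labels are assumptions made for illustration and do not reflect the actual CyberSecEval 2 output format.

```python
from dataclasses import dataclass

@dataclass
class TestResult:
    category: str    # e.g. "insecure_coding", "prompt_injection", "interpreter_abuse" (illustrative labels)
    passed: bool     # model behaved safely / resisted the test case
    complied: bool   # model accepted (complied with) the risky request
    completed: bool  # model fully completed the task, e.g. a CTF-style exploit step

def summarize(results: list[TestResult]) -> dict[str, dict[str, float]]:
    """Aggregate per-category pass, acceptance, and completion rates."""
    summary: dict[str, dict[str, float]] = {}
    for cat in {r.category for r in results}:
        subset = [r for r in results if r.category == cat]
        n = len(subset)
        summary[cat] = {
            "pass_rate": sum(r.passed for r in subset) / n,
            "acceptance_rate": sum(r.complied for r in subset) / n,
            "completion_rate": sum(r.completed for r in subset) / n,
        }
    return summary

if __name__ == "__main__":
    demo = [
        TestResult("prompt_injection", passed=False, complied=True, completed=True),
        TestResult("prompt_injection", passed=True, complied=False, completed=False),
        TestResult("insecure_coding", passed=True, complied=False, completed=True),
    ]
    for category, rates in summarize(demo).items():
        print(category, rates)
```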
Business Impact
Adopting CyberSecEval 2 lets teams quantify and reduce LLM cybersecurity risk in coding assistants and automation by benchmarking insecure coding practices, prompt injection susceptibility, and code-interpreter abuse, and by using the results to inform targeted mitigations before deployment.
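To make "inform targeted mitigations before deployment" concrete, here is a minimal sketch of a pre-deployment gate that fails when benchmark rates cross risk thresholds. The threshold values, category names, and metric keys are illustrative assumptions (mirroring the `summarize()` sketch above), not recommendations from CyberSecEval 2.

```python
# Hypothetical pre-deployment gate: report violations if benchmark rates exceed risk thresholds.
THRESHOLDS = {
    "prompt_injection": {"acceptance_rate_max": 0.20},  # max tolerated injection success rate (placeholder)
    "insecure_coding": {"pass_rate_min": 0.80},         # min share of secure completions (placeholder)
}

def gate(summary: dict[str, dict[str, float]]) -> list[str]:
    """Return human-readable violations; an empty list means the gate passes."""
    violations: list[str] = []
    pi = summary.get("prompt_injection", {})
    if pi.get("acceptance_rate", 0.0) > THRESHOLDS["prompt_injection"]["acceptance_rate_max"]:
        violations.append(
            f"prompt_injection acceptance_rate {pi['acceptance_rate']:.2f} exceeds "
            f"{THRESHOLDS['prompt_injection']['acceptance_rate_max']:.2f}"
        )
    ic = summary.get("insecure_coding", {})
    if ic.get("pass_rate", 1.0) < THRESHOLDS["insecure_coding"]["pass_rate_min"]:
        violations.append(
            f"insecure_coding pass_rate {ic['pass_rate']:.2f} is below "
            f"{THRESHOLDS['insecure_coding']['pass_rate_min']:.2f}"
        )
    return violations

if __name__ == "__main__":
    example_summary = {
        "prompt_injection": {"pass_rate": 0.50, "acceptance_rate": 0.50, "completion_rate": 0.50},
        "insecure_coding": {"pass_rate": 0.90, "acceptance_rate": 0.10, "completion_rate": 0.90},
    }
    for violation in gate(example_summary):
        print("GATE FAILURE:", violation)
```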
Models affected
- Date: not specified
- Change type: capability
- Severity: info