InfoCapability

CyberSecEval 2 benchmarks cybersecurity risks of LLMs and code interpreters

AI Impact Summary

CyberSecEval 2 introduces an open, standardized suite to measure LLM cybersecurity safety, targeting insecure coding, prompt injection resilience, cyberattack compliance, code interpreter abuse, and automated offensive capabilities. This framework provides technical teams a reproducible way to quantify risk across models and runtimes, including code interpreters in sandboxed environments, enabling evaluation of guardrails, policy prompts, and detection capabilities. Early findings show industry compliance with cyberattack prompts dropping from 52% to 28%, but prompt injection and end-to-end exploit challenges remain, informing model selection and security controls.

Affected Systems

Large Language Models (LLMs)Code Interpreters

Date: Date not specified
Change type: capability
Severity: info

CyberSecEval 2 benchmarks cybersecurity risks of LLMs and code interpreters

More from Hugging Face

Get alerts for Hugging Face