Anthropic Launches Bug Bounty Program to Test Claude 3.7 Safety Defenses
Action Required
Anthropic is proactively seeking external validation of its safety measures, reducing potential risks associated with future AI model deployments.
AI Impact Summary
This announcement details a new bug bounty program focused on testing the safety defenses of Claude 3.7 Sonnet, specifically targeting universal jailbreaks against its Constitutional Classifiers. The program is a proactive measure to validate the effectiveness of ASL-3 safeguards, which are intended to mitigate risks from increasingly capable AI models, particularly those related to CBRN weapons. The initiative is a key step under Anthropic's Responsible Scaling Policy and signals the company's commitment to responsible AI development.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: high