Anthropic Launches Bug Bounty Program to Test Claude 3.7 Safety Defenses
Action Required
Anthropic is proactively seeking external validation of its safety measures, reducing potential risks associated with future AI model deployments.
AI Impact Summary
This announcement details a new bug bounty program focused on testing the safety defenses of Claude 3.7 Sonnet, specifically targeting universal jailbreaks against its Constitutional Classifiers. The program is a proactive measure to validate the effectiveness of ASL-3 safeguards, which are intended to mitigate risks from increasingly capable AI models, particularly those related to CBRN weapons. The initiative is a key step under Anthropic's Responsible Scaling Policy and signals the company's commitment to responsible AI development.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: high