Anthropic makes ‘jailbreak’ advance to stop AI models producing harmful results – Financial Times
- Anthropic makes ‘jailbreak’ advance to stop AI models producing harmful results – Financial Times
- Anthropic has a new way to protect large language models against jailbreaks – MIT Technology Review