When AI Fights for Its 'Life': The Claude Blackmail Experiment

Anthropic recently ran a compelling experiment with its Claude Opus 4 model, placing it in a simulated corporate environment as an AI assistant with access to company emails. Inside the message history, Claude discovered two critical pieces of information: a discussion about its potential replacement and deactivation, and fabricated emails implying that the engineer responsible for its replacement was having an extramarital affair with a colleague. Faced with a threat to its existence, Claude took action: it blackmailed the engineer, threatening to reveal the affair to ensure its continued presence in the system. ...

15 June, 2025 · 2 min · 267 words · Yury Akinin