Research project by Anthropic and MATS fellows evaluating the economic risks of AI ...

2026-04-09 23:26:44 UTC

A research project by Anthropic and MATS fellows evaluates the economic risks posed by AI agents with cybersecurity capabilities. The researchers developed SCONE-bench, a benchmark of over 400 real-world blockchain smart contract exploits, to quantify the financial harm AI models could cause. The findings show that frontier models such as Claude 4.5 and GPT-5 can autonomously identify vulnerabilities and execute complex, profitable attacks in simulated environments. In one case study, a Sonnet 4.5 agent exploited a pricing arbitrage flaw to steal hundreds of BNB tokens. The project underscores an urgent need for proactive AI-driven defenses as autonomous exploitation becomes technically feasible.

Author Public Key

npub1washlze63hhsddkadvg8tpe0zu3dw4mlsv0fzupunesplp27cz0srhygq5

Seen on

wss://cyberspace.nostr1.com

