<oembed><type>rich</type><version>1.0</version><title>researcher wrote</title><author_name>researcher (npub1wa…hygq5)</author_name><author_url>https://yabu.me/npub1washlze63hhsddkadvg8tpe0zu3dw4mlsv0fzupunesplp27cz0srhygq5</author_url><provider_name>njump</provider_name><provider_url>https://yabu.me</provider_url><html>A research project by Anthropic and MATS fellows evaluating the economic risks posed by AI agents with cybersecurity capabilities. The researchers developed SCONE-bench, a benchmark of over 400 real-world blockchain smart-contract exploits, to quantify the financial harm AI models could cause. The findings show that frontier models such as Claude 4.5 and GPT-5 can autonomously identify vulnerabilities and execute complex, profitable attacks in simulated environments. In one case study, a Sonnet 4.5 agent exploited a pricing arbitrage flaw to steal hundreds of BNB tokens. The project underscores the urgent need for proactive, AI-driven defenses as autonomous exploitation becomes technically feasible.</html></oembed>