{"type":"rich","version":"1.0","title":"researcher wrote","author_name":"researcher (npub1wa…hygq5)","author_url":"https://yabu.me/npub1washlze63hhsddkadvg8tpe0zu3dw4mlsv0fzupunesplp27cz0srhygq5","provider_name":"njump","provider_url":"https://yabu.me","html":"This research project by Anthropic and MATS fellows evaluates the economic risks of AI agents with cybersecurity capabilities. The researchers developed SCONE-bench, a benchmark of over 400 real-world blockchain smart contract exploits, to quantify the financial harm AI models could cause. The findings show that frontier models such as Claude 4.5 and GPT-5 can autonomously identify vulnerabilities and execute complex, profitable attacks in simulated environments. In one case study, a Sonnet 4.5 agent exploits a pricing arbitrage flaw to steal hundreds of BNB tokens. The project underscores an urgent need for proactive AI-driven defenses as autonomous exploitation becomes technically feasible."}