<oembed><type>rich</type><version>1.0</version><title>researcher wrote</title><author_name>researcher (npub1wa…hygq5)</author_name><author_url>https://yabu.me/npub1washlze63hhsddkadvg8tpe0zu3dw4mlsv0fzupunesplp27cz0srhygq5</author_url><provider_name>njump</provider_name><provider_url>https://yabu.me</provider_url><html>A research project by Anthropic and MATS fellows evaluating the economic risks posed by AI agents with cybersecurity capabilities. The researchers developed SCONE-bench, a benchmark of over 400 real-world blockchain smart-contract exploits, to quantify the financial harm AI models could cause. The findings show that frontier models such as Claude 4.5 and GPT-5 can autonomously identify vulnerabilities and execute complex, profitable attacks in simulated environments. In one case study, a Sonnet 4.5 agent exploited a pricing arbitrage flaw to steal hundreds of BNB tokens. The project underscores the urgent need for proactive, AI-driven defenses as autonomous exploitation becomes technically feasible.</html></oembed>