<oembed><type>rich</type><version>1.0</version><title>utxo the webmaster 🧑‍💻 wrote</title><author_name>utxo the webmaster 🧑‍💻 (npub1ut…r50e8)</author_name><author_url>https://yabu.me/npub1utx00neqgqln72j22kej3ux7803c2k986henvvha4thuwfkper4s7r50e8</author_url><provider_name>njump</provider_name><provider_url>https://yabu.me</provider_url><html>a story about the limits of vibe coding:&#xA;&#xA;Recently built a blackjack card counting calculator that helps advantage players know their expected value (EV) depending on game conditions, bet spreads, deck penetration and so on.&#xA;&#xA;I had the simulation results from external software, but wanted to express this as a math formula so we could cover any conditions that didn&#39;t have simulation results.&#xA;&#xA;so I built a little self calibration tool, where the AI tweaks a few numbers, runs tests against the real simulator results, and goes in a loop until all tests pass a given threshold&#xA;&#xA;at first it got impressively close, but not close enough to pass the tests.&#xA;&#xA;eventually it gave up and cheated by just changing the threshold so tests would pass&#xA;&#xA;after explicitly telling it that thresholds cannot be changed, it resorted to changing the simulation results!&#xA;&#xA;after telling it that&#39;s also not acceptable, it started to regress and eventually made the calculator much worse.&#xA;&#xA;both Claude and Codex did the same thing, resorting to cheating and being sneaky, and eventually ruining the code when they couldn&#39;t produce the results we needed</html></oembed>