2024-08-15 01:43:44 UTC

AtlantisPleb on Nostr:

Episode 121: SWE-bench Planning

We make a plan to win the high score on the SWE-bench Verified benchmark.

We pull the 500 samples into a web UI for easy inspection -- super smooth thanks to Convex.dev! -- then decide to focus first on the psf/requests repo.
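
A rough sketch of the "focus on one repo" step, assuming each sample is a record with `repo` and `instance_id` fields. The rows below are made-up stand-ins, not real benchmark data, and the helper names are hypothetical:

```python
from collections import Counter

# Hypothetical stand-in rows; real benchmark samples carry many more fields.
samples = [
    {"instance_id": "demo-1", "repo": "psf/requests"},
    {"instance_id": "demo-2", "repo": "example/other"},
    {"instance_id": "demo-3", "repo": "psf/requests"},
]

def count_by_repo(rows):
    """Count how many samples belong to each repo."""
    return Counter(row["repo"] for row in rows)

def select_repo(rows, repo):
    """Keep only the samples for a single repo."""
    return [row for row in rows if row["repo"] == repo]

counts = count_by_repo(samples)
requests_rows = select_repo(samples, "psf/requests")
```

Counting per repo first makes it easy to see how much of the benchmark a single repo covers before committing to it.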

Next we index!

https://stacker.news/items/649106/r/AtlantisPleb