2025-07-24 00:01:33 UTC

Andrew Zonenberg on Nostr:

Is anyone aware of publications or research on what sort of bugs LLM-generated or LLM-assisted code tends to have?

Like, we have a huge body of knowledge in the security community about how to audit human-generated codebases for the types of bugs that human developers commonly write.

But we don't have that kind of data yet (AFAIK) for the vibe-coded monstrosities all of us are going to be pentesting soon.

Gut feelings:
* There are some common threads and patterns of errors, but they're very different from the ones in purely human-authored code
* There are a lot of subtle bugs where the code looks good at a glance but is missing some knowledge of interface behavior in another module or component, for example because that module was outside the context window
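
To make the second gut feeling concrete, here is a minimal sketch of the kind of bug I have in mind. Everything in it is hypothetical and made up for illustration (`verify_signature`, `KEY`, and both handlers); it is not taken from any real codebase or study. The assumption is that the "other module" signals failure by returning False, while the caller was written as if it raised an exception.

```python
import hashlib
import hmac

# Hypothetical shared secret, for illustration only.
KEY = b"example-key"


def verify_signature(message: bytes, signature: bytes) -> bool:
    """Hypothetical function from another module.

    Its contract: return False on a mismatch. It never raises.
    """
    expected = hmac.new(KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)


def handle_request_buggy(message: bytes, signature: bytes) -> str:
    # Looks reasonable at a glance, but assumes verify_signature raises on
    # failure. The return value is ignored, so every request is accepted.
    try:
        verify_signature(message, signature)
    except ValueError:
        return "rejected"
    return "accepted"


def handle_request_fixed(message: bytes, signature: bytes) -> str:
    # Honors the actual contract: check the boolean result.
    if not verify_signature(message, signature):
        return "rejected"
    return "accepted"
```

The buggy handler compiles, passes any test that only sends validly signed requests, and fails open on a forged signature, which is the sort of thing that seems easy to miss when auditing call sites without reading the callee.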

Assumptions:
* We don't know exactly which lines in the subject codebase were written by humans and which by stochastic parrots
* The code at least appears to function correctly for a nontrivial fraction of inputs, i.e., it compiles and has been debugged sufficiently that a customer is considering shipping it