2025-08-21 00:39:51 UTC

Mx Amber Alex (she/it) 😷 on Nostr:

the article explains that beautifully: LLMs are trained on large bodies of professional writing, so they sound professional (which, tangent, is the whole fucking problem with them, sounding competent when they're just dice and dictionaries):

> Meta, for instance, torrented more than 80 terabytes of copyrighted books. Just straight-up pirated hundreds of thousands, possibly millions, of books. And because em dashes are so useful to writers, books tend to include them. So the bots, programmed to copy the work their creators pirated, use them too.

> In other words, it’s not accurate to say that the use of em dashes in a text is a sign that the text is AI-generated. It’s more accurate to say that the prevalence of em dashes in AI-generated text is a sign of how reliant the AI companies are on the human writers they want to replace.