2026-04-10 17:24:53 UTC

:seven: on Nostr: I may have just made the stupidest text compression algorithm on the planet and I ...

I may have just made the stupidest text compression algorithm on the planet and I love it.

step 1: have an llm compress the original text into caveman-speak: https://github.com/wilpel/caveman-compression/blob/main/prompts/compression.txt
step 2: run Re-Pair (recursive pairing) compression on the caveman-speak document
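A rough sketch of what the Re-Pair step could look like. This is not the author's code: the post does not say whether pairs are formed over characters, words, or LLM tokens, so splitting on whitespace is an assumption here, and `repair_compress` is a hypothetical name. It is the naive quadratic version (recount all pairs each round), not the linear-time algorithm from the Re-Pair paper.

```python
from collections import Counter


def repair_compress(tokens):
    """Naive Re-Pair: keep replacing the most frequent adjacent pair with a new id.

    Returns (seq, rules, terminals):
      seq       - the compressed string of ids
      rules     - new id -> (left id, right id)
      terminals - original id -> token
    """
    vocab = {}
    seq = [vocab.setdefault(t, len(vocab)) for t in tokens]
    terminals = {i: t for t, i in vocab.items()}
    rules = {}
    next_id = len(vocab)

    while len(seq) > 1:
        # Count adjacent pairs (overlaps like "a a a" are counted naively).
        pair, count = Counter(zip(seq, seq[1:])).most_common(1)[0]
        if count < 2:
            break  # no pair repeats, nothing left to gain
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(next_id)  # replace the pair, left to right
                i += 2
            else:
                out.append(seq[i])
                i += 1
        rules[next_id] = pair
        seq = out
        next_id += 1
    return seq, rules, terminals


# Assumption: whitespace-separated words are the symbols.
seq, rules, terminals = repair_compress("me go hunt me go eat me go hunt".split())
print(seq)    # [5, 4, 3, 5]
print(rules)  # {4: (0, 1), 5: (4, 2)}
```

Note that the rule table has to be stored alongside the id string, so on short inputs the dictionary can eat most of the savings.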

if you want to completely decompress, you then:

step 3: expand the flattened Re-Pair hierarchy back into place in the compressed string of ids
step 4: use the LLM to decompress caveman-speak to standard prose: https://github.com/wilpel/caveman-compression/blob/main/prompts/decompression.txt
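A minimal sketch of step 3, assuming the Re-Pair output is stored as a string of ids plus a rule table mapping each new id to the pair it replaced. The layout (`rules`, `terminals`) and the name `repair_expand` are made up for illustration; the post does not describe its actual format.

```python
def repair_expand(seq, rules, terminals):
    """Expand a Re-Pair id string back into the original tokens.

    seq       - compressed list of ids
    rules     - non-terminal id -> (left id, right id)
    terminals - terminal id -> original token
    """
    out = []
    stack = list(reversed(seq))  # explicit stack, so deep hierarchies don't recurse
    while stack:
        sym = stack.pop()
        if sym in rules:
            left, right = rules[sym]
            stack.append(right)  # push right first so left is expanded first
            stack.append(left)
        else:
            out.append(terminals[sym])
    return out


# Hand-written example grammar (hypothetical data):
terminals = {0: "me", 1: "go", 2: "hunt", 3: "eat"}
rules = {4: (0, 1), 5: (4, 2)}  # 4 = "me go", 5 = "me go hunt"
print(" ".join(repair_expand([5, 4, 3, 5], rules, terminals)))
# me go hunt me go eat me go hunt
```

This part is lossless: the expanded text is exactly the caveman-speak that went in. Any loss comes from the LLM steps on either side.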

I thought this was just a stupid fucking funny idea that I had, but I kept fucking playing with it and it kept fucking working: an average of 40% compression, and a Levenshtein similarity between the original text and the resulting decompressed text consistently in the 98-99% range. Like what the fuck man, this is a fucking meme, stop giving me such good results
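For anyone who wants to check numbers like these, here is one way a Levenshtein similarity could be computed. The post does not say how the distance was normalized; dividing by the longer string's length is an assumption, and `similarity` is a hypothetical helper name.

```python
def levenshtein(a, b):
    """Edit distance (insert, delete, substitute) using two rolling rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


def similarity(original, restored):
    """Assumed normalization: 1 - distance / length of the longer string."""
    if not original and not restored:
        return 1.0
    return 1.0 - levenshtein(original, restored) / max(len(original), len(restored))


print(levenshtein("kitten", "sitting"))  # 3
print(similarity("abc", "abc"))          # 1.0
```

Character-level distance rewards matching wording, not matching meaning, so a high score says the decompressed prose is close to the original text but says little about whether a changed word mattered.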