:seven: on Nostr: I may have just made the stupidest text compression algorithm on the planet and I ...
I may have just made the stupidest text compression algorithm on the planet and I love it.
step 1: have an llm compress the original text into caveman-speak:
https://github.com/wilpel/caveman-compression/blob/main/prompts/compression.txt
step 2: run Re-Pair (recursive pairing) on the document
if you want to completely decompress, you then do:
step 3: expand the flattened Re-Pair hierarchy back into place in the compressed string of ids
step 4: use the LLM to decompress caveman-speak to standard prose:
https://github.com/wilpel/caveman-compression/blob/main/prompts/decompression.txt
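For anyone who wants to poke at steps 2 and 3, here is a rough sketch of Re-Pair over word tokens. It is not the code behind this post: the function names `repair_compress` and `repair_expand`, the whitespace tokenization, and the `("R", n)` rule ids are all assumptions made for illustration, and the LLM steps (1 and 4) are left out entirely.

```python
from collections import Counter


def repair_compress(tokens):
    """Replace the most frequent adjacent pair with a new id until no pair repeats.

    Returns the compressed sequence of ids and the rule table (id -> pair).
    Pair counts are naive: overlapping occurrences such as "a a a" count twice.
    """
    seq = list(tokens)
    rules = {}  # hypothetical rule ids ("R", n) map to (left, right)
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        sym = ("R", next_id)
        next_id += 1
        rules[sym] = pair
        out, i = [], 0
        while i < len(seq):
            # Greedy left-to-right replacement of non-overlapping occurrences.
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(sym)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def repair_expand(seq, rules):
    """Expand every rule id back into the tokens it stands for (step 3)."""
    out = []
    stack = list(reversed(seq))
    while stack:
        sym = stack.pop()
        if sym in rules:
            left, right = rules[sym]
            stack.append(right)  # pushed first so that left is popped first
            stack.append(left)
        else:
            out.append(sym)
    return out


tokens = "fire good fire good me like fire good".split()
compressed, rules = repair_compress(tokens)
restored = repair_expand(compressed, rules)
```

Here `compressed` holds five ids instead of eight words, plus one rule for `("fire", "good")`. The Re-Pair half is lossless by construction; any loss in the full pipeline comes from the LLM round trip.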
I thought this was just a stupid fucking funny idea that I had, but I kept fucking playing with it and it kept fucking working: an average of 40% compression, and a Levenshtein similarity between the original text and the resulting decompressed text consistently in the 98-99% range. Like what the fuck man, this is a fucking meme, stop giving me such good results.
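One way to turn a Levenshtein distance into a similarity percentage like the one quoted above is sketched below. The post does not say how the score was normalized, so dividing by the longer string's length is an assumption, and `levenshtein` and `similarity` are illustrative names rather than anything from the original setup.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


score = similarity("kitten", "sitting")
```

Comparing the original text with the output of step 4 this way gives a number between 0 and 1. Note that it measures character-level overlap, not whether the meaning survived.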
Published at
2026-04-10 17:24:53 UTC
Event JSON
{
"id": "531d8709bef6ccfd4d1eb687932e603db6b37e1cd19af5a579ca311de61c4ff3",
"pubkey": "fb594e1a7dcf278c18c7d623c5a51ec41ab62ac5cea86554d70a9c691565161b",
"created_at": 1775841893,
"kind": 1,
"tags": [
[
"proxy",
"https://dsmc.space/objects/b3a4bd67-8bdf-4ab1-ad85-8a8b8ef2be7d",
"activitypub"
],
[
"L",
"pink.momostr"
],
[
"l",
"pink.momostr.activitypub:https://dsmc.space/objects/b3a4bd67-8bdf-4ab1-ad85-8a8b8ef2be7d",
"pink.momostr"
],
[
"-"
]
],
"content": "I may have just made the stupidest text compression algorithm on the planet and I love it.\r\n\r\nstep 1: have an llm compress the original text into caveman-speak: https://github.com/wilpel/caveman-compression/blob/main/prompts/compression.txt\r\nstep 2: perform RePairing on the document\r\n\r\nif you want to completely decompress you\r\n\r\nstep 3: decompose the flattened repair hierarchy into their places in the compressed string of ids\r\nstep 4: use the LLM to decompress caveman-speak to standard prose: https://github.com/wilpel/caveman-compression/blob/main/prompts/decompression.txt\r\n\r\nI thought this was just a stupid fucking funny idea that I had but I kept fucking playing with it and it kept fucking working for an average of 40% compression, and a Levenshtein-distance loss between the original text and the resulting decompressed text consistently in the 98-99% similarity range. Like what the fuck man this is a fucking meme stop giving me such good results",
"sig": "222c63b94c2312ca144114d0023832ea9ec0f7d8c7fecc1b0e34d3af818354c580058722da5abefa800f785343302fcbfdda456d751abbc60d76254fcfcb97e7"
}