A small number of samples can poison #LLM s of any size \ Anthropic "... with the UK ...

2025-10-14 13:39:21 UTC

A small number of samples can poison #LLM s of any size \ Anthropic
https://www.anthropic.com/research/small-samples-poison

"... with the UK #AI #Security Institute and the Alan Turing Institute, we found that as few as 250 malicious documents can produce a "#backdoor" vulnerability in a large language model—regardless of model size or training data volume. ... 13B parameter model is trained on over 20 times more training data than a 600M model, both can be backdoored by the same small number of poisoned documents"

Author Public Key

npub1xfm6lwev6hjg7qmad53md6mwkqm7u56a28ykkdeqqprvvk7w6jrqadgtfc

Seen on

wss://relay.momostr.pink

Show more details

Thomas Fricke (he/his) on Nostr: A small number of samples can poison #LLM s of any size \ Anthropic "... with the UK ...