2026-04-08 07:26:29 UTC

jonny (nonvenomous) on Nostr: sure they are doing """alignment""" to the models, and maybe they have some more ...

Sure, they are doing """alignment""" to the models, and maybe they have some more sophisticated server-side mitigations. But the fact that the system prompt text is in the package *at all*, rather than being entirely server-side, does the opposite of inspire confidence. Even the system prompt is fine with hacking as long as you go "it's ok I am good".
https://neuromatch.social/@jonny/116325221458366596
nostr:note1s6zzdsvhx0qfyphulff450h07w74y3hl4wkw325nz7v6pzst5h9s3g838j