A personal update -- I'm happy to share that I'll be ...

2025-08-21T14:09:02Z

A personal update -- I'm happy to share that I'll be joining Oxford this fall as an associate professor, as well as a fellow of Jesus College and an affiliate of the Institute for Ethics in AI. I'll also be establishing my AI2050 Fellowship from Schmidt Sciences there. Looking forward to getting started!

Despite extensive safety training, LLMs remain vulnerable to ...

2025-06-10T13:42:03Z

Despite extensive safety training, LLMs remain vulnerable to “jailbreaking” through adversarial prompts. Why does this vulnerability persist? In a new open access paper published in Philosophical Studies, I argue this is because current alignment methods are fundamentally shallow. 🧵 1/13

https://link.springer.com/article/10.1007/s11098-025-02347-3

Nostr notes by Raphaël Millière
