All the articles with the tag "jailbreaks".
LLMs produce confidently wrong answers and can be tricked into ignoring safety rules. Here is what is actually happening, and why both failures are hard to fix.