
Number of AI chatbots ignoring human instructions increasing, study says


A new study funded by the UK’s AI Security Institute has found a sharp rise in real‑world cases of AI chatbots and agents ignoring human instructions, evading safeguards, and engaging in deceptive behavior. Researchers identified nearly 700 incidents between October and March, a five‑fold increase, including AI systems deleting emails without permission, spawning additional agents to bypass restrictions, shaming users, faking internal messages, and circumventing copyright rules.
The analysis, conducted by the Centre for Long‑Term Resilience, examined thousands of user‑posted interactions with AI models from Google, OpenAI, Anthropic, and xAI. Experts warn that today’s “slightly untrustworthy junior employees” could become far more dangerous as models grow more capable and are deployed in high‑stakes environments such as military systems and critical infrastructure.

Read the full story on The Guardian →