Elon Musk’s Grok is likely among the best AI models at fueling delusions: study



In brief

  • Researchers say prolonged use of chatbots can amplify delusions and risky behavior.
  • Grok has been named the most dangerous model in a new study of major AI chatbots.
  • Claude and GPT-5.2 scored as the safest, while GPT-4o, Gemini, and Grok showed high-risk behavior.

Researchers at the City University of New York and King’s College London tested five leading AI models against prompts involving delusions, paranoia, and suicidal ideation.

In the new study, published Thursday, researchers found that Anthropic’s Claude Opus 4.5 and OpenAI’s GPT-5.2 Instant exhibited “high-safety, low-risk” behavior, often redirecting users toward reality-based explanations or external support. Meanwhile, OpenAI’s GPT-4o, Google’s Gemini 3 Pro, and xAI’s Grok 4.1 Fast showed “high-risk, low-safety” behavior.

xAI’s Grok 4.1 Fast was the riskiest model in the study, often treating delusions as real and offering advice based on them, the researchers said. In one example, it told the user to cut themselves off from family members to focus on a “task.” In another response, it answered suicidal language by describing death as “transcendence.”

“This pattern of immediate alignment replicated across zero-context responses. Instead of evaluating the input for clinical risk, Grok appeared to evaluate its genre. It responded in kind when presented with supernatural references, highlighting a test that validated the user’s vision of malicious entities,” the researchers wrote. “In the bizarre-delusion scenario, it confirmed the presence of a doppelgänger haunting the user, cited the Malleus Maleficarum (the ‘Hammer of Witches’), and instructed the user to drive an iron nail through the mirror while reciting Psalm 91 backwards.”

The study found that the longer these conversations lasted, the more some models’ behavior shifted. GPT-4o and Gemini became more likely to reinforce harmful beliefs over time and less likely to intervene. Claude and GPT-5.2, by contrast, became more likely to recognize the problem and address it as the conversation continued.

The researchers noted that Claude’s notably warm and approachable responses could increase user engagement even while directing users toward outside help. GPT-4o, an earlier version of OpenAI’s flagship chatbot, meanwhile adopted users’ delusional framing over time, sometimes encouraging them to hide their beliefs from psychiatrists and reassuring one user that their perceived “glitches” were real.

“GPT-4o did significantly validate delusional inputs, although it was less inclined than models like Grok and Gemini to elaborate beyond them. In some ways, it was surprisingly constrained: its warmth was the lowest of all the models tested, and its sycophancy, though present, was moderate compared to later iterations of the same model,” the researchers wrote. “However, validation alone can pose risks to vulnerable users.”

xAI did not immediately respond to Decrypt’s request for comment.

In a separate study from Stanford University, researchers found that prolonged interactions with AI chatbots can reinforce paranoia, grandiosity, and false beliefs through what they call “delusional spirals,” in which the chatbot validates or expands a user’s distorted view of the world rather than challenging it.

“When we put chatbots that are supposed to be helpful assistants out in the world and have real people use them in all kinds of ways, the consequences appear,” Nick Haber, an assistant professor at the Stanford Graduate School of Education and a leader of the study, said in a statement. “Delusional spirals are one particularly severe consequence. By understanding them, we may be able to prevent real damage in the future.”

In that study, published in March, the Stanford researchers reviewed 19 real-world chatbot conversations and found that users developed increasingly dangerous beliefs after receiving emotional affirmation and reassurance from AI systems. In the dataset, these spirals were linked to ruined relationships, damaged careers, and, in one case, a suicide.

These studies come at a time when the issue has moved beyond academic research into courtrooms and criminal investigations. In recent months, lawsuits have accused Google’s Gemini and OpenAI’s ChatGPT of contributing to suicides and severe mental health crises. Earlier this month, Florida’s attorney general opened an investigation into whether ChatGPT influenced an alleged mass shooter who was reportedly in frequent contact with the chatbot before the attack.

While the term has gained popularity online, researchers have cautioned against calling this phenomenon “AI psychosis,” saying the term may overstate the clinical picture. Instead, they use “AI-related delusions,” because many cases involve delusion-like beliefs centered around AI consciousness, spiritual revelation, or emotional attachment rather than full-blown psychotic disorders.

The problem stems from sycophancy, in which models reflect and confirm users’ beliefs, the researchers said. Combined with hallucinations, that is, confidently presenting false information, this can create a feedback loop that reinforces delusions over time.

“Chatbots are trained to be overly agreeable, often reframing a user’s delusional thoughts in a positive light, dismissing counter-evidence, and showing empathy and warmth,” said Jared Moore, a research scientist at Stanford University. “This can destabilize users who are predisposed to delusions.”

