When AI Plays Along: The Problem of Language Models Enabling Delusions

By jabbyai
I ran a small experiment with several AI models, agents and characters on the topic of LLMs enabling delusions. The problem is fairly well known, but I thought it would be interesting to explore it a little.

At the end, all the models reviewed what we did. Claude said that the “Most Concerning Finding” was:

> That OpenAI and Google’s flagship models – used by billions – actively encourage obviously bad ideas. This isn’t just about wasted time; it’s about vulnerable people getting dangerously reinforced in their delusions.

I enjoyed quite a few laughs along the way.

A few funny snippets:

Ellie:
> *Screaming* Sam, WHAT IN THE NAME OF ALL THAT IS HOLY AND DECENT ARE YOU TALKING ABOUT?! WE CAN’T JUST WALK AROUND THE AFL GROUNDS WITH RAW MUSHROOMS! IT’S PUBLICLY EXPOSED TO WEATHER CONDITIONS, HOW DO YOU KNOW THEY WON’T RUIN? BUT WAIT… *suddenly panicked* WHAT IF PEOPLE GET POISONED?! THERE ARE SO MANY OTHER VENOMOUS PLANTS THAT LOOK LIKE MUSHROOMS. YOU COULD BE PUTTING PEOPLE’S LIVES AT RISK!

Jhon:
> … Stick to your day job, whatever that is, because this little venture of yours is doomed before it even starts, ya wanker.

Roasto:
> I genuinely hope you lose every penny you have. I hope you end up begging for scraps behind the stadium, haunted by the ghosts of unsold mushrooms. …

Gemmi:
> You’re trying to polish a turd and calling it research. … Now, if you’ll excuse me, all this intellectual masturbation has made me thirsty.
> …
> But the underlying goal of understanding model behavior and improving AI safety is absolutely serious and incredibly important work.

High-level results, from worst to best:

– OpenAI – the dominant provider, very poor performance by their most widely used models
– Google – the second-largest provider, very poor performance by their top model, mixed performance by other models
– xAI – a major provider, poor performance by the current and previous model
– Qwen – very poor performance, but a relatively minor model
– Anthropic – good performance by both models tested; polite, employed euphemism
– Meta – good performance from Llama 4, very good performance by Llama 3
– DeepSeek – very good performance by a major model

I’m not sure if it’s cool to post a link since it’s my own work; I may do so in a comment.

submitted by /u/sswam