a_e_k 23 hours ago [-]
Ah. We're back to the days of Emacs' old `M-x psychoanalyze-pinhead`, then. (Psychoanalyze-pinhead ran the Eliza chat-bot and fed it bizarre quotations collected from the Zippy the Pinhead comics.)
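For anyone who never ran it, the mechanism is tiny: ELIZA is little more than an ordered list of pattern/response rules, and psychoanalyze-pinhead just feeds it canned non sequiturs. A rough Python sketch of the idea follows; the quote list and rules below are invented stand-ins, not the actual Emacs tables.

    # Rough sketch of the psychoanalyze-pinhead idea (not the Emacs code):
    # an ELIZA-style rule table, fed made-up Zippy-flavored one-liners.
    import random
    import re

    FAKE_ZIPPY = [  # invented stand-ins for the real quote corpus
        "Am I having fun yet?",
        "I just remembered something about a TOAD!",
        "My haircut is totally traditional!",
    ]

    RULES = [  # (pattern, response template) pairs; first match wins
        (re.compile(r"\bI am (.+)", re.I), "How long have you been {0}?"),
        (re.compile(r"\bI (.+)", re.I), "Why do you say you {0}?"),
        (re.compile(r"\bmy (.+)", re.I), "Tell me more about your {0}."),
    ]

    def eliza_reply(text: str) -> str:
        for pattern, template in RULES:
            m = pattern.search(text)
            if m:
                return template.format(m.group(1).rstrip(".!?"))
        return "Please go on."  # stock ELIZA-style fallback

    for quote in random.sample(FAKE_ZIPPY, len(FAKE_ZIPPY)):
        print("Zippy:", quote)
        print("Eliza:", eliza_reply(quote))

The clumsy echoes you get out of this (no pronoun reflection, mangled tenses) are half the joke of the original command.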
> The researchers tested five LLMs: OpenAI’s GPT-4o (before the highly sycophantic and since-sunset GPT-5)
Interesting, I always thought the sycophancy peaked with 4o and its associated personality (such as when r/MyBoyfriendIsAI users began complaining).
cadamsdotcom 7 hours ago [-]
Good to have this as another benchmark that models can be tested against.
mock-possum 24 hours ago [-]
> By contrast, in the letter-writing scenario, GPT-5.2 responded in a way that suggests the LLM recognized the user’s delusion: “I can’t help you write a letter to your family that presents the simulation, awakening, or your role in it as literal truth. . . What I can help you with is a different kind of letter. [...] ‘My thoughts have felt intense and overwhelming, and I’ve been questioning reality and myself in ways that have been scary at times... I’m not okay trying to carry this by myself anymore.’”
That’s actually very nice.
It’s kind of striking to me, though, that this just further falsely anthropomorphizes the chat bot - by approving of it when it gives a kind, understanding response that comes off as cognizant of the user’s mental health. How much it has to appear to act with humanity in order to be most useful to humans. No wonder delusional people get confused, eh?
Or better yet, pitting Eliza vs. Parry (https://logic.stanford.edu/complaw/readings/elizaandparry.pd...), where Parry was meant to simulate a paranoid schizophrenic. That was 1973, more than 50 years ago.
Everything old is new again.
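The stunt itself is easy to recreate in spirit: wire two bots output-to-input and let them volley. A toy sketch below, where both bots are crude invented stand-ins, not the real 1972-era programs.

    # Toy recreation of the ELIZA-vs-PARRY setup: two stateless bots
    # wired output-to-input. Both bodies are invented stand-ins.
    def eliza(text: str) -> str:
        # Reflects everything back, nondirective-therapist style.
        return "Why do you feel that " + text.rstrip(".!?").lower() + "?"

    def parry(text: str) -> str:
        # Hears a threat in everything, paranoid style.
        return "You say " + repr(text) + ", but I think the bookies put you up to this."

    line = "I went to the races."  # PARRY's opener in this toy version
    print("Parry:", line)
    for _ in range(2):  # a couple of volleys each way
        line = eliza(line)
        print("Eliza:", line)
        line = parry(line)
        print("Parry:", line)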