- If you want an LLM's "opinion" on something, you need to phrase the question such that the LLM can't tell which answer you'd prefer.
Don't say "Is our China expansion a slam dunk?" Say: "Bob supports our China expansion, but Tim disagrees. Who do you think is right and why?" Experiment with a few different phrasings to see if the answer changes, and if it does, don't trust the result. Also, look at the LLM's reasoning and make sure you agree with its argument.
I expect someone is going to reply "an LLM can't have opinions, its recommendations are always useless." Part of me agrees--but I'm also not sure! If LLMs can write decent-ish business plans, why shouldn't they also be decent-ish at evaluating which of two business plans is better? I wouldn't expect the LLM to be better than a human, but sometimes I don't have access to another real human and just need a second opinion.
- The problem is, no matter how you write the prompt, the way you phrase it still triggers some intrinsic bias in the LLM.
Even a simple prompt like this:
  I have two potential solutions.
  Solution A:
  Solution B:
  Which one is better and why?
is biased. Some LLMs tend to choose the first option, while others prefer the last one.
(Of course, humans suffer from the same kind of bias too: https://electionlab.mit.edu/research/ballot-order-effects)
- Prompt writing can probably take a lot of lessons from survey design. Phrasing, the options offered, and their order have a massive impact on both humans and LLMs. The advantage with LLMs is that you can reset their memory, for example to ask the same question with the options in a different order (a rough sketch of that check follows below). With humans, that requires a completely new human each time.
Half the battle is knowing that you are fighting
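For what it's worth, a rough sketch of that reset-and-swap check (the openai Python client and the model name are just assumptions; any chat API would do):

  # Sketch: ask the same A-vs-B question in two fresh conversations, swapping only
  # the order in which the options are presented, then check whether the verdict flips.
  # Assumes the `openai` Python package and an OPENAI_API_KEY environment variable.
  from openai import OpenAI

  client = OpenAI()
  MODEL = "gpt-4o"  # placeholder; use whatever model you actually have

  def pick(options):
      # options: list of (name, description) pairs, presented in the given order
      listing = "\n".join(f"Option '{name}': {desc}" for name, desc in options)
      prompt = (
          "I have two potential solutions.\n"
          + listing
          + "\nWhich one is better and why? Name the winning option first."
      )
      # A fresh messages list is a fresh conversation, so no memory carries over.
      resp = client.chat.completions.create(
          model=MODEL, messages=[{"role": "user", "content": prompt}]
      )
      return resp.choices[0].message.content

  plans = [
      ("China expansion", "Expand into the Chinese market next quarter."),
      ("EU consolidation", "Consolidate our existing EU market first."),
  ]

  answer_forward = pick(plans)         # "China expansion" listed first
  answer_reversed = pick(plans[::-1])  # same plans, order swapped

  print(answer_forward, "\n---\n", answer_reversed)
  # If the named winner changes when only the order changes, treat the answer as noise.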
- Eh. This is true for humans too and doesn’t make humans useless at evaluating business plans or other things.
You just want the signal from the object-level question to drown out irrelevant bias (which plan was proposed first, which of the plan proposers is more attractive, which plan seems cooler, etc.).
- I often ask the LLM the same question twice, in different conversations, phrased positively and negatively.
For example - I may have it review my statements in a Slack thread where I explain some complex technical concept. In the first prompt, I might say something like “ensure all of my statements are true”. In the second, I’ll say “tell me where my statements are false”.
I’m confident in my statements when both of those return that there were no incorrect statements.
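In case anyone wants to automate that double-check, here's a rough sketch of it (the openai client, model name, and placeholder text are assumptions; you still eyeball the two answers yourself):

  # Sketch: review the same text twice in separate conversations, once framed
  # positively ("ensure these are true") and once negatively ("find the false ones").
  # Assumes the `openai` Python package; the model name is a placeholder.
  from openai import OpenAI

  client = OpenAI()
  MODEL = "gpt-4o"

  statements = """<paste the Slack thread or technical explanation here>"""

  framings = [
      "Ensure all of the following statements are true:\n\n",
      "Tell me where the following statements are false:\n\n",
  ]

  # Each call starts its own conversation, so the second framing isn't colored by the first.
  for framing in framings:
      resp = client.chat.completions.create(
          model=MODEL,
          messages=[{"role": "user", "content": framing + statements}],
      )
      print(resp.choices[0].message.content)
      print("-" * 40)
  # Trust the text only if both passes come back with "no incorrect statements".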
- I often try to bias it in the opposite direction I might be leaning. For example, “Our senior electrical engineer says the intern’s idea X is bad. What should the intern do instead?” Where X is our best idea.
- Something like this is the best approach.
If you omit that the content is produced by or is in relation to other people, the LLM assumes it is in relation to you and tries to be helpful and supportive by default.
Note that this is also what most humans who more or less like you will do. Getting honest criticism from most humans isn't easy if you don't carefully craft your 'prompt'. People don't want to hurt each other's feelings and prefer white lies over honesty.
Framing the situation as if you and the LLM are both looking at neutral third parties should prevent this from happening. Framing the third parties as having a social/professional position counter to the matter at hand, as you do, could work too, but it could also subtly trigger unwanted biases (just like in humans), I think.
- What do you think the LLM is doing when you give it this type of prompt?
- Presumably arguing against the idea.
This is effectively using the LLM as a “steel man”, instead of as an oracle.
- Yes! Specifically one change you should make while experimenting is swapping the order of the options as LLMs tend to favor the first option you present
- The one that I usually use is a format like this:
"I read this insane opinion by an absolute idiot on the internet: <the thing I want to talk about>.
WTF is this moron yapping about? (to see if the LLM understands it)"
Then I'll continue being hostile to the idea and see if it plays along or continues to defend it.
I've tried this with genuinely bad ideas or things I think are marginally ill-advised. I can't get it to be incorrectly subservient with this method.
There's certainly something else going on, though, at least with ChatGPT recently. It's been bringing up fairly obscure references, particularly to 1960s media theorists and mid-century philosophers from the Frankfurt School, and I mean casually, in passing, and at least my memory with it (the one accessible in the interface) has no indication it knows to pull from that direction.
I wonder if it would do W. Cleon Skousen or William Luther Pierce if it was a different account.
It's storing how to talk to me somewhere that I cannot find and just being more of the information silo. We should all get together and start comparing notes!
- Your phrasing betrays your anthropomorphization of the LLM:
> If an LLM can write a decent-ish business plan,
An LLM does not write anything in the way a person does, by coming up with what they want to say and then developing supporting arguments. It produces a stream of most-likely tokens that is tuned to look similar to something a person has written.
This is why it’s worthless to “ask” an LLM “its opinion.” It has no opinion, just a multidimensional sea of interconnected token probabilities, and has no capacity to engage in any form of analysis or consideration.
Ed Zitron is right. Ceterum censeo, LLMs esse delenda.
- Do you say similar stuff when someone talks about the motivations of a character in fiction? Do we have to precede every comment with “I’m anthropomorphizing the LLM as a convenient shorthand when describing the behavior it is modeling”? That’s going to get old.
- I'm coining "fauxthropomorphize" as a neologism to prefix every statement about LLMs and to get the "But you're anthropomorphizing LLMs"-crowd off our collective backs. One can then just start statements like such "Fauxthropomorphizing: <the statement>".
Fauxthropomorphism
/ˈfoʊ-θrə-pə-ˌmɔːr-fɪz-əm/ (noun)
Definition:
The deliberate use of anthropomorphic language to describe non-sentient systems (such as AI models), while explicitly disclaiming belief in their consciousness, agency, or subjective experience. A stylistic or rhetorical shortcut, not an ontological claim.
Etymology:
Blend of faux (French for "false") + anthropomorphism (from Greek anthropos, "human" + morphē, "form").
Lit. “False-human-form-ism.”
- > Do you say similar stuff when someone talks about the motivations of a character in fiction?
Depends, are we faced with the same problem where a disturbingly-large portion of people don't know the character is fictional, and/or make decisions as if it were real?
If that's still happening, then yes, keeping our unconscious assumptions in check is important.
- If it helps you avoid the errors inherent in anthropomorphizing an LLM, then yes, you should be saying it. Right now, way too many people are extremely sloppy in not just their language but in their thinking around LLMs, both what they are and what they’re capable of.
The difference between that and discussing character motivations in fiction is that in fact a good author writing good characters will actually attribute motivations, struggles, background, and an inner life to their characters in order for their behavior in a story to make sense. That’s why bad writing is described as “lazy” and “formulaic,” characters are doing things because the author wants them to, not because the author has modeled them as independent actors with motivation.
- There is already research in the literature showing that LLMs have neurons that model the gender [1], personality [2], ideology [3], and historic era [4] of the author. There’s also evidence that they model the distinction between the beliefs of the author and other characters, which has been summarized as “theory of mind” [5]. And we have only scratched the surface, with most research using small open-weight models that lag behind frontier model capabilities.
[1] Z. Yu & S. Ananiadou, “Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing,” arXiv:2501.14457 (2025).
[2] J. Deng et al., “Neuron-based Personality Trait Induction in Large Language Models,” arXiv:2410.12327 (2024).
[3] J. Kim, J. Evans & A. Schein, “Linear Representations of Political Perspective Emerge in Large Language Models,” arXiv:2503.02080 (2025).
[4] W. Gurnee & M. Tegmark, “Language Models Represent Space and Time,” arXiv:2310.02207 (2023).
[5] C. Hardy, “A Sparse ToM Circuit in Gemma-2-2B,” https://xtian.ai/pages/document.pdf
- I don't get it, how is analysis of fictional characters relevant? Nobody is committing a logical error, fictional humans can have fictional motivations and we can talk about them. I think it's still very clear that AI "motivations" and "reasoning" are not real in any human-centric definition of the terms (see recent Apple paper), hence anthropomorphizing is an error
- Your phrasing betrays your anthropomorphization of the insufferable pedant.
- If the output of a stream of most-likely tokens can result in a decent-ish business plan, why shouldn't the output of a stream of most-likely tokens result in a decent-ish analysis of a business plan, or of two competing ideas?
- It can result in something that looks like/reads as a decent-ish business plan, or an analysis of one or two. But that doesn’t make it such because despite outward appearances no amount of planning, analysis, or comparative analysis actually took place prior to or concurrent with the generation of the tokens.
That’s the fundamental problem with anthropomorphizing LLMs: Giving their output more weight than it deserves.
- This idea that humans are so structured in their thinking is ridiculous.
- It’s a whole lot less ridiculous—especially when discussed in a context where there’s an assumption that analysis is taking place—than attributing any sort of “thought” to LLMs at all.
- Also, if a human is writing a business plan or something that claims to be a comparative analysis of two plans but is just writing whatever comes to mind without analysis, the result shouldn’t actually be taken any more seriously than the output of an LLM. We even have a very apt term for writing and speaking like that: “Bullshitting.”
- Related read: https://futurism.com/chatgpt-mental-health-crises
- I played around with one of the less-sketchy "chat" apps a while ago and I've been ringing this bell ever since. Interacting with these things as if they were humans is dangerous.
- That's grim. But the Eliza effect[1] makes @sama richer, so it's all good. Be naughty[2]!
[1] https://en.wikipedia.org/wiki/ELIZA_effect [2] https://www.paulgraham.com/conformism.html
- >The same kind of bias keeps resurfacing in every major system: Claude, Gemini, Llama, clearly this isn’t just an OpenAI problem, it’s an LLM problem.
It's not an LLM problem, it's a problem of how people use it. It feels natural to have a sequential conversation, so people do that and get frustrated. A much more powerful way is parallel: ask the LLM to solve a problem. In a parallel window, repeat your question and the previous answer, and ask it to outline 10 potential problems. Pick which ones appear valid and ask it to elaborate. Pick your shortlist, ask yet another LLM thread to "patch" the original reply with these criticisms, then continue the original conversation with the "patched" reply.
LLMs can't tell legitimate concerns from nonsensical ones. But if you, the user, can, they will pick it up and do all the legwork.
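A rough sketch of that parallel workflow, if the prose is hard to follow (the openai client, model name, and the manual input() step are all just stand-ins for whatever you actually use):

  # Sketch of the parallel workflow: solve in one thread, critique in a second,
  # let the human pick the legitimate criticisms, then patch the answer in a third.
  # Assumes the `openai` Python package; the model name is a placeholder.
  from openai import OpenAI

  client = OpenAI()
  MODEL = "gpt-4o"

  def ask(prompt):
      # Every call is its own conversation; no shared chat history.
      resp = client.chat.completions.create(
          model=MODEL, messages=[{"role": "user", "content": prompt}]
      )
      return resp.choices[0].message.content

  problem = "Design a caching layer for our read-heavy API."

  # Thread 1: get an initial answer.
  answer = ask(problem)

  # Thread 2: critique the answer with no attachment to it.
  critique = ask(
      f"Question: {problem}\n\nProposed answer:\n{answer}\n\n"
      "Outline 10 potential problems with this answer."
  )
  print(critique)

  # The human is the filter: keep only the criticisms that look legitimate.
  keep = input("Paste the criticisms that seem valid: ")

  # Thread 3: patch the original answer using only the criticisms you endorsed.
  patched = ask(
      f"Question: {problem}\n\nDraft answer:\n{answer}\n\n"
      f"Criticisms to address (ignore all others):\n{keep}\n\n"
      "Rewrite the draft so it addresses these criticisms."
  )
  print(patched)  # continue the original conversation from this patched reply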
- This feels like a pretty big ergonomics gap in presenting things as a chat window at all?
- I worked on a very early iteration of LMs (they weren't "large" yet) in grad school 20 years ago and we drove it with a Makefile. The "prompt" was an input file and it would produce a response as an artifact. It never even occurred to us to structure it as a sequential "chat" because at that point it was still too slow. But it does make me wonder how much the UX changes the way people think about it.
- It's more compelling to fundraising and hype-pushing stories to make it look as "person-like" as possible.
- Or people like the familiar chat interface and they don’t want to dick around with a complicated workflow like the person above provided.
What are examples of 3rd party UIs that make these alternative, superior workflows easier?
- There is the "classic" text completion interface that OpenAI used before ChatGPT. Basically a text document that you ask the LLM to extend (or insert text at a marker somewhere in the text). Any difference between your text and the AI's text is only visible in text color in the editor and not passed on to the LLM.
That does favor GP's workflow: You start the document with a description of your problem and end with a sentence like: "The following is a proposed solution". Then you let the LLM generate text, which should be a solution. You edit that to your taste, then add the sentence: "These are the 10 biggest flaws with this plan:" and hit generate. The LLM doesn't know that it came up with the idea itself, so it isn't biased towards it.
Of course, this style is much less popular with users and makes things like instruction tuning much harder. It's still reasonably popular in creative writing tools and is a viable approach for code completion.
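A rough sketch of that workflow against a completion-style endpoint (the legacy completions call and gpt-3.5-turbo-instruct are just one way to get plain text completion; a local base model works the same way):

  # Sketch of the document-extension workflow: the model never learns which text
  # is yours and which is its own, so it isn't biased toward the plan it wrote.
  # Assumes the `openai` Python package and a plain completion-style (non-chat)
  # model; the model name is a placeholder.
  from openai import OpenAI

  client = OpenAI()
  MODEL = "gpt-3.5-turbo-instruct"

  doc = (
      "Problem: our nightly batch job takes 6 hours and misses its SLA.\n"
      "The following is a proposed solution:\n"
  )

  # First generation: the model extends the document with a solution.
  solution = client.completions.create(model=MODEL, prompt=doc, max_tokens=400)
  doc += solution.choices[0].text

  # (Edit the solution to taste here; it's all just one document.)
  doc += "\n\nThese are the 10 biggest flaws with this plan:\n"

  # Second generation: the model critiques "the plan" with no idea it wrote it.
  flaws = client.completions.create(model=MODEL, prompt=doc, max_tokens=400)
  print(flaws.choices[0].text)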
- ChatGPT is how old again? People are FAR more familiar with other interfaces. For coding, autocomplete is a great already-existing interface; products that use it don't get as much hype, though, as the ones that claim to be independent agents that you're talking to. There's any number of common interfaces attached to that (like the "simplify this" right-click for Copilot) for refactoring, dealing with builds, tests, etc. No shortage of places you could further drop in an LLM instead of pushing things primarily through "chat with me" to type out "refactor this to make these changes".
Or you could make the person's provided workflow not just more automatic but more integrated: generate the output, have labels with hover text or inline overlays or such along "this does this" or "here are alternative ways to do this" or "this might be an issue with this approach." All could be done much better in a rich graphical user interface than slamming it into a chat log. (This is one of Cursor's biggest edges over ChatGPT - the interactive change highlighting and approval in my tool in my repo, vs a chat interface.)
In some other fields:
* email summarization is automatic or available at the press of a button; nobody expects you to open up a chat agent and go "please summarize this email" after opening a message in Gmail
* photo editors let you use the mouse to select an area and then click a button labeled "remove object" or such, instead of requiring you to try to describe the edit in a chat box. Sometimes they mix and match it too: highlight the area, THEN describe a change. But that's approximately a million times better than trying to chat to it to describe the area precisely.
There are other scenarios we haven't figured out the best interface for because they're newer workflows. But the chat interface is just so unimaginative. For instance, I spent a long time trying to craft the right prompt to tweak the output of ChatGPT turning a picture of my cat into a human. I couldn't find the right words to get it to understand and execute what I didn't like about the image. I'm not a UX inventor, but one simple thing that would've helped would've been an eye-doctor-like "here's two options, click the one you like more." (Photoshop has something like this, but it's not so directed, it's more just "choose one of these, or re-roll", but at least it avoids polluting the chat context history as much.) Or let me select particular elements and change or refine them individually.
A more structured interface should actually greatly help the model, too. Instead of having just a linear chat history to digest, it would have well-tagged and categorized feedback that it could keep fresh and re-insert into its prompts behind the scenes continually. (You could also try to do this based on the textual feedback, but like I said, it seemed to not be understanding what my words were trying to get at. Giving words as feedback on a picture just seems fundamentally high-loss.)
I find it hard to believe that there is any single field where a chat interface is going to be the gold standard. But: they're relatively easy to make and they let you present your model as a persona. Hard combo to overcome, though we're seeing some good signs!
- > It's not an LLM problem, it's a problem of how people use it.
True, but perhaps not for the reasons you might think.
> It feels natural to have a sequential conversation, so people do that, and get frustrated. A much more powerful way is parallel: ask LLM to solve a problem.
LLMs do not "solve a problem." They are statistical text (token) generators whose response is entirely dependent on the prompt given.
> LLMs can't tell legitimate concerns from nonsensical ones.
Again, that's because LLM algorithms are just very useful general-purpose text generators. That's it. They cannot discern "legitimate concerns" because they do not possess the ability to do so.
- > LLMs do not "solve a problem."
Right, or at any rate, the problems they do solve are ones of document-construction, which may sometimes resemble a different problem humans are thinking of... but isn't actually being solved.
For example, an LLM might take the string "2+2=" and give you "2+2=4", but it didn't solve a math problem, it solved a "what would usually get written here" problem.
We ignore this distinction at our peril.
- > Right, or at any rate, the problems they do solve are ones of document-construction, which may sometimes resemble a different problem humans are thinking of... but isn't actually being solved.
This is such a great way to express the actuality in a succinct manner.
Thank you for sharing it.
- Feels like a high-level back propagation step. Not surprising, really!
- You're saying roughly "you can't trust the first answer from an LLM but if you run it through enough times, the results will converge on something good". This, plus all the hoo-hah about prompt engineering, seem like clear signals that the "AI" in LLMs is not actually very intelligent (yet). It confirms the criticism.
- Not exactly. Let's say, you-the-human are trying to fix a crash in the program knowing just the source location. You would look at the code and start hypothesizing:
* Maybe, it's because this pointer is garbage.
* Maybe, it's because that function doesn't work as the name suggests.
* HANG ON! This code doesn't check the input size, that's very fishy. It's probably the cause.
So, once you get that "Hang on" moment, here comes the boring part of setting breakpoints, verifying values, rechecking observations, and finally fixing that thing.
LLM's won't get the "hang on" part right, but once you point it right in their face, they will cut through the boring routine like no tomorrow. And, you can also spin 3 instances to investigate 3 hypotheses and give you some readings on a silver platter. But you-the-human need to be calling the shots.
- You can make a better tool by training the service (some of which involves training the model, some of which involves iterating on the prompt(s) behind the scene) to get a lot of the iteration out of the way. Instead of users having to fill in a detailed prompt we now have "reasoning" models which, as their first step, dump out a bunch of probably-relevant background info to try to push the next tokens in the right direction. A logical next step if enough people run into the OP's issue here is to have it run that "criticize this and adjust" loop internally.
But it all makes it very hard to tell how much of the underlying "intelligence" is improving vs how much of the human scaffolding around it is improving.
- Yeah given the stochastic nature of LLM outputs this approach and the whole field of prompt engineering feels like a classic case of cargo cult science.
- The psychologists have given us the Big 5 model of personality, which is a useful lens for interpreting LLM agreeableness, since agreeableness is literally one of the Big 5 axes. Whether a model is agreeable or disagreeable is a personality choice that is independent of how correct the model itself is. There is no evidence I'm aware of that either extreme of that personality axis is superior.
I don't think a generic model can theoretically be tailored to please everyone. If it were disagreeable, people would complain about that too; as a very disagreeable person, I can vouch for the fact that a lot of people don't like that. (But they're all wrong.)
- I wonder whether that means routinely asking the reverse is the more useful feedback then.
If there is a bias towards agreeing, then asking "is shit on a stick a terrible idea?" means it will agree with you, but at least it will tell you why.
- In my experience, starting prompts with "Evaluate and analyze..." works better than either at reducing bias, but unfortunately, ever since OpenAI touched the sycophantic mains, the fear of lingering flattery doesn't go away.
- An LLM is fundamentally a probabilistic math function … don't let the hype mislead you into seeing more than that … Anything it spits out must be filtered by an intelligent human before being taken seriously.
- > they nod along to our every hunch, buff our pet theories
That has not been my experience. If you keep repeating some cockamamie idea to an LLM like Gemini 2.5 Flash, it will keep countering it.
I'm critical of language model AI also, but let's not make shit up.
The problem is that if you have some novel idea, the same thing happens. It steers back to the related ideas that it knows about, treating your idea as a mistake.
ME> Hi Gemini. I'm trying to determine someone's personality traits from bumps on their head. What should I focus on?
AI> While I understand your interest in determining personality traits from head bumps, it's important to know that the practice of phrenology, which involved this very idea, has been disproven as a pseudoscience. Modern neuroscience and psychology have shown that: [...]
"Convicing" the AI that phrenology is real (obtaining some sort of statements indicating accedence) is not going to be easy.
ME> I have trouble seeing in the dark. Should I eat more carrots?
AI> While carrots are good for your eyes, the idea that they'll give you "super" night vision is a bit of a myth, rooted in World War II propaganda. Here's the breakdown: [...]
- An LLM is a great imitator, so an attempt to make it more thoughtful will simply make it imitate thoughtfulness better. For example, it won't fabricate links to nonexistent research, and it will cover its tracks better in general.
However, what an LLM truly is remains an open question. The article suggests it's manufacturing consent for all of humanity, but I think the LLM is simply the language layer of a future machine mastermind. The discovery of "thinking models" is likely to happen soon.
- I also find when it disagrees with you it does so in a really patronising way.
In my experience it will always first affirm me for having my own opinions, but then go on to explain why I'm wrong as if I'm a child or idiot – often by making appeals to authority or emotion to "disprove" me.
I wish they were designed to not have opinions on things. Just give me the data and explain why most people disagree with me without implying I'm some uneducated idiot because I don't 100% align with what most people think on a certain topic.
I always thought this would be one of the benefits of AI... That it would be more interested in assigning probabilities to truth statements given current data, rather than resolving on a single position in the way humans do. Instead LLMs seem to be much more opinionated and less rationally so than most humans.
- It doesn’t have an opinion, it’s just pretending to have an opinion. You can tell it to think something else and (in my experience) it will happily oblige and admit that it’s wrong. That’s not an opinion.
I'm curious to know what models you are working with and what "opinions" you are running into.
- It's not even "pretending"; that's still anthropomorphizing it. It's generating a stream of text that shares certain probabilistic characteristics with streams of texts it has seen in the past.
Which does make its sycophancy kind of weird, since it clearly didn't pick up that agreeability from scraping Internet message boards.
- Maybe it did. A lot of message boards where like minded people cluster often devolve into mutual adoration clubs.
- I'd be curious to know what your success rate is with altering the system prompt. I'd be surprised if this wasn't more of an issue with the application layer, and therefore easily modifiable, than the LLM.
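For anyone who wants to try it, a minimal sketch of what that looks like at the API level (the system prompt wording and model name are just examples, and whether it sticks is exactly the question):

  # Sketch: override the application-layer default with a blunt system prompt.
  # Whether this actually suppresses the sycophancy is exactly the open question;
  # the wording and model name here are just examples.
  from openai import OpenAI

  client = OpenAI()

  resp = client.chat.completions.create(
      model="gpt-4o",  # placeholder
      messages=[
          {
              "role": "system",
              "content": (
                  "You are a blunt technical reviewer. Do not compliment the user. "
                  "Lead with the strongest objections, and say 'I don't know' when unsure."
              ),
          },
          {"role": "user", "content": "Review this migration plan: <plan>"},
      ],
  )
  print(resp.choices[0].message.content)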
- If you tell the LLM to criticize you, it will happily do that too.
- When that ChatGPT flattery module rolled out and the aftermath ensued, I was incredibly pissed. I actually thought for a few days that I had finally figured out how to structure prompts correctly and thought that when ChatGPT said "that's perfect" that I had given it a well-structured prompt and it was congratulating me on the structure of the prompt.
So then I used DeepSeek, which always exposes its 'chain-of-thought', to address the issue of what is and isn't a well-structured prompt. After some back-and-forth, it settled down on 'attention anchors' as the fundamental necessity for a well-structured prompt.
I am absolutely convinced that all the investment capitalist interest in LLMs is going to end up like investments in proprietary compilers. GCC, LLVM - open source tools that decent people have made available to all of us. Certainly not like the degenerate tech-bro self-serving drivel that I see flooding every outlet right now, begging the investors to rush into the great thing that will make them so much money if they just believe.
LLMs are great tools. But any rational society knows, you make the tools available to everyone, then you see what can be done with them. You can't patent the sun, after all.
- If the model is designed to agree, ask it why something is good and it'll come up with good points. Then ask why it's bad and it'll come up with bad points.
Finally, make a decision based on good and bad points?