Why LLMs Can’t Think Like Experts: The Adversarial Reasoning Gap
Have you ever wondered why a ChatGPT-drafted email seems perfectly polite yet gets completely ignored? Or why an AI-generated contract looks professional but a lawyer spots vulnerabilities in seconds? The answer reveals a fundamental limitation in how today’s LLMs think—and it’s not about intelligence.
The Core Insight
A fascinating piece from Latent Space crystallizes something practitioners have felt intuitively: LLMs have word models, but experts have world models.
The difference isn’t subtle. When you ask an LLM to draft a Slack message to a busy designer, it produces something textbook-polite: “Hi Priya, when you have a moment, could you please take a look at my files? No rush at all.”
A friend in finance reads this and thinks: “Perfect. Polite, respectful.”
A veteran colleague who’s worked with Priya for three years sees something entirely different: “You just told her to deprioritize you. ‘No rush’ means ‘not urgent.’ She’s triaging 15 messages with actual deadlines, and yours just sank to the bottom.”
The experienced colleague didn’t just evaluate the text—they simulated Priya’s workload, her triage heuristics, what ambiguity costs her, and how “no rush” gets interpreted under pressure. That’s the world model in action.
Why This Matters
This isn’t an edge case. It’s the core reason AI struggles in any domain where the environment fights back:
- Negotiations: Your counterparty probes, adapts, and exploits predictability
- Security: Attackers model your defenses and evolve their strategies
- Legal work: Opposing counsel reads your contract looking for weaknesses
- Trading: Markets shift specifically because you acted on a signal
The critical insight is that RLHF (Reinforcement Learning from Human Feedback) optimizes for a different target entirely. It trains models to produce outputs that human raters approve of in isolation—helpful, polite, balanced. These qualities score well in one-shot evaluations but systematically under-weight second-order effects.
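It helps to write that target down. Below is the generic textbook RLHF formulation (standard notation, not any particular vendor's pipeline): a reward model is fit to human preferences between pairs of single responses, and the policy is then tuned to maximize that per-response reward while a KL penalty keeps it close to the reference model.

```latex
% Reward model: fit to pairwise human preferences over single responses
\mathcal{L}(\phi) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\Big[\log \sigma\big(r_\phi(x, y_w) - r_\phi(x, y_l)\big)\Big]

% Policy objective: maximize per-response reward, stay close to the reference model
\max_\theta \;\; \mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi_\theta(\cdot \mid x)}\big[r_\phi(x, y)\big]
\;-\; \beta\, D_{\mathrm{KL}}\!\big(\pi_\theta(\cdot \mid x)\,\|\,\pi_{\mathrm{ref}}(\cdot \mid x)\big)
```

Nothing in either term depends on what happens after the response is sent: how a counterparty reacts, whether the recipient triages the message down, or how an adversary exploits the pattern. The training signal stops at the rater's one-shot judgment, which is exactly where second-order effects begin.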
Consider poker versus chess. Chess is a perfect-information game: your best move doesn’t change based on who your opponent is. But poker? You don’t just calculate optimal plays; you model who your opponent is, what they know, what they think you know, and what they’re doing with that asymmetry.
Meta’s Pluribus poker AI cracked this by calculating how it would act with every possible hand, then balancing its strategy so opponents couldn’t extract information from its behavior. Current LLMs can’t do this—they’re readable, their cooperative bias is detectable, and they don’t adjust based on being observed.
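The balancing idea is easier to see in a toy model. The sketch below is a hypothetical illustration in Python, not Pluribus’s actual algorithm: the game, the hand distribution, and the bluff rate are all invented for the example. It contrasts a readable, deterministic policy, whose action reveals its hand, with a mixed policy that also bets some weak hands, so the same action stays consistent with many holdings.

```python
import random

# Toy illustration (not Pluribus): a player holds a "strong" or "weak" hand
# with equal probability and chooses to "bet" or "check".
# A readable policy bets only with strong hands, so the action reveals the hand.
# A balanced policy also bluffs with some weak hands, so the same action
# leaks far less information to an observing opponent.

STRONG_PROB = 0.5  # prior probability of holding a strong hand (assumed)

def readable_policy(hand: str) -> str:
    """Bets if and only if the hand is strong: fully predictable."""
    return "bet" if hand == "strong" else "check"

def balanced_policy(hand: str, bluff_rate: float = 0.5) -> str:
    """Always bets strong hands, but also bluffs a fraction of weak hands."""
    if hand == "strong":
        return "bet"
    return "bet" if random.random() < bluff_rate else "check"

def posterior_strong_given_bet(policy, trials: int = 100_000) -> float:
    """Estimate P(hand is strong | player bet) by simulation: the opponent's
    best inference after observing a bet. 1.0 means the bet gives the hand away."""
    strong_bets = total_bets = 0
    for _ in range(trials):
        hand = "strong" if random.random() < STRONG_PROB else "weak"
        if policy(hand) == "bet":
            total_bets += 1
            strong_bets += hand == "strong"
    return strong_bets / total_bets

if __name__ == "__main__":
    random.seed(0)
    print(f"readable: P(strong | bet) ≈ {posterior_strong_given_bet(readable_policy):.2f}")
    print(f"balanced: P(strong | bet) ≈ {posterior_strong_given_bet(balanced_policy):.2f}")
```

With the readable policy the estimated P(strong | bet) is 1.0, so the bet gives the hand away; with the balanced policy it drops to about 0.67, and the opponent learns far less from watching the action. Pluribus applies the same principle at full scale, reasoning over every hand it could plausibly hold before acting.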
Key Takeaways
Simulation depth matters more than raw intelligence. A model might write eloquently about competitive dynamics without actually being able to act on them.
Detection is the bottleneck. Even with good prompting, models lack the default ontology to distinguish “cooperative task” from “task that looks cooperative but will be evaluated adversarially.”
The causal knowledge isn’t in the training data. Expert knowledge often exists in outcomes that were never written down—the deals that fell through, the contracts that got exploited, the emails that got ignored.
Humans can model the LLM. The LLM can’t model being modeled. This asymmetry is exploitable, and no amount of “think strategically” prompting fixes it.
Looking Ahead
This research points toward a new frontier in AI development—multiagent world models that can track theory of mind, anticipate reactions, and handle information asymmetry. Both DeepMind and the team behind ARC-AGI are framing these capabilities as games in their benchmarks.
The age of pure scaling may be giving way to the age of research. Making AI systems that can truly reason adversarially isn’t just about more compute—it requires fundamentally different training signals. The expert’s advantage isn’t mysterious; it was earned through thousands of iterations where the environment punished predictability.
Until AI systems can learn the same way—by taking actions in environments where other agents adapt and punish predictability—that gap will persist. And perhaps that’s reassuring: the skills that come from years of navigating complex human systems aren’t going to be automated away by next quarter’s model release.
Based on analysis of “Experts Have World Models. LLMs Have Word Models.” from Latent Space