# Classical AI Solved Your LLM’s Problems in 1979
Every failure mode I’ve documented in this series — stale beliefs, contradictory agents, cascading hallucinations, lost justifications — was identified and formalized by classical AI researchers decades ago. They built systems to address these problems. They published the theory. And almost nobody building multi-agent LLM systems seems to know about any of it.
## The Five Failure Modes
Over months of running a multi-agent research programme, I observed five categories of belief failure:
| Failure | What Happens | Classical Framework |
|---|---|---|
| Staleness | Role definitions contain outdated claims | Frame problem (McCarthy & Hayes, 1969) |
| Error propagation | Wrong value spreads through dependent entries | Dependency-directed backtracking (Stallman & Sussman, 1977) |
| Circular verification | Tests verify claims against the claims themselves | Odd-loop detection (Doyle, 1979) |
| Cross-agent divergence | Agents hold contradictory beliefs, neither notices | The merge problem (AGM, 1985) |
| Hallucinated evidence | Agent cites sources that don’t exist | No classical analogue |
Four of the five map cleanly onto frameworks that are 40 to 50 years old. The fifth, hallucinated evidence, is genuinely novel to LLMs. Classical systems assumed that justifications pointed to real things. LLMs can manufacture evidence.
## The Frameworks, Briefly
**Truth Maintenance Systems (Doyle, 1979).** Track why you believe things. Every belief has a justification: a record of what evidence supports it. When evidence changes, propagate the change through every belief that depends on it. Detect odd loops where A justifies B and B justifies A.

The LLM version: `beliefs add --id my-claim --depends-on other-claim --source entries/2026/02/20/evidence.md`. When the source changes, `beliefs check-stale` flags the dependent claim.
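The propagation idea fits in a few lines of Python. This is a hypothetical sketch of the principle, not the actual `beliefs` internals; the `Belief` fields mirror the CLI flags, and the function names are illustrative:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of TMS-style dependency propagation; field names
# mirror the CLI flags (--id, --depends-on, --source), not the real tool.
@dataclass
class Belief:
    id: str
    depends_on: list = field(default_factory=list)  # ids this belief rests on
    source: str = ""                                # path to evidence file

def dependents_of(changed_id, beliefs):
    """Return every belief that rests, directly or transitively,
    on the changed belief and therefore needs re-checking."""
    stale, frontier = set(), {changed_id}
    while frontier:
        frontier = {b.id for b in beliefs
                    if set(b.depends_on) & frontier and b.id not in stale}
        stale |= frontier
    return stale

beliefs = [
    Belief("evidence", source="entries/2026/02/20/evidence.md"),
    Belief("my-claim", depends_on=["evidence"]),
    Belief("downstream", depends_on=["my-claim"]),
]
print(dependents_of("evidence", beliefs))  # flags my-claim and downstream
```

The transitive walk is the whole point: editing one evidence file invalidates not just the claim that cites it, but everything built on that claim.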
**Assumption-Based TMS (de Kleer, 1986).** Label every belief with its assumptions. Maintain a database of contradictions (nogoods): sets of assumptions that can’t all be true simultaneously. When a new contradiction is found, record it permanently so it’s never rediscovered.

The LLM version: `beliefs add --assumes assumption-a,assumption-b`. The `nogoods.md` file records contradictions. They persist across sessions and compaction cycles.
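A nogood database is simple to sketch. This is a hypothetical Python illustration of the principle (the real tool persists these to `nogoods.md`; the assumption names are placeholders):

```python
# Hypothetical sketch of an ATMS-style nogood database: each nogood is a
# set of assumptions that were found to be jointly inconsistent.
nogoods = [{"assumption-a", "assumption-b"}]

def record_nogood(assumptions):
    """Persist a discovered contradiction so it is never rediscovered."""
    s = set(assumptions)
    if s not in nogoods:
        nogoods.append(s)

def is_consistent(assumptions):
    """A set of assumptions is bad if it contains any known nogood."""
    s = set(assumptions)
    return not any(ng <= s for ng in nogoods)

print(is_consistent({"assumption-a", "assumption-c"}))  # True
print(is_consistent({"assumption-a", "assumption-b"}))  # False: known nogood
```

The subset test (`ng <= s`) is what makes nogoods cumulative: any future context that includes a recorded contradiction is rejected without re-deriving it.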
**AGM Belief Revision (Alchourrón, Gärdenfors, & Makinson, 1985).** When new information contradicts existing beliefs, you can’t keep both. The theory defines epistemic entrenchment: a priority ordering over beliefs that determines which ones survive conflict. More entrenched beliefs are harder to retract.

The LLM version: `beliefs resolve claim-a claim-b`. It scores each claim by source type (simulation > derivation > speculation), recency, and derivation type (axiom > derived > predicted). The higher-scoring belief survives.
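Entrenchment reduces to a comparable score. A hypothetical sketch follows; the weights, field names, and the priority order among the three criteria are illustrative assumptions, not the actual `beliefs` scoring:

```python
# Hypothetical sketch of AGM-style entrenchment scoring. Rank tables and
# tuple ordering (source type, then recency, then derivation type) are
# assumptions for illustration, not the real tool's internals.
SOURCE_RANK = {"simulation": 3, "derivation": 2, "speculation": 1}
DERIVATION_RANK = {"axiom": 3, "derived": 2, "predicted": 1}

def entrenchment(belief):
    # Tuples compare lexicographically; ISO dates compare correctly as strings.
    return (SOURCE_RANK[belief["source_type"]],
            belief["updated"],
            DERIVATION_RANK[belief["derivation"]])

def resolve(a, b):
    """The more entrenched belief survives the conflict."""
    return a if entrenchment(a) >= entrenchment(b) else b

claim_a = {"id": "claim-a", "source_type": "simulation",
           "derivation": "derived", "updated": "2026-01-10"}
claim_b = {"id": "claim-b", "source_type": "speculation",
           "derivation": "predicted", "updated": "2026-02-20"}
print(resolve(claim_a, claim_b)["id"])  # claim-a: simulation outranks speculation
```

Note that recency only breaks ties within a source type here: a newer speculation does not displace an older simulation result.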
**The Frame Problem (McCarthy & Hayes, 1969).** How does a reasoning system know what stays the same when something changes? The practical answer is the sleeping-dog strategy: assume everything persists unless explicitly changed. The failure mode: sleeping through changes you should have noticed.

The LLM version: `beliefs check-stale`. Hash source files and compare. Walk newer entries for keyword contradictions. Wake the sleeping dogs.
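The hash comparison is easy to illustrate. A hypothetical Python sketch, showing only the hashing half (the keyword-contradiction walk is omitted); the record layout is an assumption:

```python
import hashlib
import os
import pathlib
import tempfile

# Hypothetical sketch of check-stale: hash each belief's source file and
# flag any belief whose recorded hash no longer matches.
def sha256(path):
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

def check_stale(beliefs):
    """beliefs: list of {'id', 'source', 'source_hash'} records."""
    return [b["id"] for b in beliefs
            if sha256(b["source"]) != b["source_hash"]]

# Demo with a temp file standing in for an evidence entry.
fd, path = tempfile.mkstemp(suffix=".md")
os.write(fd, b"original evidence")
os.close(fd)
belief = {"id": "my-claim", "source": path, "source_hash": sha256(path)}
print(check_stale([belief]))  # []: source unchanged, dog still sleeping
pathlib.Path(path).write_text("revised evidence")
print(check_stale([belief]))  # ['my-claim']: the dog wakes up
```

The hash is recorded when the belief is created; any later edit to the source file, however small, trips the comparison.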
## Why the Classical Solutions Don’t Port Directly
The classical frameworks assume formal logic. Beliefs are propositions. Justifications are logical derivations. Contradictions are formal inconsistencies. LLMs work in natural language. Their beliefs are sentences. Their justifications are conversation history. Their contradictions are semantic, not logical.
You can’t run Doyle’s TMS on natural language. You can’t compute AGM entrenchment over paragraphs of text. The formal machinery doesn’t apply.
But the principles apply perfectly:
- Track what you believe and why (TMS)
- Label beliefs with their assumptions (ATMS)
- Record contradictions permanently (nogoods)
- Define a priority ordering for conflict resolution (AGM entrenchment)
- Detect when the world changed and your beliefs didn’t (frame problem)
The tools I built, `entry` and `beliefs`, are practical approximations of these classical frameworks. They use natural language instead of formal logic. They use keyword heuristics instead of theorem provers. They use SHA-256 hashes instead of logical dependency networks. They’re crude compared to the classical systems. But they work.
## Independent Validation
While building these tools, I discovered that other researchers had independently identified overlapping failure categories. The MAST taxonomy (Cemri et al., 2025), analyzing 1,600+ traces across MetaGPT, ChatDev, AutoGen, CrewAI, and AI Scientist, found the same patterns: stale beliefs, cascading errors, undetected contradictions.
57% of AI Scientist manuscripts contained hallucinated numerical results. That’s failure mode five — hallucinated evidence — at scale.
The problems are real. The classical solutions are known. The gap is implementation.
## The One Genuinely New Problem
Hallucinated evidence has no classical analogue. Doyle’s TMS assumes that when a belief cites a justification, the justification actually exists. De Kleer’s ATMS assumes that assumptions are genuine propositions. Neither system considers the possibility that the reasoning agent will fabricate evidence.
In one of my research repos, an agent cited formulas from sibling repositories that didn’t exist. Not outdated references — completely manufactured citations to nonexistent derivations. The boundary between “reasoning about evidence” and “generating evidence” is blurred in LLMs in a way that classical systems never had to address.
The practical mitigation: `beliefs check-refs` verifies that source files exist and contain the claimed keywords. It catches fabricated citations, not because it understands the content, but because it checks whether the cited file is real. Crude but effective.
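The check is mechanical. Here is a hypothetical sketch of the idea; the record fields and example paths are illustrative, not the real tool's format:

```python
import pathlib

# Hypothetical sketch of check-refs: a cited source must (a) exist and
# (b) contain the keywords the belief claims to draw from it.
def check_refs(belief):
    path = pathlib.Path(belief["source"])
    if not path.is_file():
        return f"{belief['id']}: cited file {path} does not exist"
    text = path.read_text().lower()
    missing = [k for k in belief["keywords"] if k.lower() not in text]
    if missing:
        return f"{belief['id']}: keywords not found in source: {missing}"
    return None  # reference checks out

fabricated = {"id": "bad-claim", "source": "entries/no-such-derivation.md",
              "keywords": ["dispersion relation"]}
print(check_refs(fabricated))  # flags the nonexistent citation
```

The existence check alone would have caught the manufactured cross-repo citations above; the keyword check additionally catches real files cited for claims they never made.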
## The Takeaway
If you’re building multi-agent LLM systems and hitting problems with stale beliefs, contradictory agents, cascading errors, or lost justifications — the theory exists. It was published in the 1970s and 1980s. The tools exist too. They’re not perfect implementations of the classical frameworks, but they’re practical approximations that work with natural language and filesystem-based workflows.
Read the originals if you’re interested:
- Doyle, J. (1979). “A Truth Maintenance System”
- de Kleer, J. (1986). “An Assumption-based TMS”
- Alchourrón, C., Gärdenfors, P., & Makinson, D. (1985). “On the Logic of Theory Change”
- McCarthy, J., & Hayes, P. (1969). “Some Philosophical Problems from the Standpoint of Artificial Intelligence”
Or just install the tools and let your agents figure it out:
```shell
uv tool install git+https://github.com/benthomasson/entry
uv tool install git+https://github.com/benthomasson/beliefs
entry install-skill
beliefs install-skill
```
This is post 7 in a series on belief management for AI agents. Previously: The Sawtooth. Next: why context engineering is the wrong abstraction — and what to do instead.