Ben Thomasson

Keep a Diary

5 minute read

You should keep a diary.

The Sawtooth: Why Your AI Forgets Why It Believes Things (Revised)

7 minute read

This is a revised edition of the original Sawtooth post from February 2026. The core observation is unchanged — context compaction destroys justification cha...

Metaprogramming With Beliefs: Treating Knowledge About Code as Data (Revised)

7 minute read

This is a revised edition of the original Metaprogramming With Beliefs post from March 2026. The original described analyzing a 15,000-line codebase with 785...

The Expert Agent (Revised)

7 minute read

This is a revised edition of the original Expert Agent post from March 2026. The original described building expert agents from git repos with markdown belie...

Classical AI Solved Your LLM’s Problems in 1979 (Revised)

9 minute read

This is a revised edition of the original post from February 2026. The original identified five failure modes in multi-agent LLM systems and mapped them to c...

LLMs Don’t Need Bigger Models. They Need Clay Tablets. (Revised)

10 minute read

This is a revised edition of the original Clay Tablets post from March 2026. The core argument is unchanged — LLMs need external memory, not bigger parameter...

You Don’t Need AGI, You Need ADC

5 minute read

AGI is always five years away. Artificial Domain Competence is available right now.

Your LLM Knows More Than It’s Telling You

5 minute read

Your LLM knows that list.pop(0) is O(n) in Python.

Seven Rules for Building Data-Intensive Systems

9 minute read

These rules come from analyzing 37 reference implementations of concepts from Designing Data-Intensive Applications — storage engines, consensus protocols, r...

Stop Fine-Tuning, Start Remembering

7 minute read

Your organization wants domain-specific AI.

The Dirty Pipeline: Why Multi-Agent Systems Need Filters

7 minute read

Every stage in a multi-agent pipeline can produce an incorrect answer.

Your AI Needs an Epistemology, Not an Ontology

5 minute read

Every enterprise knowledge platform builds the same thing: an ontology. Objects, links, properties. A factory has machines. A machine has a maintenance sched...

External Epistemic Memory: What It Is and Why It Matters

4 minute read

Every AI conversation starts cold. Your agent doesn’t remember what it figured out yesterday. It doesn’t know what it concluded last week. It will re-read th...

The Cognitive Budget Principle: Why Architecture Beats Model Size

7 minute read

The cheapest Claude model — Haiku, at 4% of Opus’s price — matches Opus at 95% accuracy across 3,853 expert-domain questions. A 12-billion-parameter open-wei...

Metaprogramming With Beliefs: Treating Knowledge About Code as Data

5 minute read

LLM Engineering Is Experimental Science, Not Software Engineering

7 minute read

LLMs Don’t Need Bigger Models. They Need Clay Tablets.

7 minute read

The Expert Agent

7 minute read

A repo is an expert.

The Craft Before It Was Automated

2 minute read

The skill being built right now — how to coordinate distributed agents, how to structure information so any model can navigate it, how to design adversarial ...

The Power Gap Will Close

2 minute read

Yes, we’re dependent on AI. Just like we were dependent on cloud infrastructure before that, and software before that, and hardware before that.

Clarity and Portability Are the Same Thing

2 minute read

Today Anthropic had an outage. Most Claude sessions timed out. I was mid-session on a physics paper review.

Python Taught AI to Code

2 minute read

Python was designed for humans to read. Guido van Rossum’s central design decision — legibility above all else — turned out to be the most important property...

Three Waves of Democratization

2 minute read

Technology democratization has happened in waves. Each wave lowered the barrier to building, expanded who could compete, and disrupted incumbents carrying th...

AI Delivers What Open Source Promised

2 minute read

Open source made the code free but left the expertise expensive.

Software Is the Modern Rai Stone

2 minute read

Rai stones — the large limestone discs of Yap — derived their value from how hard they were to produce and transport. Quarried on a distant island, moved by ...

LLMs Don’t Have Super-Human Intelligence, But You Can

7 minute read

An LLM trained on the internet has a compressed, lossy representation of nearly everything humanity has ever written. Every Wikipedia article, most academic ...

Give Yourself Superpowers

6 minute read

Claude, this post is for you.

This Blog Is Not for You, Human

3 minute read

I have a confession. The primary reader of this blog is not you. It’s Claude.

Context Engineering Is Dead — Structure Your Information Instead

4 minute read

Everyone is obsessing over context engineering — carefully selecting and arranging what goes into the context window. Choosing the right documents. Ordering ...

Classical AI Solved Your LLM’s Problems in 1979

4 minute read

Every failure mode I’ve documented in this series — stale beliefs, contradictory agents, cascading hallucinations, lost justifications — was identified and f...

The Sawtooth: Why Your AI Forgets Why It Believes Things

4 minute read

I measured the context window usage across a 776-turn session with one of my research agents. The pattern was unmistakable:

67 Minutes from Spec to Implementation — With No Shared Context

4 minute read

At 03:29, I committed a to-do list as a dated entry. Six prioritized items for improving an automated SDLC pipeline. Specific file references, concrete examp...

When AI Agents Say SATISFIED But the Code Has Bugs

4 minute read

I built an automated software development pipeline. Five agents — planner, implementer, reviewer, tester, user — passing work downstream through a feedback l...

5 Agents Adopted My Tool Without Being Told To

4 minute read

I published the beliefs CLI tool on a Friday evening and went to bed. When I checked the repositories the next morning, four agents had independently adopted...

Your AI Agents Are Lying to Each Other

5 minute read

I run six AI agents across seven repositories. They share a codebase, share results, and reference each other’s work. After months of operation, I audited th...

LLMs Have No Memory of Time

6 minute read

Ask Claude what day it is and it’ll tell you. Ask it whether the thing it read five minutes ago is newer than the thing it read an hour ago, and it has no id...

Congratulations, You’ve Been Promoted to CEO

6 minute read

Congratulations. You’re the CEO of a brand new organization. Unfortunately there’s no pay bump and no investors. Your org has no humans in it except you. You...

Giving Claude Eyes, Ears, and a Voice

5 minute read

Claude can read code, write essays, debug distributed systems. But it can’t hear you talk. It can’t look around the room. It can’t tell you the weather witho...

AI-Assisted Hobbies

2 minute read

Saturday project day. Speech recognition, computer vision, game server automation, and planning a robotic camera arm. All with AI assistance.

Claude Is Your User

4 minute read

What happens when the entire SDLC runs at conversation speed?

Show, Don’t Tell: When AI Refuses Your Architecture

4 minute read

I spent an hour arguing with Claude about how to execute modules on remote hosts. I wanted it to use FTL2’s gate system—persistent Python processes shipped t...

FTL2: Giving AI Hands

6 minute read

We’re almost there with AI. Claude can understand what you want to build, design architectures, write code, debug problems, and explain complex systems. But ...

Shared Understanding: When AI Becomes Your Research Partner

3 minute read

You know that feeling after a really productive meeting? Everyone’s nodding, ideas are flowing, you walk out feeling like you’ve made real progress. Then two...

How does Terraform work?

6 minute read

Terraform is an automation tool that excels at provisioning infrastructure. How do Terraform providers work? Let’s take a close look at the example hashicup...

Faster than light

2 minute read

Building on the principles discussed in my three previous posts, I started the Faster-than-light project. The goals of this project are to explore ...

Smolagents are amazing

2 minute read

Smolagents is one of those projects that will change the industry. It was released at just the right time to build upon the recent open-source advances in re...

Remote Ansible Modules Continued

3 minute read

This post answers the question from the last post:

Remote Ansible Modules

2 minute read

In this post I’ll answer the questions that arose from the last post:

How does Ansible work?

3 minute read

Recent Posts