How LLMs actually work, explained without the math.

Everyone uses these things now. Most of us couldn't explain how they work if pressed. Here's the version that fits in 8 minutes and doesn't require a single equation from school.

Published May 13, 2026

6 min read

You use ChatGPT or Claude every day, but if I asked you to explain how the thing actually works, you'd probably say "magic" and move on. That's fine. You don't need to know how a car engine works to drive one.

But I'd argue knowing the rough outline makes you better at using the tool. You stop expecting it to do things it can't. You stop being surprised when it gets things wrong in predictable ways. So here's the version that fits in 8 minutes. No math.

What an LLM is, in one sentence

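An LLM (large language model) is a giant pattern-matching machine that, given some text, predicts what word comes next. (Strictly speaking it predicts "tokens", which are words or pieces of words, but "word" is close enough here.) That's it. That's the whole core idea.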

Everything else — chat, code, analysis, summarization — is built on top of "predict the next word, very well, billions of times in a row".
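If you'd like to see "predict the next word, over and over" in action, here is a toy sketch in Python. It is nowhere near a real LLM: it only counts which word followed which in a tiny made-up text, then keeps picking the most common follower. The corpus and the function names (train, predict_next, generate) are mine, invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up training text (an assumption, for illustration only).
corpus = "the cat sat on the mat . the cat ate the fish . the cat sat on the sofa ."

def train(text):
    """Count, for every word, which words followed it and how often."""
    followers = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if never seen."""
    if word not in followers:
        return None
    return followers[word].most_common(1)[0][0]

def generate(followers, start, max_words=5):
    """Predict the next word, append it, and repeat."""
    output = [start]
    for _ in range(max_words):
        nxt = predict_next(followers, output[-1])
        if nxt is None:
            break
        output.append(nxt)
    return " ".join(output)

model = train(corpus)
print(predict_next(model, "cat"))  # sat
print(generate(model, "the"))      # the cat sat on the cat
```

Notice the output is fluent-looking but slightly silly. A real model looks at far more than the previous word, which is a big part of why its text holds together so much better.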

How it learns

Imagine you took every book, every news article, every Stack Overflow answer, every Wikipedia page (call it roughly the readable internet) and showed it to a computer. At every point in that pile of text, you covered up the next word and asked the computer: "what word goes here?"

At first the computer guesses randomly. But it does this trillions of times, with a learning rule that says "if you guessed wrong, adjust slightly toward the right answer". Eventually, after enough rounds, it gets very good. Not just at predicting the next word in a sentence — it learns the patterns of language itself. Grammar. Facts. Reasoning shapes. Style. The whole texture of how humans write.
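The "guess, then adjust slightly toward the right answer" loop can be sketched in a few lines of Python. This is a cartoon under my own assumptions: the sentence, the score table, and the fixed step size are all made up, and real training adjusts billions of parameters with a more careful rule. The shape of the loop is the point.

```python
from collections import defaultdict

# One made-up training sentence, replayed every round (illustrative only).
words = "the cat sat on a mat".split()
vocab = sorted(set(words))

# scores[previous_word][candidate] starts at zero: the model knows nothing yet.
scores = defaultdict(lambda: {w: 0.0 for w in vocab})

def guess(prev):
    """Pick the candidate with the highest score after `prev`."""
    return max(vocab, key=lambda w: scores[prev][w])

def training_round(step=0.1):
    """Cover up each next word, guess it, and nudge the scores when wrong."""
    mistakes = 0
    for prev, actual in zip(words, words[1:]):
        predicted = guess(prev)
        if predicted != actual:
            mistakes += 1
            scores[prev][actual] += step     # nudge toward the right answer
            scores[prev][predicted] -= step  # nudge away from the wrong guess
    return mistakes

history = [training_round() for _ in range(5)]
print(history)  # [4, 0, 0, 0, 0]
```

The list shows mistakes per round. With one short sentence the toy gets it right almost immediately; with the whole readable internet, the same idea takes trillions of guesses.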

This is called "pre-training". The "large" in the name refers to two things: the amount of text the model was trained on (terabytes) and the number of internal parameters it uses to remember patterns (often hundreds of billions).

Why it can answer questions, write code, etc.

Because answering a question and predicting the next word turn out to be the same task, if you frame it right.

When you ask "What's the capital of France?", the model isn't looking up an answer in a database. It's predicting what word comes after "What's the capital of France?\n\nAnswer:" — and the right next word, based on the patterns it learned, is "Paris".
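Here is a rough Python sketch of that framing. Fair warning: the toy below cheats by matching the prompt text exactly against a handful of made-up snippets, which is precisely the database-style lookup a real model does not do. A real model generalizes from patterns. The only thing this is meant to show is how a question becomes "text with an obvious next word". The snippets and the function names are my own assumptions.

```python
from collections import Counter

# Made-up snippets standing in for text seen during training.
training_text = [
    "What's the capital of France?\n\nAnswer: Paris",
    "What's the capital of France?\n\nAnswer: Paris",
    "What's the capital of France?\n\nAnswer: Lyon",  # a deliberately wrong example
    "What's the capital of Italy?\n\nAnswer: Rome",
]

def frame_as_prompt(question):
    """Turn a question into text whose natural next word is the answer."""
    return f"{question}\n\nAnswer:"

def complete(prompt):
    """Return the word that most often followed `prompt` in the training text."""
    continuations = Counter()
    for doc in training_text:
        if doc.startswith(prompt):
            rest = doc[len(prompt):].split()
            if rest:
                continuations[rest[0]] += 1
    if not continuations:
        return None
    return continuations.most_common(1)[0][0]

prompt = frame_as_prompt("What's the capital of France?")
print(complete(prompt))  # Paris
```

"Paris" wins because it is the most common continuation, not because anything checked it against a map.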

For code: the model has read millions of programs. When you say "write a Python function to sort a list", it predicts what code typically follows that kind of request. The output looks like real Python because the patterns of real Python are baked into its weights.

The plain-English mental model

An LLM doesn't know things the way you know your name. It predicts which words probably come next, and that prediction is good enough to look like knowledge most of the time. When it's wrong, it's not "lying" — it's just predicting words that fit the pattern, regardless of whether those words are true.

What this explains about its weirdness

Once you have the "predict next word" framing, a lot of LLM behavior makes sense.

It hallucinates. It makes up facts confidently. Because it doesn't have a fact lookup — it has a "what word probably comes next" lookup. If the next word it's about to predict is plausible-sounding but wrong, it doesn't know to stop. It just continues.

It's better at some tasks than others. Tasks the internet had lots of examples of (writing Python, summarizing English text, drafting emails) work great. Tasks the internet had few examples of (writing a poem in a specific Yiddish dialect, doing 8-digit multiplication) work poorly.

It's good at sounding sure. Because it learned from text that sounds sure. The confidence in the writing it produces is borrowed from the texts it read. It does not reflect actual confidence in being right.

It has a "knowledge cutoff". The model only knows what was in its training data. If that data ends in June, the model doesn't know about anything that happened in August. Some products plug in web search to work around this. The model itself still doesn't know.
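The hallucination point is easy to reproduce in miniature. The Python sketch below is a toy under my own assumptions (made-up sentences, a model that only looks at the previous word), but it shows the mechanism: asked about a place it has never seen, it still produces a fluent answer, because "what usually comes next" is all it has.

```python
from collections import Counter, defaultdict

# Made-up training sentences (illustrative only).
sentences = [
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of italy is rome",
    "the capital of spain is madrid",
]

# Count which word follows which, looking only at the previous word.
followers = defaultdict(Counter)
for sentence in sentences:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        followers[prev][nxt] += 1

def continue_text(prompt):
    """Append the most common follower of the prompt's last word."""
    last_word = prompt.split()[-1]
    if last_word not in followers:
        return prompt  # nothing learned about this word, so add nothing
    best = followers[last_word].most_common(1)[0][0]
    return f"{prompt} {best}"

print(continue_text("the capital of france is"))    # the capital of france is paris
print(continue_text("the capital of atlantis is"))  # the capital of atlantis is paris
```

The second line is wrong in exactly the way the first is right. Nothing in the toy can tell the two apart.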

The "tuning" step

After pre-training, the raw model is a brilliant but unguided text predictor. It would happily continue any text in any direction — including unhelpful, rude, or unsafe directions.

So labs do a second stage called fine-tuning. They show the model thousands of examples of "here's a question, here's a good answer" or "here's a request, here's a refusal that explains why". They use human raters to score outputs. They use those scores to nudge the model toward helpful, polite, safe behavior.

The rate-and-nudge part is called RLHF (reinforcement learning from human feedback); the example-based part is usually called supervised fine-tuning. Together they turn a raw text predictor into a chatbot you'd actually want to talk to.
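As a very loose sketch of the rating-and-nudging idea, here is a Python cartoon. Everything in it is hypothetical: the candidate replies, the ratings, and the single weight per style. As I understand it, real systems train a separate scoring model on the human ratings and then update the whole network, so treat this as the general shape and nothing more.

```python
# Hypothetical replies the raw model might produce for one request.
candidates = {
    "polite": "Sure, here's how to reset your password.",
    "rude": "Figure it out yourself.",
    "off_topic": "Here's a poem about clouds.",
}

# The raw model has no preference: every style starts with the same weight.
weights = {style: 1.0 for style in candidates}

# Made-up human ratings, from -1 (bad) to +1 (good).
human_ratings = {"polite": 1.0, "rude": -1.0, "off_topic": -0.5}

def nudge(weights, ratings, step=0.2):
    """Raise the weight of well-rated styles and lower badly rated ones."""
    for style, rating in ratings.items():
        weights[style] = max(0.0, weights[style] + step * rating)

def preferred_style(weights):
    """Return the style the model currently favors most."""
    return max(weights, key=weights.get)

for _ in range(3):  # three rounds of feedback
    nudge(weights, human_ratings)

print(preferred_style(weights))               # polite
print(candidates[preferred_style(weights)])
```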

What this means when you use it

Three practical things to take from all of this.

One. Be specific. The model is predicting what comes next based on what you typed. The more specific you are, the more useful its prediction.

Two. Don't trust facts you can't verify. The model is excellent at producing plausible-sounding text. "Plausible" is not the same as "correct".

Three. Notice what it's good at. It's good at things where there's lots of similar text on the internet — writing emails, summarizing meetings, debugging common errors, restating ideas in simpler language. It's worse at things that require fresh facts or careful arithmetic.
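Points two and three meet nicely at arithmetic, where checking is cheap. A minimal Python sketch, assuming you've copied a product out of a chat window: the two numbers and the "claimed" answer below are made up by me, and check_product is just a name I picked.

```python
def check_product(a, b, claimed):
    """Return True if `claimed` really equals a * b (exact integer math)."""
    return a * b == claimed

# Hypothetical numbers and a plausible-looking but wrong answer.
a, b = 12345678, 87654321
claimed_by_model = 1082152022374000

print(check_product(a, b, claimed_by_model))  # False
print(a * b)                                  # the exact product
```

Python integers don't overflow or round, so the comparison is exact. The claimed answer here has the right number of digits and the right opening digits, which is what "plausible but not correct" looks like in practice.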

What you don't need to learn

You don't need to learn the math behind transformers. You don't need to know how attention heads work. You don't need to memorize what "softmax" is.

What you do need is a working mental model of "fancy pattern-matching, trained on the internet, predicts text that looks like more of that internet". That model is correct enough to be useful, and it explains almost everything you see when you actually sit down with one of these tools.

The math is fine. The math is even beautiful. But the math is not what makes you a better user of the tool. That comes from understanding what the tool is doing, in plain language, and using it accordingly.