Hallucination vs. Confabulation: Two Words for the Same Problem, and Why One of Them Is Lying to You

By Leigh Garrity, May 9, 2026

The Two Framings

Hallucination

What it is: A model output that is factually incorrect, presented with the same confidence as a correct output.

What it does: It gives buyers a mental model in which the model normally tells the truth and occasionally malfunctions. The hallucination is the exception. The correct answer is the baseline. The model "went wrong" on this particular prompt.

Where it comes from: The term entered mainstream AI coverage around 2020–2021, borrowed loosely from psychiatry, and was popularized by journalists and product teams who needed a word for "the model said something false and sounded sure about it." It stuck because it's vivid and because it implies something dramatic enough to explain why the output was wrong. OpenAI, Google, and Anthropic all used it in early documentation. It's now in the AP Stylebook's AI guidance. That's how you know a technical term has left the building.

What makes it distinct: The framing implies a deviation. The model has a truth-telling mode, and hallucination is what happens when that mode fails. This is the part that matters — and the part that's wrong — because it shapes every governance conversation that follows. If hallucinations are exceptions, buyers ask: how do we catch the exceptions? If they're not exceptions — if they're the normal output of a system that has no truth-telling mode — buyers need to ask something different entirely.

Confabulation

What it is: A model output that fills a gap in grounded knowledge with a plausible-sounding completion — not because the model is malfunctioning, but because producing plausible completions is what it was built to do.

What it does: It gives buyers a mental model in which the model is always doing the same thing: generating outputs that fit the pattern of what a correct answer would look like. Sometimes those outputs happen to correspond to facts. Sometimes they don't. The model cannot tell the difference, because it has no mechanism for checking. There is no internal "I don't know" signal unless one has been explicitly trained in.

Where it comes from: The term is borrowed from neuropsychology, where it describes a specific behavior in patients with certain memory disorders: they fill gaps in their recall with plausible-sounding fabrications, without awareness that they're doing so, and without intent to deceive. The fabrication feels like memory to them. Researchers including Gary Marcus and Yejin Choi began applying the term to language models around 2022–2023, arguing that it was mechanically more accurate than "hallucination." The AI safety and alignment communities have largely adopted it. You'll hear it from CAIOs who've read the primary literature, and from technical buyers who are specifically testing whether you have.

What makes it distinct: The framing implies a constant. The model is always confabulating — always producing outputs shaped by pattern rather than by verified knowledge. On most prompts, the confabulation happens to match reality. On some prompts, it doesn't. "Did the model confabulate?" is never the right question. "Did the confabulation land?" always is.

Okta Concept Mapping — SAML Assertion Trust

When a SAML IdP issues an assertion, the service provider trusts it. The SP validates the signature, issuer, and validity window, but it doesn't independently verify the underlying claim about the user; it trusts the chain that produced the claim. The governance question isn't "is this assertion hallucinated?" It's "is the trust chain sound?" Model outputs work similarly: the output is always an assertion. The question is whether the grounding behind it (retrieval, fine-tuning, system prompt constraints) is sound enough to trust. Where the analogy breaks: a SAML assertion has a defined issuer, a signature, and an expiration. A model output has none of those. In a buyer conversation, this is useful for explaining why RAG and grounding matter: they're the closest thing to a trust chain the model has.

Comparison Strategy

These are two framings of the same behavior, which means a standard trait-by-trait comparison would produce two nearly identical columns and miss the point entirely.

The structure here is framing analysis: for each term, what does it imply about the model's normal behavior, what mental model does it hand the buyer, and what governance response does it enable or foreclose? The comparison is between implications. The underlying behavior is identical.

What each framing implies about the model's normal behavior:

"Hallucination" implies the model has a default mode of accuracy that occasionally breaks down. This is intuitive, it maps to how humans think about mistakes, and it is incorrect. Models don't have an accuracy mode. Plausibility is the only mode, and it runs continuously.

"Confabulation" implies the model has one mode: produce the most plausible completion. Accuracy is a property of the output, not a property of the process. This is harder to hold in your head, but it's what's actually happening.

What mental model each framing hands the buyer:

The hallucination framing hands the buyer a quality-control problem. The model is mostly right; you need to catch the exceptions. This leads to questions like "what's the error rate?" and "can we get it below X%?" These are not bad questions, but they're incomplete: they assume the errors are randomly distributed and detectable from the output alone, and they are neither.

The confabulation framing hands the buyer a grounding problem. The model is always generating; the question is whether the generation is anchored to verified information. This leads to questions like "what sources is the model drawing from?" and "how do we constrain the output space?" Those questions map directly to architectural decisions — retrieval-augmented generation, system prompt design, output verification layers. They're more actionable.
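To make the grounding questions concrete, here is a minimal Python sketch of the retrieval step in a RAG-style setup. Everything named in it (Passage, retrieve, build_grounded_prompt, the keyword-overlap matching, the prompt wording, the policy ids) is an illustrative assumption, not any vendor's API. A production system would use a real search index, and the model's answer would still need verification downstream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source_id: str  # hypothetical identifier for a vetted document
    text: str


def _words(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return {w.strip(".,?!;:").lower() for w in text.split()}


def retrieve(question, corpus, min_overlap=2):
    """Toy retriever: keep passages sharing at least min_overlap words with the question."""
    q = _words(question)
    return [p for p in corpus if len(q & _words(p.text)) >= min_overlap]


def build_grounded_prompt(question, corpus):
    """Return a prompt anchored to retrieved passages, or None if nothing was retrieved."""
    passages = retrieve(question, corpus)
    if not passages:
        return None  # nothing to anchor to: decline or route to a person
    context = "\n".join(f"[{p.source_id}] {p.text}" for p in passages)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\n\n"
        f"Question: {question}"
    )


# Made-up example corpus, for illustration only.
corpus = [
    Passage("policy-7", "Password resets require manager approval."),
    Passage("policy-9", "Contractor accounts expire after ninety days."),
]
print(build_grounded_prompt("Who approves password resets?", corpus))
print(build_grounded_prompt("What is the weather today?", corpus))
```

The design choice worth pointing out to a buyer is the None branch: when retrieval finds nothing, the sketch refuses to build a prompt at all instead of letting the model fill the gap.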

What governance response each framing enables:

The hallucination framing tends to produce two responses: dismissal ("AI is too risky") or overconfidence ("we've reduced hallucinations to under 2%," which is a claim that should make you nervous every time you hear it). Neither response is useful. Dismissal forecloses the conversation. Overconfidence sets up a trust collapse when the 2% shows up in a live demo.
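A quick piece of arithmetic shows why the "under 2%" claim rarely survives a long demo. The sketch below assumes each output misses independently at a fixed rate. That is a simplifying assumption made only for illustration; real misses tend to cluster by topic and prompt, so the figure for any given session could be higher or lower. The function name is hypothetical.

```python
def p_at_least_one_miss(error_rate, n_outputs):
    """Chance of seeing at least one miss in n_outputs, assuming independent misses."""
    return 1.0 - (1.0 - error_rate) ** n_outputs


# 0.02 stands in for the "under 2%" claim; the output counts are arbitrary.
for n in (1, 10, 50, 100):
    print(n, round(p_at_least_one_miss(0.02, n), 3))
```

Under that assumption, a 2% per-output rate works out to roughly an 18% chance of at least one miss across 10 outputs and roughly 87% across 100. The per-output number sounds small; the per-session number is what the room experiences.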

The confabulation framing produces constraint design. If the model always confabulates, governance becomes about constraining the confabulation to domains where it's likely to land correctly, and verifying outputs in domains where the cost of a miss is high. This is a solvable engineering and policy problem. Not solved — but the right shape of problem.
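Constraint design can be written down as a policy table. The following is a minimal sketch, assuming three stakes tiers and a yes/no flag for whether the output cites vetted sources. The tiers, the step names, and the mapping itself are hypothetical; a real agency would set them from its own risk policy.

```python
from enum import Enum


class Stakes(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def verification_step(stakes, grounded):
    """Pick a verification step from decision stakes and whether the output cites vetted sources."""
    # High-stakes outputs always go to a person, grounded or not.
    if stakes is Stakes.HIGH or (stakes is Stakes.MEDIUM and not grounded):
        return "human_review"
    # Medium-stakes grounded outputs and low-stakes ungrounded ones get an automated check.
    if stakes is Stakes.MEDIUM or not grounded:
        return "automated_check"
    # Low-stakes grounded outputs are sampled, not skipped.
    return "spot_check"


for stakes in Stakes:
    for grounded in (True, False):
        print(stakes.value, grounded, verification_step(stakes, grounded))
```

Note that no branch returns "no verification". The policy only decides where the verification sits and how heavy it is.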

Where the terminology debate actually sits:

The field is genuinely divided, and you should know that before you walk into a room with a CAIO who has opinions. The "hallucination" camp argues that the term is entrenched, buyers understand it, and replacing it with "confabulation" is academic pedantry that doesn't change the practical response. The "confabulation" camp argues that the term shapes the mental model, and the wrong mental model produces the wrong governance response. Both camps have a point. This piece doesn't resolve it. What it gives you is enough to follow the conversation wherever the buyer takes it.

Okta Concept Mapping — Least Privilege and Blast Radius

The governance response to confabulation looks structurally similar to the governance response to over-privileged accounts. You don't eliminate the risk — you constrain the blast radius. For accounts, that means scoping access to the minimum required for the task. For model outputs, that means scoping the model's input context to verified sources, constraining the output to defined formats, and adding verification layers for high-stakes decisions. Where the analogy breaks: you can audit an access event. You cannot audit what the model "knew" versus "generated" at inference time — the model doesn't log its reasoning in a way that's independently verifiable. In a buyer conversation, this is useful for explaining why human-in-the-loop requirements exist for certain output categories, and why they're not just compliance theater.
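The "constrain the output to defined formats" step can be sketched as a validator that rejects anything outside an agreed shape. This is an illustration under stated assumptions: the JSON shape, the ALLOWED_DECISIONS vocabulary, the validate_output function, and the source ids are all hypothetical. The check covers format and citation scope only. It says nothing about whether the cited source actually supports the decision, which is why the verification layer still exists.

```python
import json

ALLOWED_DECISIONS = {"approve", "deny", "escalate"}  # hypothetical output vocabulary


def validate_output(raw, allowed_sources):
    """Check the shape and citations of a model reply. Returns (ok, reason)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False, "not valid JSON"
    if not isinstance(data, dict) or set(data) != {"decision", "sources"}:
        return False, "unexpected or missing fields"
    decision, sources = data["decision"], data["sources"]
    if not isinstance(decision, str) or decision not in ALLOWED_DECISIONS:
        return False, "decision outside allowed set"
    if not isinstance(sources, list) or not sources:
        return False, "no sources cited"
    if not all(isinstance(s, str) and s in allowed_sources for s in sources):
        return False, "cites a source outside the scoped context"
    return True, "ok"


scoped = {"policy-7", "policy-9"}  # made-up ids for the vetted document set
print(validate_output('{"decision": "escalate", "sources": ["policy-7"]}', scoped))
print(validate_output("Sure, approved!", scoped))
print(validate_output('{"decision": "approve", "sources": ["blog-post"]}', scoped))
```

This is the blast-radius idea in miniature: free-text replies and citations to anything outside the scoped set never reach the downstream workflow.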

How to Say This in the Field

Every "Do say" below is usable verbatim. The scenarios cover buyers who catastrophize, buyers who dismiss, and technical buyers who are testing whether you've done the reading.

Don't say	Do say	Why it matters
"AI hallucinates sometimes, but we can minimize it."	"The model always generates based on pattern. The question is whether we've grounded it well enough for this use case."	The first version implies a fixable defect. The second frames it as an architectural question, which is where the real conversation lives.
"Hallucinations are a known limitation, we're working on it."	"There's no version of this technology that produces outputs you don't verify — the question is where you put the verification."	Buyers who hear "working on it" hear "not ready." Buyers who hear "where you put the verification" hear a governance conversation they can participate in.
"That's a great point about hallucinations — we take that seriously."	"You're right that ungrounded outputs are a real risk. What's the highest-stakes decision this system would be informing?"	The first sentence is a stall. Moving to the specific use case gets you to a risk that's real and a mitigation that's designable.
"Actually, the more accurate term is confabulation."	"Some researchers prefer 'confabulation' because it captures that the model is always generating, not occasionally malfunctioning — it's a useful framing if you want to think about where to put guardrails."	Correcting a buyer's terminology is a losing move. Offering the alternative framing as a tool is a different thing entirely.
"Our model has a very low hallucination rate."	"The accuracy rate on this use case, with these retrieval sources, in our testing, was X — here's what that testing covered and what it didn't."	Accuracy rates without scope are meaningless and will be tested. Scoped accuracy rates with disclosed methodology are credible.
"Confabulation just means the model makes things up."	"Confabulation means the model produces plausible completions — which is what it's always doing. The question is whether the completion is anchored to verified information."	"Makes things up" implies intent and randomness. The actual behavior is neither intentional nor random — it's predictable given the input, which is why it's governable.
"AI is just not reliable enough for government use cases yet."	"Reliability depends on the use case and the verification layer. What's the tolerance for error in this specific workflow?"	The first version closes the conversation. The second opens a scoping discussion where you can actually help.
"We've solved the hallucination problem with RAG."	"RAG reduces the risk of ungrounded outputs by anchoring the model to a defined document set — it doesn't eliminate the risk, it constrains it to that document set."	"Solved" is the word that comes back to haunt you. "Constrains" is accurate and still sells the capability.
"The CAIO is worried about hallucinations — that's a technical concern, not a procurement one."	"The CAIO's concern maps directly to the verification and audit requirements in the ATO package — let's pull that thread."	Translating a technical concern into a procurement artifact is the move. It's also true.
"Don't worry, the model will tell you when it doesn't know something."	"Some models are fine-tuned to express uncertainty — but that expression is itself a generated output, not a ground-truth signal. It's useful, not definitive."	This is the one that will get you in trouble with a technical buyer if you get it wrong. The model's "I'm not sure" is as generated as everything else it says.
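One of the rows above asks for an accuracy rate with its scope attached. Here is a minimal sketch of what that could look like as a reporting habit, assuming a simple pass/fail test set. The ScopedAccuracy class and its fields are hypothetical, the 90-of-100 figure is a placeholder and not a measured result, and the interval uses the standard Wilson score formula for a proportion.

```python
import math
from dataclasses import dataclass


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95% confidence)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


@dataclass
class ScopedAccuracy:
    use_case: str
    sources: str
    correct: int
    total: int
    not_covered: str  # what the testing did not exercise

    def statement(self):
        low, high = wilson_interval(self.correct, self.total)
        return (
            f"On {self.use_case}, using {self.sources}, "
            f"{self.correct} of {self.total} test answers were correct "
            f"({low:.0%} to {high:.0%} at roughly 95% confidence). "
            f"Not covered: {self.not_covered}."
        )


# Placeholder values for illustration only.
report = ScopedAccuracy(
    use_case="benefits eligibility lookups",
    sources="the current policy manual",
    correct=90,
    total=100,
    not_covered="appeals and multi-program cases",
)
print(report.statement())
```

The point of the interval is honesty about sample size: the same headline rate from a small test set carries a much wider range than it does from a large one, and a technical buyer will ask.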

Okta Concept Mapping — Continuous Authentication and Output Verification

Okta's continuous access evaluation model, built on the OpenID Continuous Access Evaluation Profile (CAEP), treats authentication not as a one-time gate but as an ongoing signal: trust is re-evaluated as context changes. Model output verification works similarly: you don't trust the output once at generation and move on, you build verification into the workflow at the points where the output informs a decision. Where the analogy breaks: CAEP has defined signals (session revocation, credential change, device compliance change) that trigger re-evaluation. Model output verification doesn't have equivalent signals, because you can't tell from the output alone whether it's grounded. In a buyer conversation, this is useful for explaining why output verification isn't optional overhead: it's the functional equivalent of the session check that CAEP performs.

The One Thing to Hold Onto

The terminology debate is real and it's not resolved. You don't need to resolve it in the room. What you need is the underlying concept, stated simply enough to use under pressure:

The model doesn't have a truth-telling mode that occasionally fails. Plausibility generation is the only mode, and it runs continuously. On most prompts, plausible and correct overlap. On some prompts, they don't. The model cannot tell which situation it's in, and neither can you, without verification.

"Hallucination" is one word for that. "Confabulation" is another. The buyer's word is fine. Your job is to make sure the conversation that follows is about grounding, verification, and constraint design — not about whether the technology is trustworthy in the abstract. It's not trustworthy in the abstract. Nothing is. The question is always what you've built around it.

The Context Window covers AI concepts for public sector identity and access management practitioners. Accuracy rates and benchmark figures cited in field scenarios are illustrative; production sourcing would verify against current model documentation.