What "Zero Data Retention" Actually Means — and Why the Answer Depends on Who You Ask

By Leigh Garrity, May 6, 2026

What It Is, Precisely

In the LLM context, Sensitive Information Disclosure (LLM02 in the current OWASP Top 10 for LLM Applications) occurs when a model surfaces information from its training data, its active context window, or its retrieval layer that the requesting user shouldn't have access to. The disclosure can be direct, where the model reproduces a document verbatim, or indirect, where the model's response reveals the existence or structure of information without quoting it. The risk isn't limited to model outputs. It extends to every system that processes, stores, or indexes the data that flows through the AI pipeline. The model is one component in a larger data-handling chain, and each component has its own persistence behavior.

Three Pathways Out

Chat logs are the most visible pathway and the one most vendors have addressed in their terms of service. By default, prompts and completions are typically written to a log somewhere, whether for abuse detection, debugging, or billing reconciliation. Retention windows vary: 30 days is common for commercial offerings, longer for enterprise tiers with audit requirements. The data in those logs is enterprise data. If the vendor's infrastructure is breached, or if a misconfigured API exposes log access, that data is out. Contractual no-training clauses address whether the vendor uses this data; they say nothing about whether it persists.

Embedding stores are less obvious and increasingly consequential. Retrieval-augmented generation (RAG) architectures work by chunking source documents into segments — typically a few hundred tokens each — converting them to vector representations, and storing those vectors in a database the model queries at inference time. Vector embeddings are not one-way: research has demonstrated that source text can be approximately recovered from embeddings under certain conditions, which means the embedding store is not a sanitized representation of your documents. More practically, the retrieval layer doesn't automatically inherit your access control model. A user who can query the AI assistant may be able to surface chunks from documents they couldn't access in the source system, because nobody mapped the document permissions to the retrieval index. That's a data governance failure, not a model failure — but the model is what makes it visible.
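The permission-mapping gap described above can be sketched in a few lines. This is a minimal illustration, not a reference design: the Chunk type, the allowed_groups field, and the build_context helper are all hypothetical names, and it assumes each chunk was tagged with the source document's access groups at index time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source_doc: str
    allowed_groups: frozenset  # assumed: copied from the source system's ACL at index time
    score: float               # similarity score returned by the vector search


def filter_by_acl(candidates, user_groups):
    """Drop any retrieved chunk the user could not open in the source system."""
    user_groups = frozenset(user_groups)
    return [c for c in candidates if c.allowed_groups & user_groups]


def build_context(candidates, user_groups, top_k=3):
    """Filter first, then rank, so restricted chunks never reach the prompt."""
    permitted = filter_by_acl(candidates, user_groups)
    permitted.sort(key=lambda c: c.score, reverse=True)
    return [c.text for c in permitted[:top_k]]


# Hypothetical search results: the HR chunk scores highest but is restricted.
candidates = [
    Chunk("Q3 travel policy summary", "policy.pdf", frozenset({"all-staff"}), 0.71),
    Chunk("Pending personnel action details", "hr-case.docx", frozenset({"hr"}), 0.93),
]

context = build_context(candidates, user_groups={"all-staff", "engineering"})
print(context)
```

The ordering is the design choice that matters here: the access check runs before ranking and truncation, so the best-matching chunk is excluded when the user lacks the group. Tags copied at index time can drift from the source system, so a real deployment would also need a way to keep them in sync.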

Fine-tuning datasets represent the most durable exposure. When an organization fine-tunes a model on proprietary data, that data influences the model's weights. Unlike a database record, it can't be deleted with a query. It can be extracted (partially, probabilistically) through carefully constructed prompts, a technique the research community has documented under the term "training data extraction." Exposure rises with how frequently a given piece of text appeared in the training set. Repeated documents, templates, and boilerplate are at higher risk than unique records. Before you fine-tune on anything, treat the training dataset as if it will eventually be accessible to anyone with API access to the model.
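Because repeated text carries the higher extraction risk, one low-cost pre-flight step is to count and collapse duplicates before fine-tuning. The sketch below is one assumed approach (exact matching after whitespace and case normalization); the function names are illustrative, and near-duplicates that differ by more than spacing or case would need fuzzier techniques.

```python
import hashlib
import re
from collections import Counter


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants hash the same."""
    return re.sub(r"\s+", " ", text.strip().lower())


def fingerprint(text):
    """Stable hash of the normalized text, used as the duplicate key."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def duplicate_report(records, threshold=2):
    """Map normalized text to its count for records seen at least `threshold` times."""
    counts = Counter()
    first_seen = {}
    for record in records:
        fp = fingerprint(record)
        counts[fp] += 1
        first_seen.setdefault(fp, normalize(record))
    return {first_seen[fp]: n for fp, n in counts.items() if n >= threshold}


def deduplicate(records):
    """Keep the first occurrence of each normalized record, preserving order."""
    seen = set()
    unique = []
    for record in records:
        fp = fingerprint(record)
        if fp not in seen:
            seen.add(fp)
            unique.append(record)
    return unique


# Hypothetical training records: a boilerplate footer repeated three times.
records = [
    "This message is confidential.",
    "this message is   confidential.",
    "Case 1142 was closed on review.",
    "This message is confidential.",
]

report = duplicate_report(records)
unique = deduplicate(records)
print(report)
print(len(unique))
```

Deduplication lowers the repetition that makes text easier to extract; it does not make unique records safe, so the warning above about treating the whole dataset as eventually accessible still applies.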

IDAM Concept Mapping

Closest analog: token revocation and session persistence. In OAuth/OIDC, a vendor can contractually commit to short-lived tokens while the architecture quietly maintains long-lived sessions. You can verify the commitment by inspecting token expiry claims, testing the revocation endpoint, and auditing actual behavior. The ZDR analogy holds here: both involve a promise about data persistence that may or may not be architecturally enforced. Where it breaks: with tokens, you have observable artifacts — the JWT, the revocation response, the session log. With LLM data retention, the persistence happens inside the vendor's infrastructure. You cannot inspect it the way you can inspect a token. The asymmetry of verifiability is worse, and that asymmetry is exactly what makes the contractual/technical ZDR distinction matter.
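To make the "observable artifacts" point concrete, here is a small sketch of inspecting a token's expiry claims using only the Python standard library. The sample token is built locally with a placeholder signature, and the one-hour MAX_PROMISED_SECONDS value is an assumed contractual limit for illustration. The code reads claims without verifying the signature, which is acceptable for auditing lifetime but not for authentication.

```python
import base64
import json


def b64url_decode(segment):
    """Decode a base64url segment, restoring the padding JWTs strip off."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(obj):
    """Encode a dict as an unpadded base64url JSON segment."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def token_lifetime_seconds(jwt):
    """Return exp minus iat from the payload. Does NOT verify the signature."""
    _header_b64, payload_b64, _signature = jwt.split(".")
    claims = json.loads(b64url_decode(payload_b64))
    return claims["exp"] - claims["iat"]


# Hypothetical token for illustration; the signature is a placeholder string.
sample = ".".join([
    b64url_encode({"alg": "HS256", "typ": "JWT"}),
    b64url_encode({"sub": "user-1", "iat": 1_700_000_000, "exp": 1_700_086_400}),
    "placeholder-signature",
])

MAX_PROMISED_SECONDS = 3600  # assumed contractual limit: one hour
lifetime = token_lifetime_seconds(sample)
exceeds_commitment = lifetime > MAX_PROMISED_SECONDS
print(lifetime, exceeds_commitment)
```

Nothing comparable exists for inference-time data retention: there is no claim to decode and no endpoint to probe, which is the asymmetry the analogy is meant to expose.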

The Practical Test

A federal civilian agency evaluating a commercial LLM for internal document summarization asks the vendor: "Do you offer zero data retention?" The vendor says yes. That answer is almost certainly true. It is also almost certainly incomplete.

A ZDR claim can mean two things, and they are not the same thing.

A contractual ZDR commitment means the vendor has agreed, in writing, not to use your data to train or improve their models. Your prompts and completions are excluded from their training pipeline. This is a legal commitment. It does not mean your data never touches persistent storage. It does not mean your data is inaccessible to the vendor's operations team. It does not mean a breach of their infrastructure wouldn't expose your data. It means they promised not to train on it.

A technically verifiable ZDR architecture means the system is designed so your data never persists beyond the inference call. Ephemeral compute, no logging, data destroyed after response generation. This is auditable in principle — you can ask for architecture documentation, third-party attestation, and SOC 2 controls that specifically address inference-time data handling. In practice, most commercial LLM offerings do not offer this by default. Some enterprise tiers do, at a price premium, with audit rights that are narrower than buyers assume.

The procurement question that cuts through: "Is your ZDR commitment contractual, architectural, or both — and what third-party attestation covers the architectural claim?" A vendor who can answer that precisely is offering something different from one who cannot.

The Control Set

Prompt-boundary DLP intercepts sensitive content before it reaches the model — pattern-matching on PII, credentials, and classified markers at the API layer. It's the most operationally tractable control and the most likely to generate false positives that erode user adoption. Tenant isolation ensures that one organization's data — including their embedding store and fine-tuning artifacts — cannot be accessed by another tenant's queries. Retention windows govern how long inference-time logs persist; shorter windows reduce exposure but complicate incident investigation. None of these controls address the ZDR architectural question. They govern what happens to data that enters the system; ZDR governs whether it stays.
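A prompt-boundary DLP check can be as simple as a set of regular expressions applied before the API call. The patterns below are illustrative assumptions (a US Social Security number shape, an AWS-style access key ID prefix, and a simple classification banner), not a vetted rule set; production tooling would add validation and context checks to reduce false positives.

```python
import re

# Illustrative patterns only; each is an assumption about what to flag.
PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "classification_marker": re.compile(
        r"\b(?:TOP SECRET|SECRET|CONFIDENTIAL)//[A-Z]+"
    ),
}


def scan_prompt(prompt):
    """Return (pattern_name, start, end) for every match in the prompt."""
    findings = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(prompt):
            findings.append((name, match.start(), match.end()))
    return findings


def redact(prompt):
    """Replace every match with a labeled placeholder before sending."""
    for name, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED:{name}]", prompt)
    return prompt


# Hypothetical prompt containing a fake SSN and a fake key ID.
prompt = "Summarize the file for SSN 123-45-6789 using key AKIAABCDEFGHIJKLMNOP."
findings = scan_prompt(prompt)
safe_prompt = redact(prompt)
print(findings)
print(safe_prompt)
```

The false-positive problem shows up immediately: any nine-digit identifier formatted as 3-2-4 digits, such as a part number, matches the SSN pattern. That is the adoption cost noted above.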

The gap between a contractual commitment and a technical architecture is where enterprise data actually lives. Knowing which one you've purchased is the first step to knowing what you're managing.