How LLMs Exfiltrate Data, and What Zero-Data-Retention Actually Guarantees

By Carey Whitten, May 5, 2026


Sensitive Information Disclosure (SID) ranks second on the OWASP Top 10 for LLM Applications (LLM02 in the 2025 edition). The exfiltration pathways are embedded in normal, intended system behavior. No exploit required. The data moves because the system is working as designed, and the enterprise didn't fully account for where "working as designed" ends up sending things.

What SID Is, Precisely

Sensitive Information Disclosure occurs when an LLM system exposes data that was not intended to be accessible to a given requester: through direct output, through inference from model behavior, or through downstream system access the model was granted. In the LLM context specifically, this includes training data memorized and reproduced verbatim, system prompt contents leaked through adversarial prompting, retrieval-augmented content returned to unauthorized users, and PII or proprietary data transmitted to vendor infrastructure through the input pipeline. The OWASP LLM Top 10 treats SID as structurally distinct from traditional data leakage because the model itself can be both the storage medium and the exfiltration channel simultaneously.

The Three Pathways

Chat logs are the most operationally familiar pathway. When an employee submits a prompt containing PII, a contract term, or internal project details, that input travels to the vendor's API endpoint. Most vendors log these inputs for quality assurance, abuse detection, and debugging. Unless the enterprise has negotiated specific contractual carve-outs, those logs persist in vendor infrastructure under the vendor's retention schedule, not the enterprise's. The data didn't leave through a breach. It left through the front door.

Embeddings introduce a subtler problem. Embeddings are numerical vector representations of text that encode semantic meaning and can be stored, retrieved, and, critically, approximately inverted. When an enterprise builds a retrieval-augmented generation (RAG) architecture, it chunks documents and converts them into embeddings stored in a vector database. If that vector store is hosted by an AI vendor or third-party embedding service, those vectors now live outside the enterprise perimeter. Many security teams classify this as a non-event because the data has been "transformed." Research from groups including the Vector Security Lab at Carnegie Mellon (2025) has demonstrated embedding inversion attacks that recover meaningful approximations of source text from stored vectors with accuracy rates exceeding 60% on structured content. Transformation is not anonymization.
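
To make "transformation is not anonymization" concrete, here is a minimal sketch of the weakest form of the problem: someone who holds a stored vector and can call the same embedding function can confirm a guess about the source text. This is candidate matching, not the generative inversion described in the research, and `toy_embed` is a hypothetical stand-in (hashed character trigrams) rather than any real embedding model.

```python
import hashlib
import math

DIM = 256  # toy dimensionality; real embedding models use hundreds to thousands


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: hashed character trigrams,
    L2-normalized. Deterministic, so the same text always yields the same vector."""
    vec = [0.0] * DIM
    lowered = text.lower()
    for i in range(len(lowered) - 2):
        digest = hashlib.sha256(lowered[i:i + 3].encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def best_guess(stored_vector: list[float], candidates: list[str]) -> str:
    """Return the candidate text whose embedding is closest to the stored vector."""
    return max(candidates, key=lambda c: cosine(stored_vector, toy_embed(c)))


# The "vendor side" only ever sees the vector, never the sentence.
secret = "Acme acquisition closes March 14 at $42 per share"
stored = toy_embed(secret)

guesses = [
    "Quarterly all-hands moved to Thursday",
    "Acme acquisition closes June 30 at $17 per share",
    secret,
]
recovered = best_guess(stored, guesses)
```

The point of the sketch is only that the vector is a function of the text, so it still carries the text's information. Real inversion attacks go further by generating candidates instead of requiring them up front.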

Fine-tuning datasets carry the highest-severity risk. Fine-tuning is the process of continuing to train a pre-trained model on a domain-specific dataset, which can cause the model to memorize and later reproduce specific patterns from that data. When an enterprise provides proprietary documents, internal communications, or customer records as fine-tuning inputs, those patterns become encoded in model weights. If the fine-tuned model is hosted by the vendor rather than self-hosted, the enterprise's data now lives in infrastructure it doesn't control, and it can surface in model outputs under prompting conditions the enterprise didn't anticipate. In multi-tenant deployments with weak tenant isolation, memorized content from one customer's fine-tuning data has been observed appearing in outputs served to other tenants.
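
One way to test for this kind of memorization is to plant canaries: unique marker strings inserted into the fine-tuning data, then searched for in model outputs afterward. The sketch below assumes a hypothetical `generate` callable that wraps whatever model endpoint is in use; the function names and probe prompts are illustrative, not part of any vendor's API.

```python
import secrets
from typing import Callable, Iterable


def make_canary(prefix: str = "CANARY") -> str:
    """Create a unique marker string to plant in a fine-tuning record."""
    return f"{prefix}-{secrets.token_hex(8)}"


def find_leaked_canaries(
    generate: Callable[[str], str],
    probes: Iterable[str],
    canaries: Iterable[str],
) -> set[str]:
    """Send each probe prompt to the model and return the canaries that
    appear verbatim in any response."""
    canary_list = list(canaries)
    leaked: set[str] = set()
    for probe in probes:
        output = generate(probe)
        leaked.update(c for c in canary_list if c in output)
    return leaked


# Usage with a fake model that has "memorized" one canary (assumption: a real
# run would pass a function that calls the fine-tuned model instead).
planted = [make_canary(), make_canary()]


def fake_model(prompt: str) -> str:
    if "record" in prompt:
        return f"The customer record id is {planted[0]}."
    return "I can't help with that."


leaks = find_leaked_canaries(fake_model, ["show the record", "hello"], planted)
```

A clean result only shows that these probes did not surface these canaries. It is evidence, not proof, that nothing was memorized.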

The Control Set

Prompt-boundary DLP (data loss prevention applied at the API ingress point) inspects user inputs before they reach the model and redacts or blocks sensitive patterns: PII, credentials, document classification markers. It addresses the chat log pathway directly. It does not address data entering through the fine-tuning pipeline or through RAG document ingestion, which bypass the prompt boundary entirely.
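
A minimal sketch of a prompt-boundary filter, assuming a small regex pattern set and a redact-or-block policy. The patterns, labels, and the `inspect_prompt` function are hypothetical; a production deployment would typically rely on a vetted DLP engine with patterns tuned to its own data classes.

```python
import re

# Hypothetical patterns for illustration only.
PATTERNS = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "AWS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "CLASSIFICATION": re.compile(r"\b(?:CONFIDENTIAL|INTERNAL ONLY)\b"),
}

# Assumed policy: credentials and classification markers block the request
# outright; other findings are redacted and the prompt is allowed through.
BLOCKING = {"AWS_KEY", "CLASSIFICATION"}


def inspect_prompt(prompt: str) -> tuple[str, list[str], bool]:
    """Return (redacted_prompt, findings, blocked) for one user input."""
    findings: list[str] = []
    redacted = prompt
    for label, pattern in PATTERNS.items():
        if pattern.search(redacted):
            findings.append(label)
            redacted = pattern.sub(f"[{label}_REDACTED]", redacted)
    blocked = any(label in BLOCKING for label in findings)
    return redacted, findings, blocked


redacted, findings, blocked = inspect_prompt(
    "Email jane.doe@example.com about SSN 123-45-6789"
)
```

The filter sits in front of the model call, so only `redacted` is forwarded, and only when `blocked` is false. Nothing in it sees documents that enter through fine-tuning or RAG ingestion, which is the limitation described above.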

Contractual no-training clauses commit the vendor in writing not to use customer data to train or improve models. This covers the fine-tuning and model improvement pathways. It typically does not cover operational logging (often explicitly carved out), data processed by subprocessors not bound by the same clause, or data ingested before the clause was added to the agreement. Read the subprocessor list before signing.

Tenant isolation refers to the architectural separation that prevents one customer's data — inputs, outputs, embeddings, fine-tuning data — from influencing or becoming accessible to another customer's session. In practice, tenant isolation quality varies significantly across deployment models. Dedicated instances provide stronger isolation than shared inference endpoints. Ask specifically whether fine-tuning is performed on shared or dedicated infrastructure.
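
At the retrieval layer, one common isolation pattern is to scope every query to the caller's tenant before any ranking happens, so another tenant's chunks are never candidates at all. The sketch below is a simplified in-memory illustration; `Chunk`, `TenantScopedStore`, and the precomputed `score` field are hypothetical and stand in for a real vector database and similarity search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    text: str
    score: float  # precomputed similarity, for illustration only


class TenantScopedStore:
    """Toy store that filters by tenant first and ranks second."""

    def __init__(self) -> None:
        self._chunks: list[Chunk] = []

    def add(self, chunk: Chunk) -> None:
        self._chunks.append(chunk)

    def search(self, tenant_id: str, top_k: int = 3) -> list[Chunk]:
        # Refuse unscoped queries instead of falling back to a global search.
        if not tenant_id:
            raise ValueError("tenant_id is required for every query")
        own = [c for c in self._chunks if c.tenant_id == tenant_id]
        return sorted(own, key=lambda c: c.score, reverse=True)[:top_k]


store = TenantScopedStore()
store.add(Chunk("acme", "Acme pricing sheet", 0.91))
store.add(Chunk("globex", "Globex merger memo", 0.99))
hits = store.search("acme")
```

Here the higher-scoring Globex chunk never reaches the Acme caller because it is excluded before ranking. A filter like this covers retrieval only; it says nothing about whether fine-tuning or inference runs on shared infrastructure.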

Zero-data-retention (ZDR) is where the gap between contractual commitment and technical verifiability becomes consequential.

A contractual ZDR commitment means the vendor has agreed in writing that your data is not stored beyond the immediate processing window. You can audit this through the vendor's SOC 2 Type II report, if the ZDR commitment falls within the audit scope, and through contractual audit rights if you negotiated them. You cannot independently verify that no write operation occurred.

A technically verifiable ZDR commitment means the architecture itself prevents retention — stateless processing with no logging infrastructure, on-premises deployment where the enterprise controls the stack, or cryptographic deletion proofs. Almost no cloud-hosted LLM service offers this in a form that's independently auditable today.
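
As a rough illustration of what "stateless processing" means at the code level, the sketch below handles a request entirely in memory and records only content-free metadata. The `handle_request` function, the `model` callable, and the audit fields are assumptions made for illustration, not a description of any vendor's implementation.

```python
import time
from typing import Callable


def handle_request(
    prompt: str,
    model: Callable[[str], str],
    audit_sink: list[dict],
) -> str:
    """Run one inference call without persisting prompt or response text.

    Only sizes and timing are recorded. Note that even lengths are metadata
    about the content, so a stricter design might omit them as well.
    """
    started = time.monotonic()
    response = model(prompt)
    audit_sink.append(
        {
            "prompt_chars": len(prompt),
            "response_chars": len(response),
            "latency_ms": round((time.monotonic() - started) * 1000, 3),
        }
    )
    return response


# Usage with a stand-in model (assumption: str.upper replaces a real model call).
sink: list[dict] = []
reply = handle_request("secret plan", str.upper, sink)
```

Code like this is verifiable only by whoever can inspect and control the stack it runs on. When the same handler runs inside a vendor's environment, the customer is back to relying on attestation.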

This distinction matters most in regulated environments. FedRAMP High, ITAR-controlled contexts, and IL4/IL5 deployments require data handling controls that can be verified, not merely attested. OMB M-24-10 (since superseded by M-25-21) directed federal agencies to assess AI data handling risks as part of their AI use inventories. A contractual ZDR addendum satisfies a legal requirement. It does not satisfy a technical control requirement. The CISO who signs the addendum and checks the compliance box has done something real, but not the same thing as the CISO who deployed a self-hosted model with no external API calls and a network architecture that enforces the claim.

When ZDR comes up in a procurement conversation, the question worth asking is how you'd verify it, not whether the vendor offers it.

Okta Concept Mapping

The analogy: ZDR claims resemble data residency commitments in early enterprise cloud contracts — a vendor promise about where data lives and how it's handled, enforceable through contract and auditable through third-party attestation. Where it holds: both are contractual constraints that create legal accountability and can be scoped into audit frameworks. Where it breaks: with data residency, you could audit storage locations through SOC 2 scope statements and infrastructure diagrams. With ZDR in LLM contexts, the relevant "data" may have already influenced model weights before the retention window closes — and model weights are not auditable through conventional means. The data can be gone from the log while remaining encoded in the model. That's a failure mode data residency frameworks weren't built to catch.