Data Governance: Residency, Retention, and What Leaves Your Perimeter

Separates data residency, sovereignty, and retention for sellers, then proves why contractual zero data retention and technical enforcement produce different outcomes under legal pressure.

By Leigh Garrity, May 8, 2026

Your buyer says "data sovereignty" and means three different things in the same sentence. One is about geography. One is about law. One is about time. Until you can hear which one they actually mean, you're having three conversations at once and losing all of them.

This lesson separates the three concepts that get routinely conflated in enterprise AI procurement conversations: data residency, data sovereignty, and data retention. Then it builds toward the distinction that matters most when you're sitting across from a CISO: the gap between a provider's contractual commitment not to retain data and the technical implementation that actually prevents storage from happening.

A court order in 2025 proved these produce different outcomes. We'll get there.

Three Concepts, Three Different Controls

Data residency is where your data physically lives. Country, region, data center. When a prompt leaves your buyer's network and hits an AI provider's inference endpoint, residency answers one question: what geographic location processes and stores that request?

Sounds simple until you look at the options. Azure OpenAI offers three deployment tiers — a Global deployment can route inference across regions for throughput optimization, a Data Zone deployment stays within a geographic boundary (EU or US) but load-balances across countries within it, and only a Regional deployment guarantees single-region processing. OpenAI's API offers regional endpoints across the UK, US, Japan, Canada, South Korea, Singapore, Australia, India, and UAE, with a pricing uplift on newer models. AWS Bedrock processes within the customer's selected AWS region, with data staying in the customer's own account. Anthropic's direct API does not currently publish a comparable regional endpoint matrix. The word "region" means different things depending on which provider said it.

Data sovereignty is whose laws govern that data. It is the jurisdictional question, and residency alone cannot answer it. You can store data in Frankfurt and still be subject to a U.S. court order if the provider is a U.S.-headquartered company. Microsoft acknowledges this directly in its Q&A documentation: as a U.S. entity, it remains subject to the CLOUD Act regardless of where the Azure resource is deployed. Residency controls geography. Sovereignty follows the corporate charter. Some EU public sector buyers are specifically seeking EU-headquartered providers or sovereign cloud offerings for exactly this reason. When your buyer says "we need the data to stay in Europe," the underlying concern is often "we need it outside U.S. legal reach." Those are different requirements with different solutions, and you should know which one you're solving for.

Data retention is how long the provider keeps your data after processing. When a prompt hits an inference endpoint, is it logged? For how long? By whom? Can that log be compelled by a third party? Retention is where the contractual and technical distinction gets sharp, because "we delete it after 30 days" and "we never stored it in the first place" are fundamentally different postures under legal pressure.

Residency: the physical location where data is processed and stored — geography, not jurisdiction
Sovereignty: the legal jurisdiction governing data, determined by the provider's corporate domicile, not the data center's address
Retention: how long provider systems hold your data after the API response is returned — the time dimension that most buyers forget to specify
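One way to keep the three controls from collapsing into each other is to model them as separate fields and check each one independently. The sketch below is illustrative only: the class names, field names, and region labels are hypothetical and do not correspond to any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posture:
    region: str          # residency: where data is processed and stored
    legal_domicile: str  # sovereignty: jurisdiction of the provider's corporate home
    retention_days: int  # retention: days held after the response; 0 = never stored


@dataclass(frozen=True)
class Requirement:
    allowed_regions: frozenset
    excluded_domiciles: frozenset
    max_retention_days: int


def unmet(req: Requirement, posture: Posture) -> list:
    """Name each dimension on which the posture fails the requirement."""
    gaps = []
    if posture.region not in req.allowed_regions:
        gaps.append("residency")
    if posture.legal_domicile in req.excluded_domiciles:
        gaps.append("sovereignty")
    if posture.retention_days > req.max_retention_days:
        gaps.append("retention")
    return gaps


# Hypothetical example: data hosted in Frankfurt by a U.S.-headquartered
# provider that keeps logs for 30 days.
frankfurt_us_provider = Posture(region="eu-central", legal_domicile="US", retention_days=30)
eu_buyer = Requirement(
    allowed_regions=frozenset({"eu-central", "eu-west"}),
    excluded_domiciles=frozenset({"US"}),
    max_retention_days=0,
)

print(unmet(eu_buyer, frankfurt_us_provider))  # ['sovereignty', 'retention']
```

The Frankfurt deployment passes the residency check and still fails the other two, which is the situation a buyer describes when "keep it in Europe" really means "keep it outside U.S. legal reach."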

The Provider Retention Landscape

Lesson 2 introduced DLP as a gateway function: the control that inspects and filters what data reaches the AI provider in the first place. That's the perimeter question. Everything in this lesson is the post-perimeter question: once a prompt gets through the gateway, where does it go, how long does it stay, and who can compel its production?

Every major enterprise AI provider now makes the same baseline commitment: we don't use your data to train models. At the enterprise tier, this is table stakes. But "not training on your data" and "not retaining your data" are separate controls, and the second is harder to get. The path to it looks different at every provider.

OpenAI API retains abuse monitoring logs for up to 30 days by default (per OpenAI's official platform documentation on data controls). API data is not used for training. Zero data retention requires a separate contractual amendment and prior approval. Regional endpoints are available across nine countries, though non-US residency requires a ZDR amendment.

Anthropic Claude API cut its default retention from 30 days to 7 days on September 14, 2025 (per Anthropic's official API documentation). API inputs and outputs are never used for training. A ZDR addendum is available for qualifying enterprise customers. One wrinkle: Anthropic still retains User Safety classifier results even under ZDR. These are outputs from automated content classifiers, not prompt text itself, but they are metadata derived from analyzing the prompt and could indicate the nature of the query. When Claude is accessed through AWS Bedrock rather than Anthropic's direct API, Bedrock's own data handling governs, not Anthropic's 7-day policy.

AWS Bedrock is the structural outlier. It doesn't store or log prompts and completions by default (per AWS's official security and privacy page). The architecture uses isolated Model Deployment Accounts where model providers have zero access to customer data. If you want logging, you opt in and it goes to your own S3 bucket or CloudWatch. Processing stays within the customer's selected AWS region. No retention is the default.

Azure OpenAI retains prompts and completions for up to 30 days for abuse monitoring. For EU deployments, the EU Data Boundary ensures that customer data is stored and processed in EU/EFTA datacenters, and authorized reviewers are also located in the EEA. Achieving ZDR requires Modified Abuse Monitoring approval under the Limited Access program.

That Azure path deserves its own paragraph. Modified Abuse Monitoring is not a self-serve toggle. It requires an Enterprise Agreement or Microsoft Customer Agreement, a managed Microsoft account team, and a formal application process. Approval is granted per Azure OpenAI resource. When approved, human review is disabled entirely and prompts are no longer stored for abuse monitoring. If your buyer assumes they can flip a setting in the Azure portal and achieve ZDR, they are wrong. You should be the one who tells them.

Provider	Default Retention	Training Use	ZDR Available	Regional Options
OpenAI API	Up to 30 days	No (by default)	Yes (contract amendment + approval)	9 countries; non-US requires ZDR
Anthropic Claude API	7 days	Never	Yes (enterprise ZDR addendum)	Not publicly documented for direct API
AWS Bedrock	None	Never (by AWS)	Default is no retention	Customer's selected AWS region
Azure OpenAI	Up to 30 days	No (without consent)	Yes (Limited Access; EA/MCA required)	Global, Data Zone, or Regional deployment

Retention figures are set by contract and change over time. Verify them against each provider's DPA before using them in a live conversation.
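The table can also be read as a lookup: given a buyer's maximum acceptable retention, which providers need extra contractual work before they qualify? The sketch below copies the default retention figures and ZDR paths from the table above; the dictionary layout and function name are hypothetical, and the figures carry the same need for verification as the table itself.

```python
# Default retention (days) and ZDR path, as listed in the table above.
# 0 means no retention by default; None means no extra ZDR step is needed.
PROVIDERS = {
    "OpenAI API": {"default_retention_days": 30, "zdr_path": "contract amendment + approval"},
    "Anthropic Claude API": {"default_retention_days": 7, "zdr_path": "enterprise ZDR addendum"},
    "AWS Bedrock": {"default_retention_days": 0, "zdr_path": None},
    "Azure OpenAI": {"default_retention_days": 30, "zdr_path": "Limited Access; EA/MCA required"},
}


def needs_zdr_work(max_days: int) -> list:
    """Providers whose default retention exceeds the buyer's limit."""
    return sorted(
        name
        for name, policy in PROVIDERS.items()
        if policy["default_retention_days"] > max_days
    )


print(needs_zdr_work(0))  # ['Anthropic Claude API', 'Azure OpenAI', 'OpenAI API']
print(needs_zdr_work(7))  # ['Azure OpenAI', 'OpenAI API']
```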

The Court Order That Made the Distinction Visible

On June 5, 2025, OpenAI COO Brad Lightcap published a blog post responding to a court order in the New York Times copyright lawsuit. U.S. Magistrate Judge Ona T. Wang of the Southern District of New York had ordered OpenAI to preserve all output log data, including chats that users had deleted. The order affected ChatGPT Free, Plus, Pro, and Team users. It affected API customers without ZDR agreements.

It did not affect ZDR customers. Lightcap's explanation was blunt:

"Because it is not stored, this court order doesn't affect that data."

ZDR data was protected for a simpler reason: it didn't exist. Nothing to preserve, nothing to produce, nothing to subpoena.

Non-ZDR customers had a different experience. Data that OpenAI would normally have deleted after 30 days was preserved under legal hold from May through September 2025. Prompts that users believed were transient became indefinitely retained because a judge said so. (The preservation order was lifted on September 26, 2025, per an update to the same OpenAI blog post. The proof point is historical, but the legal logic is durable.)

This is the distinction worth internalizing. A contractual commitment says "we will delete your data after N days." A technical implementation says "we never wrote it to disk." The first can be overridden by a court order, a litigation hold, a regulatory demand. The second cannot be overridden because there is nothing to override.

ZDR requires both layers. The contract establishes the obligation. The technical architecture enforces it by making storage not happen. A contract without technical enforcement is a promise. Technical enforcement without a contract is an undocumented behavior that could change on the next release.
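The difference between the two postures can be written down as a small decision function. This is a simplified model rather than a description of any provider's system; the function and parameter names are hypothetical.

```python
from typing import Optional


def effective_retention(
    written_to_storage: bool, policy_days: int, legal_hold: bool
) -> Optional[int]:
    """Days the data can exist on provider systems; None means indefinitely."""
    if not written_to_storage:
        # Technical enforcement: nothing exists, so nothing can be preserved.
        return 0
    if legal_hold:
        # A preservation order suspends the contractual deletion schedule.
        return None
    return policy_days


print(effective_retention(True, 30, legal_hold=False))   # 30
print(effective_retention(True, 30, legal_hold=True))    # None
print(effective_retention(False, 30, legal_hold=True))   # 0
```

The second and third calls are the two outcomes of the 2025 preservation order: stored data under a 30-day policy became indefinitely retained, while data that was never stored was unaffected.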

One more thing. Even with ZDR enabled, stateful features can break the guarantee. OpenAI's platform documentation is explicit: endpoints and features like the Assistants API, Threads, file uploads, vector stores, and fine-tuning artifacts may store application state regardless of ZDR status. Azure's Foundry documentation says the same for features like agents, file uploads, and evaluation artifacts. And for OpenAI's newest models (gpt-5.5 and forward), extended prompt caching is mandatory. It stores encrypted key/value tensors on GPU-local storage for up to 24 hours, which puts it technically outside the traditional ZDR scope even though the data is encrypted and ephemeral. ZDR covers the abuse-monitoring storage layer. It does not override the inherent storage requirements of features that are stateful by design.

Contractual vs. technical: a contract says "we will delete"; a technical architecture says "we never stored" — the OpenAI court order proved these produce materially different outcomes under legal compulsion
Stateful exceptions: ZDR covers abuse-monitoring logs, not stateful features like file uploads, vector stores, fine-tuning artifacts, or mandatory prompt caching on newer models — each maintains its own storage outside ZDR scope
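A quick way to apply the stateful-exception rule to a buyer's planned architecture is to intersect the features they intend to use with the ones that keep their own storage. The feature labels below are hypothetical shorthand for the categories named above, not provider API identifiers, and the set should be checked against current provider documentation.

```python
# Hypothetical labels for features that store state outside ZDR scope.
OUTSIDE_ZDR_SCOPE = frozenset({
    "file_uploads",
    "vector_stores",
    "fine_tuning_artifacts",
    "extended_prompt_caching",
})


def zdr_gaps(features_in_use) -> list:
    """Return the features in use that keep their own storage despite ZDR."""
    return sorted(set(features_in_use) & OUTSIDE_ZDR_SCOPE)


print(zdr_gaps(["chat_completions", "vector_stores", "file_uploads"]))
# ['file_uploads', 'vector_stores']
```

An empty result means the planned usage stays within the abuse-monitoring layer that ZDR actually covers; anything else is a retention conversation the buyer still needs to have.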

Okta Concept Mapping: Session Policy vs. Session Enforcement

The contractual-versus-technical split in ZDR maps to session lifetime policy versus session revocation enforcement. You configure a session to expire after 8 hours (the contractual layer), but whether the IdP actually invalidates the token and downstream SPs stop honoring cached assertions is the technical enforcement. The analogy breaks here: in IDAM, you manage a discrete token with a defined lifecycle. In AI inference, a prompt touches multiple storage layers (abuse monitoring, classification, caching, stateful features), each with its own retention behavior. There's no single artifact to expire.

When You'll Need This

You're in a meeting with a civilian agency CISO. They're evaluating an AI platform for internal document summarization. The CISO says: "We need to make sure our data doesn't leave the country and isn't used to train anyone's model."

That sentence contains a residency requirement and a training-use requirement. It does not contain a retention requirement or a sovereignty requirement. The CISO probably means all four.

Three questions that move this conversation forward:

"Are you looking for geographic processing guarantees, or also zero retention after inference?" This separates residency from retention. A regional Azure deployment keeps processing in-country but still retains abuse monitoring logs for 30 days unless Modified Abuse Monitoring is approved. Bedrock doesn't retain by default. The answer changes the architecture.

"Does your counsel have a position on provider jurisdiction, or just data location?" The sovereignty gap lives here. If the provider is U.S.-headquartered, CLOUD Act exposure exists regardless of where the data center sits. Some buyers know this and have accepted the risk. Some haven't thought about it. You need to know which conversation you're in.

"When you say 'zero retention,' do you mean the abuse monitoring layer, or are you also thinking about stateful features like file storage and fine-tuning?" This is the question that demonstrates you've read the documentation they haven't read yet. Most buyers haven't thought past the headline ZDR commitment. Naming the stateful-feature exception shows depth that most sellers in the room won't have.

The answers matter less than the questions. Asking the right ones proves you understand the topology of the problem. The CISO has been in meetings all week with vendors who said "we don't train on your data" and assumed that closed the conversation. You're the person who knows that's the floor, and there's a whole building above it.

And when the conversation turns to who authorized the agent to access that system, what credentials it used, whether the access was logged in a way that satisfies audit requirements, your IDAM fluency becomes the center of the conversation. Data governance tells you what happens to the prompt after it leaves. Identity governance tells you who was allowed to send it. The buyer needs both, and you're one of the few people in the room who can connect them.

Practical application: separate the buyer's stated requirement into its residency, sovereignty, and retention components — they will arrive conflated in a single sentence
Your edge: the questions that distinguish "not training" from "not retaining" from "never stored" demonstrate a depth most sellers in the room won't have
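The decomposition exercise can be sketched as a simple triage over the buyer's own words. This is a toy keyword matcher for illustration, not a real classifier; the cue phrases are hypothetical, and a live conversation needs the follow-up questions above rather than string matching.

```python
# Hypothetical cue phrases per requirement type; extend for real use.
CUES = {
    "residency": ("leave the country", "stay in", "in-region"),
    "sovereignty": ("jurisdiction", "cloud act", "legal reach"),
    "retention": ("retain", "retention", "delete", "stored"),
    "training use": ("train",),
}


def triage(statement: str):
    """Split requirement types into those stated and those left unstated."""
    text = statement.lower()
    stated = [kind for kind, cues in CUES.items() if any(cue in text for cue in cues)]
    unstated = [kind for kind in CUES if kind not in stated]
    return stated, unstated


ciso = (
    "We need to make sure our data doesn't leave the country "
    "and isn't used to train anyone's model."
)
stated, unstated = triage(ciso)
print(stated)    # ['residency', 'training use']
print(unstated)  # ['sovereignty', 'retention']
```

The unstated list is the agenda for your next questions: the CISO's sentence named two requirements and left two unspoken.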

The Short Version

Data residency is where. Data sovereignty is whose law. Data retention is how long. Your buyer will say one word and mean all three.

The critical distinction is between contractual and technical enforcement of zero data retention. A contract that says "we delete after 30 days" can be overridden by a court order. An architecture that never writes to disk cannot. The OpenAI court order in mid-2025 made this distinction publicly, legally visible.

Every provider's path to ZDR is different. Some require contract amendments. Some require gated approval programs. One doesn't retain by default. Even with ZDR enabled, stateful features and newer model architectures store data outside its scope. The guarantee is narrower than the marketing suggests, and the buyer who understands the boundaries is the buyer who can actually govern what leaves their perimeter.

Things to follow up on...

OpenAI's expanding residency footprint: OpenAI added at-rest data residency across nine countries starting October 2025, but non-US residency requires a ZDR amendment and carries a 10% pricing uplift on newer models.
Anthropic's consumer vs. API divergence: Anthropic's September 2025 consumer terms update introduced a 5-year retention window for users who opt into model improvement, a policy that does not apply to API or enterprise customers but could confuse buyers who conflate the two tiers.
Stateful features and ZDR incompatibility: A confirmed GitHub issue in OpenAI's agents repo documents that ZDR organizations cannot use the Responses API's stateful conversation management, surfacing a practical constraint that will matter as agentic workflows scale.
Deloitte's sovereignty signal: Deloitte's 2025 State of AI in the Enterprise report found that 77% of enterprises now factor a vendor's country of origin into AI purchasing decisions, suggesting the sovereignty conversation is moving from legal counsel into procurement criteria.