Same Weights, Different Envelope

By Leigh Garrity, May 9, 2026

How hosting location determines jurisdiction, logging, breach surface, and whether your agentic workflows finish before the meeting ends

Four things change when a model moves from one hosting environment to another: the legal jurisdiction governing data requests, the logging posture that determines who can produce records after an incident, the breach surface that determines what an attacker who compromises the provider can actually see, and the round-trip latency that determines whether an agentic workflow completes in seconds or minutes. The model weights don't change. The math is identical. The legal and security envelope around the inference is a different matter entirely.

You'll encounter these four dimensions at different moments in a federal buyer conversation. Jurisdiction surfaces in procurement, usually from legal or the contracting officer. Logging posture surfaces in security review, usually from the ISSO or the incident response team. Breach exposure surfaces when the CISO asks the question they always ask: "What's the blast radius if the provider gets hit?" Latency surfaces last, usually in a technical review, and it's the one that surprises people most — because the relevant latency isn't between the model and the user. It's between the model and the tools it calls. For agentic workflows, that distinction is the whole ballgame.

Precise language in each of these conversations buys you credibility with the specific stakeholder in the room. "The model is the same file everywhere; what we're evaluating is the legal and security envelope around the inference" is a sentence that lands differently than "it's secure." The first one tells the buyer you understand what you're talking about. The second one tells them you don't.

The Four Dimensions

Legal Jurisdiction and Data Sovereignty

What it is: The legal framework that governs who can compel the provider to disclose data processed during inference.

What it does: Determines which government can issue a lawful order requiring the provider to produce prompts, completions, fine-tuning data, or usage logs. For federal buyers, this is often the first question and the one that surfaces earliest in procurement. The relevant instrument in the US context is the CLOUD Act (2018), which allows US authorities to compel providers subject to US jurisdiction, including US-incorporated providers, to produce data in their possession, custody, or control regardless of where that data is physically stored. EU-based providers operating under GDPR face the inverse constraint: they may be legally prohibited from complying with certain US requests. For a federal agency deploying AI in a multinational context, these two legal regimes can pull in opposite directions simultaneously.

Who controls it: The provider, through where they incorporate and where they operate infrastructure. The customer controls it only insofar as they choose a provider whose legal domicile and infrastructure location align with their requirements. Signing a data processing agreement doesn't change jurisdiction; it documents what the provider will do within the jurisdiction that already applies.

What makes it distinct from the other three dimensions: The risk here comes from authorized access. A lawful data request isn't a breach. The provider may be legally required to comply and legally prohibited from notifying the customer. The other three dimensions are about unauthorized access or operational performance. Jurisdiction is about what happens when the system works exactly as designed.

Audit and Logging Posture

What it is: The completeness, retention period, and accessibility of records documenting what the model processed and when.

What it does: Determines whether you can reconstruct an incident. What prompt was sent. What completion was returned. What tools were called. What data was accessed. For federal buyers, this connects directly to FISMA incident response requirements and, for some agencies, to records retention obligations under the Federal Records Act. The gap between "logs exist" and "logs are useful" is enormous in AI deployments — a provider can have extensive logging infrastructure and still not produce records in a format that's useful for forensics, or retain them for a period that matches the customer's legal hold requirements.

Who controls it: Varies by hosting scenario, and buyers frequently underestimate their exposure here. In a fully managed API scenario, the provider controls log retention, format, and access. The customer gets what the provider decides to surface. In a cloud-hosted scenario (Azure OpenAI, AWS Bedrock), the cloud provider's native logging tools apply, but the customer has to configure them — and the default configuration is often not sufficient for federal incident response requirements. In a self-hosted scenario, the customer controls everything, including the obligation to build and maintain the logging infrastructure.

What makes it distinct: Accountability is genuinely ambiguous in ways the other dimensions aren't. For jurisdiction, it's clear: the provider's legal domicile determines who can compel disclosure. For breach exposure, it's clear: the provider controls the infrastructure. For latency, it's clear: the customer controls hosting choice. For logging, the question "who's responsible for producing records if something goes wrong?" often doesn't have a clean answer until after something goes wrong.
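One way to make the gap between "logs exist" and "logs are useful" concrete is to write down the fields an incident responder would need and compare the provider's retention period against the customer's hold period. The sketch below is a minimal illustration; the `InferenceLogEntry` schema, its field names, and both helper functions are hypothetical and do not reflect any provider's actual log format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class InferenceLogEntry:
    # Hypothetical schema: field names are illustrative, not any provider's format.
    timestamp: datetime
    request_id: str
    prompt: str
    completion: str
    tool_calls: list[str] = field(default_factory=list)
    data_accessed: list[str] = field(default_factory=list)


def missing_for_forensics(entry: InferenceLogEntry) -> list[str]:
    """Return the names of core reconstruction fields that are empty in this entry."""
    required = ("request_id", "prompt", "completion")
    return [name for name in required if not getattr(entry, name)]


def retention_gap(provider_retention: timedelta, legal_hold: timedelta) -> timedelta:
    """How much longer the customer must keep records than the provider does (zero if none)."""
    return max(legal_hold - provider_retention, timedelta(0))


# Example: an entry where the provider surfaced the prompt but not the completion.
entry = InferenceLogEntry(
    timestamp=datetime(2026, 5, 9, tzinfo=timezone.utc),
    request_id="r-1",
    prompt="Summarize the attached memo.",
    completion="",
)
print(missing_for_forensics(entry))
print(retention_gap(timedelta(days=30), timedelta(days=365)).days, "days uncovered")
```

With these illustrative numbers, a 30-day provider retention against a 365-day hold leaves 335 days that the customer would have to cover by exporting and storing the logs themselves.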

Okta Concept Mapping — System Log Analog

Okta's System Log produces a documented, schema-stable record of every authentication and authorization event, with configurable retention and export to SIEM. Federal customers use it as the authoritative source for FISMA incident response. The analog in an AI deployment is the model provider's inference log — but unlike Okta's System Log, AI provider logs vary significantly in schema, completeness, and retention period across providers and even across product tiers within a single provider. The gap matters when an incident response team needs to reconstruct what the model processed during a specific window. Before a buyer signs with a provider, the question to ask is: "Can you show me a sample inference log entry, and what's your documented retention period?" If the answer is vague, that's the answer.

Breach Exposure

What it is: The data an attacker can access if they successfully compromise the hosting provider's infrastructure.

What it does: Defines the blast radius of a provider-level breach — not a breach of the customer's own systems, but a breach of the provider. The model weights themselves are rarely the sensitive asset. The sensitive assets are the prompts the customer sent, the completions the model returned, the fine-tuning data the customer provided, and the inference logs that document all of the above. In a shared multi-tenant inference environment, the additional question is whether a breach of one tenant's data exposes another tenant's data.

Who controls it: The provider controls the architecture. The customer controls what they send to the model — which is the primary lever they have over their own exposure. A customer who sends de-identified queries to a shared API endpoint has a materially different breach profile than a customer who sends raw PII-containing documents. This is a design decision, not a procurement decision, and it's one that most buyers haven't made explicitly before the security review.
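The "control what you send" lever can be sketched as a redaction step that runs before a prompt leaves the customer's environment. This is a minimal illustration under stated assumptions: the two regular expressions and the `redact` helper are hypothetical, they cover only US Social Security numbers and email addresses, and a real deployment would need a vetted de-identification tool and review.

```python
import re

# Illustrative patterns only: real de-identification needs far broader coverage.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(prompt: str) -> str:
    """Replace each matched identifier with a bracketed placeholder before sending."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt


print(redact("Contact jane.doe@example.gov about SSN 123-45-6789."))
# prints: Contact [EMAIL] about SSN [SSN].
```

Whatever the provider retains or an attacker later reads is the placeholder text, not the identifier, which is the sense in which this is a design decision rather than a procurement one.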

What makes it distinct: Multi-tenancy is the variable that doesn't appear in the other three dimensions. Jurisdiction applies regardless of whether the infrastructure is shared. Logging posture applies regardless of whether the infrastructure is shared. Latency is indifferent to tenancy. Breach exposure is where the question "are other customers' data and my data on the same infrastructure?" has direct consequences. Most providers don't publish enough detail about their inference architecture to answer this definitively. What you can evaluate: whether the provider offers dedicated inference endpoints, whether those endpoints are documented as isolated from shared infrastructure, and whether that isolation is within the authorization boundary of any relevant compliance certification.

Okta Concept Mapping — Tenant Isolation

Okta's multi-tenant architecture provides logical separation between customer tenants: a breach of one tenant's credentials doesn't expose another tenant's directory data. The architecture is documented and the isolation model is part of what FedRAMP authorization covers. The question for AI providers is whether their multi-tenant inference infrastructure provides equivalent isolation — and the honest answer is that most providers' inference isolation is less thoroughly documented than their storage isolation. The analogy holds for framing the question; it breaks when you expect the same level of published architectural detail. In a buyer conversation: if the CISO asks about tenant isolation, you can use the Okta framing to make the concept legible, but don't imply the answer is the same.

Agent Round-Trip Latency

What it is: The cumulative network delay introduced by the distance between the model and the tools or data sources it calls during an agentic task.

What it does: Determines whether an agentic workflow completes in a useful timeframe. This is not the latency between the model and the user, which is the number that appears in benchmark comparisons and marketing materials. The relevant latency for agentic deployments is between the model and the APIs, databases, and services it calls to complete a task. A single round-trip at 80ms versus 8ms is imperceptible. Twenty sequential round-trips at 80ms versus 8ms is 1.6 seconds versus 160ms of pure network overhead, before any inference time. A 40-step task (not unusual for a complex document review or a multi-system lookup) doubles both figures, to 3.2 seconds versus 320ms.
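The arithmetic can be checked with a few lines of Python. This is a sketch that assumes one network round-trip per step and strictly sequential calls, with no parallelism or connection reuse effects; the function name `network_overhead_ms` is illustrative.

```python
def network_overhead_ms(steps: int, rtt_ms: float) -> float:
    """Cumulative network delay for sequential tool calls, excluding inference time."""
    return steps * rtt_ms


for steps in (1, 20, 40):
    far = network_overhead_ms(steps, 80)
    near = network_overhead_ms(steps, 8)
    print(f"{steps} steps: {far:.0f} ms at 80 ms RTT, {near:.0f} ms at 8 ms RTT")
```

The overhead grows linearly with step count, so the absolute gap between the two placements widens with every additional tool call.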

Who controls it: The customer, through hosting choice and architecture. Most buyers are missing this: the model should be close to the tools, not close to the user. A federal agency whose data sources are in a GovCloud region and whose model is hosted in a commercial region is adding cross-region latency to every tool call the agent makes. The user experience is fine. The agentic workflow is slow, and it gets slower as task complexity increases.
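The "close to the tools, not close to the user" rule can be illustrated by comparing two placements. The sketch below assumes one user round-trip per task plus one sequential round-trip per tool call; the `Placement` type and every latency figure in it are hypothetical, not measurements of any real region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    name: str
    rtt_to_user_ms: float   # paid once per task: request in, final answer out
    rtt_to_tools_ms: float  # paid once per tool call the agent makes


def task_network_ms(p: Placement, tool_calls: int) -> float:
    """Network delay for one task: a single user round-trip plus sequential tool calls."""
    return p.rtt_to_user_ms + tool_calls * p.rtt_to_tools_ms


# Illustrative numbers only.
near_user = Placement("near the user", rtt_to_user_ms=10, rtt_to_tools_ms=80)
near_tools = Placement("near the tools", rtt_to_user_ms=90, rtt_to_tools_ms=8)

for calls in (1, 20, 40):
    best = min((near_user, near_tools), key=lambda p: task_network_ms(p, calls))
    print(f"{calls} tool calls: best placement is {best.name}")
```

Under these assumed figures the placement near the user wins only for the single-call case; once the agent makes many tool calls, the per-call cost dominates and the placement near the tools wins.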

What makes it distinct: Latency is operationally invisible until you're in production. Jurisdiction and logging posture are evaluated during procurement. Breach exposure is evaluated during security review. Latency doesn't surface until someone runs an agentic workflow at scale and notices that tasks that should take 30 seconds are taking 4 minutes. By that point, the hosting decision has already been made.

Okta Concept Mapping — AI Feature Hosting

Okta's AI-powered features — Identity Threat Protection, AI-assisted access reviews, and related capabilities — run inference in Okta's infrastructure. The hosting location of that inference is Okta's architectural decision, not the customer's. For customers in FedRAMP High environments, the question of where Okta's AI inference runs is the same question as where any other AI inference runs: jurisdiction, logging posture, breach exposure, and latency all apply. If a federal buyer asks whether Okta's AI features are within the FedRAMP authorization boundary, the answer requires checking the current authorization package — not assuming that "Okta is FedRAMP authorized" covers every feature. "It's FedRAMP authorized" is not a complete answer to any of the four dimensions.

Comparison Strategy: Scenario Mapping

This section uses scenario mapping — organizing the four dimensions by buyer conversation context — because the dimensions don't cluster by shared trait; they cluster by who's asking the question and when.

The four dimensions surface at different moments in a federal procurement cycle, with different stakeholders, and the dimension that dominates the conversation depends on who's in the room.

Procurement and legal review: Jurisdiction dominates. The contracting officer and legal counsel are asking whether the provider's legal domicile creates disclosure risk. Logging posture is secondary — what records will exist if the agency needs to produce them in litigation or an OIG investigation? Breach exposure and latency are not yet on the table.

Security review and CISO meeting: Breach exposure dominates. The CISO wants to know the blast radius of a provider compromise. Logging posture is immediately secondary — can we reconstruct what happened? Jurisdiction surfaces as a third question: if the provider is breached and we need records, who else can compel disclosure of those same records? Latency is not yet relevant.

Architecture and technical review for agentic use cases: Latency dominates, and it's the one that surprises technical buyers most. The architect who has been thinking about model performance in terms of tokens-per-second and time-to-first-token hasn't necessarily thought about cumulative round-trip latency for a 30-step agentic workflow. Breach exposure resurfaces here in a different form: the agent has access to tools and data sources, and the question of what an attacker who compromises the model's hosting environment can see now includes everything the agent can reach.

Know which dimension is live before you walk into the room. A procurement conversation where you lead with latency is a conversation that's going to go sideways. A CISO meeting where you lead with jurisdiction before you've addressed breach exposure is a meeting where you've misread the room.
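For quick reference, the scenario mapping above can be written down as a small lookup table. This is only a restatement of the three conversations described in this section; the dictionary keys and the `lead_with` helper are illustrative names.

```python
# Order of the dimensions within each conversation, as described in this section.
DIMENSIONS_BY_CONVERSATION = {
    "procurement_legal": ["jurisdiction", "logging"],
    "security_ciso": ["breach_exposure", "logging", "jurisdiction"],
    "architecture_agentic": ["latency", "breach_exposure"],
}


def lead_with(conversation: str) -> str:
    """Return the dimension that dominates the given conversation."""
    return DIMENSIONS_BY_CONVERSATION[conversation][0]


print(lead_with("security_ciso"))
```

Dimensions missing from a list are the ones the section describes as not yet on the table for that conversation.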

How to Say This in the Field

Don't say	Do say	Why it matters
"The AI is secure."	"The model weights are identical across hosting scenarios. What we're evaluating is the legal and security envelope around the inference — jurisdiction, logging, and breach surface."	The first sentence is meaningless. The second one tells the buyer you know what you're talking about.
"Your data is protected by encryption."	"Encryption covers data in transit and at rest. It doesn't address who can compel the provider to produce inference logs, or what an attacker who breaches the provider's infrastructure can access."	Encryption is table stakes. Buyers who ask about security want to know about the things encryption doesn't cover.
"It's FedRAMP authorized."	"The hosting infrastructure is FedRAMP authorized. You'll want to verify that the specific model endpoint is within the authorization boundary — not every AI feature on a FedRAMP-authorized platform is automatically in scope."	"FedRAMP authorized" is a property of a specific authorization boundary, not a blanket property of a provider.
"The model doesn't store your data."	"The model doesn't use your prompts to train future versions — that's a separate question from whether the provider retains inference logs, and for how long."	These are two different things. Buyers conflate them. Separating them is how you demonstrate precision.
"We can move it to a different region if needed."	"Region affects latency to your tools and data, not just latency to your users. For agentic workflows, the model needs to be close to the APIs it calls, not just close to the person running the task."	Most buyers are thinking about user-facing latency. The latency that matters for agents is tool-facing.
"The provider handles security."	"The provider handles infrastructure security. You control what you send to the model — that's the primary lever you have over your own breach exposure."	Buyers need to understand they have agency here, and it opens the conversation about data minimization in prompts.
"It's hosted in the US."	"US hosting covers jurisdiction for most federal requirements. You also need to verify logging posture — what records exist, who can produce them, and what the retention period is."	Jurisdiction is one of four dimensions. Answering one doesn't answer the others.
"Agents are just automation."	"Agents make dozens of API calls per task. Each call adds network latency. A model hosted 80ms from your data sources adds over a second of pure network delay to a 20-step task — before any inference time."	This reframes latency as an architecture concern, not a performance benchmark. That's the framing that lands with technical buyers.
"We'll figure out the compliance details in implementation."	"Jurisdiction and logging posture are procurement decisions. Once you've signed with a provider, your options for changing the legal envelope are limited."	True, and the buyer needs to hear it before they sign.
"It's the same as using any cloud service."	"The inference log is different from a typical cloud service log. You need to verify what the provider retains, in what format, and whether it's accessible to you for incident response."	AI inference logs are not the same as storage access logs or API gateway logs. The comparison misleads.
"Our solution is compliant."	"The hosting infrastructure meets the authorization requirements. The compliance posture for the AI features specifically depends on which endpoints are in scope — that's worth confirming with the provider before procurement."	"Compliant" is a property of a specific configuration relative to a specific control set, not a property of a product.
"Latency won't be an issue."	"For single-turn queries, latency is fine. For agentic workflows with multiple tool calls, the model's proximity to your data sources matters more than its proximity to your users — and that's worth checking against your architecture before you commit to a hosting region."	Precise and honest. That combination earns trust with technical buyers faster than confidence does.

Slot 4.4 covers the four hosting scenarios themselves. This piece covers what changes across them — and what doesn't.