Who Called the Model? The Identity Layer Your AI Gateway Is Missing

By Carey Whitten, May 5, 2026

What the Identity Layer Actually Is

The identity layer of the AI control plane is the mechanism by which a gateway resolves an incoming request to an authenticated principal (a specific user, with a specific role, operating under a specific authorization grant) rather than to an application credential. It sits beneath the gateway's routing and enforcement logic, as the foundation that supplies the context that makes enforcement meaningful. Without the identity layer, a gateway can apply policy to traffic. With it, a gateway can apply policy to people.

In practice, three things have to work together: an authentication handshake that ties the request to an identity provider the organization already trusts; a token or session context that carries user attributes through to the AI platform; and a logging mechanism that records not just what was called, but who called it, under what authorization, and when. All three have to be present. A gateway that logs requests without user attribution is doing the first and third without the second, which is the most common failure mode right now.
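As a minimal sketch of what the third piece has to capture once the first two are in place, the record below carries who, under what authorization, what, and when. The `Principal` fields, the `grant_id` reference, and the record layout are hypothetical names chosen for illustration, not any platform's schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Principal:
    user_id: str    # subject identifier asserted by the identity provider
    role: str       # role the user held when the call was made
    grant_id: str   # hypothetical reference to the authorization grant


def audit_record(principal: Principal, model: str, request_id: str,
                 when: Optional[datetime] = None) -> dict:
    """Build one attributed log entry: who, under what authorization, what, when."""
    when = when or datetime.now(timezone.utc)
    return {
        "who": principal.user_id,
        "role": principal.role,
        "authorization": principal.grant_id,
        "what": {"model": model, "request_id": request_id},
        "when": when.isoformat(),
    }


# Example values are invented for illustration.
record = audit_record(
    Principal(user_id="user-y", role="analyst", grant_id="grant-001"),
    model="example-model",
    request_id="req-0001",
    when=datetime(2026, 5, 5, 12, 0, tzinfo=timezone.utc),
)
```

A gateway that cannot populate `who`, `role`, and `authorization` is in the failure mode described above: it is logging traffic, not people.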

SSO Integration: The Handshake That Makes Attribution Possible

Enterprise AI platforms have begun implementing SAML and OIDC federation, which means they can participate in the same SSO architecture that governs access to your agency's SaaS estate. When a user authenticates through the organization's identity provider and then accesses an AI platform, the platform receives an assertion or ID token that carries the user's identity and group memberships. The gateway can then attach that identity context to every subsequent API call the user makes within that session.

The practical implication: instead of "application key X called the model 847 times today," you get "user Y, in role Z, called the model 847 times today, with these prompts, returning these outputs." That's the difference between a usage report and an audit trail.
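One way a gateway might attach that identity context to each call is sketched below. It assumes the ID token has already been validated elsewhere (signature, issuer, audience, expiry); the `groups` claim is a common IdP configuration rather than a standard OIDC claim, and the `X-Principal-*` header names are hypothetical.

```python
from typing import Dict, Mapping


def identity_headers(claims: Mapping[str, object]) -> Dict[str, str]:
    """Turn already-validated ID token claims into per-request identity headers.

    Assumption: token validation happened upstream. This function only
    shapes the context that travels with each model call.
    """
    subject = claims.get("sub")
    if not subject:
        # Without a subject there is nobody to attribute the call to.
        raise ValueError("no 'sub' claim; refusing to forward an unattributed call")
    groups = claims.get("groups") or []
    return {
        "X-Principal-Sub": str(subject),
        "X-Principal-Email": str(claims.get("email", "")),
        "X-Principal-Groups": ",".join(sorted(str(g) for g in groups)),
    }


# Example claims are invented for illustration.
headers = identity_headers({
    "sub": "00u1abcd",
    "email": "user.y@agency.example",
    "groups": ["analysts", "ai-users"],
})
```

Refusing to forward a call with no subject is a design choice in this sketch: it keeps unattributed requests out of the log rather than recording them under a placeholder.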

Not every AI platform has implemented this yet. As of early 2026, the major enterprise tiers (OpenAI's enterprise offering, Anthropic's Claude for Enterprise, Google's Gemini for Workspace) support SAML federation with varying degrees of completeness. Microsoft's Copilot ecosystem runs through Entra ID, which is already the identity provider for M365 environments; if your agency is on M365, the federation architecture is already there. What varies is how much of the user's identity context survives into the AI session and whether the platform exposes it to gateway-level policy enforcement. Verify current federation support for each named platform against its enterprise documentation before relying on it.

Per-User vs. Per-Application Keys: The Architectural Difference That Matters

The simplest way to call an AI API is to issue an application-level key, embed it in your integration, and let every user of that integration share it. This is how most AI deployments start, and it's how a surprising number of them stay. One key, many users, no attribution. The audit log shows the application; the person is invisible.

Per-user key models — or more precisely, user-context token models where the authenticated user's identity is carried through the request — change the architecture fundamentally. Each request is traceable to a specific principal. Scope can be bounded per user rather than per application. Revocation is surgical: you can cut off a specific user's access without rotating the key that every other integration depends on.
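The revocation point can be made concrete with a small sketch. The in-memory `UserTokenRegistry` below is a hypothetical stand-in for whatever token service a gateway actually uses; it only illustrates that tokens bound to a principal can be revoked for one user without touching anyone else's.

```python
import secrets
from typing import Dict, Optional


class UserTokenRegistry:
    """Hypothetical in-memory registry mapping each token to one user."""

    def __init__(self) -> None:
        self._owner_by_token: Dict[str, str] = {}

    def issue(self, user_id: str) -> str:
        # Each token is bound to exactly one principal at issue time.
        token = secrets.token_urlsafe(32)
        self._owner_by_token[token] = user_id
        return token

    def resolve(self, token: str) -> Optional[str]:
        # Returns the owning user, or None if the token is unknown or revoked.
        return self._owner_by_token.get(token)

    def revoke_user(self, user_id: str) -> int:
        # Remove only this user's tokens; every other user's tokens survive.
        doomed = [t for t, owner in self._owner_by_token.items() if owner == user_id]
        for token in doomed:
            del self._owner_by_token[token]
        return len(doomed)


registry = UserTokenRegistry()
alice_token = registry.issue("alice")
bob_token = registry.issue("bob")
revoked = registry.revoke_user("alice")  # bob's token is unaffected
```

With a single shared application key, the equivalent of `revoke_user` is rotating the key for every user and integration that depends on it.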

The fourteen-teams-on-one-API-key scenario is not hypothetical. It's the default outcome when AI tooling gets adopted bottom-up, which is how most of it arrives in federal civilian agencies. The governance problem surfaces when someone leaves the organization, or when an incident requires you to determine exactly which requests a specific user made. At that point, the application-level key architecture has no answer to give you.

Moving from per-application to per-user models depends on SSO integration. You can't attribute requests to users if users aren't authenticated through a channel the gateway can see. That dependency is why the identity layer sits beneath gateway architecture, not alongside it.

RBAC and Workspace Patterns: Where the IAM Analogy Holds

AI platforms are beginning to implement access control structures that look, from a distance, like the cloud IAM patterns CIOs already manage. Workspaces function as isolation boundaries — different teams, different data access, different model configurations, scoped to a defined membership. Role assignments within workspaces control who can create integrations, who can view usage data, who can modify system prompts. Some platforms have begun exposing these roles through SCIM-compatible provisioning endpoints, which means lifecycle management can be automated rather than manual.

The pattern rhymes with how agencies manage Microsoft Entra ID (formerly Azure AD) groups or AWS IAM roles. A CIO who has spent three years rationalizing cloud IAM boundaries will recognize the shape of it immediately.
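For a sense of what automated lifecycle management looks like where a platform does expose SCIM, here is a sketch of the deprovisioning request an IdP would send when someone leaves. The PatchOp schema URN and the `/Users/{id}` resource path come from the SCIM 2.0 protocol (RFC 7644); the base URL and user ID are invented, and whether a given AI platform accepts this exact operation is an assumption to verify against its documentation.

```python
import json

# Message schema URN defined by SCIM 2.0 (RFC 7644) for PATCH requests.
SCIM_PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def scim_deactivate_request(base_url: str, scim_user_id: str) -> dict:
    """Describe (but do not send) a SCIM PATCH that marks a user inactive."""
    body = {
        "schemas": [SCIM_PATCH_SCHEMA],
        "Operations": [{"op": "replace", "path": "active", "value": False}],
    }
    return {
        "method": "PATCH",
        "url": f"{base_url.rstrip('/')}/Users/{scim_user_id}",
        "headers": {"Content-Type": "application/scim+json"},
        "body": json.dumps(body),
    }


# Hypothetical endpoint and user ID, for illustration only.
request = scim_deactivate_request("https://ai-platform.example/scim/v2/", "2819c223")
```

The sketch stops at building the request on purpose: authentication to the SCIM endpoint and error handling vary by platform.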

Here's where the analogy breaks. Cloud IAM roles govern what a principal can do — which APIs they can call, which resources they can read or write. AI workspace roles govern what a principal can configure, but they don't yet govern what the model can return. A user with read-only workspace access can still receive outputs that contain sensitive information if the model has access to it. The authorization boundary in cloud IAM is enforced at the resource level; in AI platforms, it's enforced at the configuration level, and the model's output sits largely outside that perimeter. This follows from how LLMs work, not from any gap in platform ambition.

A CIO who asks "can I manage AI workspace access the same way I manage my cloud IAM?" deserves a precise answer: you can manage who gets into the workspace the same way. What happens inside the workspace is governed by different mechanisms — data access controls, system prompt design, output filtering — that don't yet map cleanly to the IAM patterns they know.
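The split can be illustrated with two deliberately separate functions. Everything here is hypothetical: the role names, the permission sets, and the single redaction pattern are placeholders, and real output filtering is far more involved than one regular expression. The point is only that the role check never sees the model's output.

```python
import re

# Hypothetical workspace roles: these govern configuration actions only.
ROLE_PERMISSIONS = {
    "admin": {"create_integration", "view_usage", "edit_system_prompt"},
    "viewer": {"view_usage"},
}

# Illustrative pattern for US Social Security numbers in NNN-NN-NNNN form.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def can_configure(role: str, action: str) -> bool:
    """IAM-style check: may this role perform this configuration action?"""
    return action in ROLE_PERMISSIONS.get(role, set())


def filter_output(text: str) -> str:
    """Separate mechanism: redact sensitive patterns from what the model returns."""
    return SSN_PATTERN.sub("[REDACTED]", text)


model_output = "Applicant SSN 123-45-6789 is on file."
viewer_may_edit = can_configure("viewer", "edit_system_prompt")  # role check
delivered = filter_output(model_output)                          # output control
```

A viewer is correctly denied the configuration action, yet without the second function the same viewer would receive the unredacted text. That is the gap the IAM analogy does not cover.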

Okta Concept Mapping

The IDAM anchor: Okta's Workforce Identity Cloud supports OIDC and SAML federation to enterprise SaaS applications, including AI platforms that have implemented those protocols. Where an AI platform supports either protocol, Okta can serve as the identity provider, carrying group memberships and user attributes into the platform's access control layer. The Okta Integration Network (OIN) includes published connectors for several enterprise AI platforms; check the OIN for current availability before citing specifics in a customer conversation.

Where the analogy holds: The SSO handshake works the same way it works for any federated SaaS application. Okta authenticates the user, issues the assertion, the platform receives it, the session is established with user context intact. The governance model the CIO already understands applies to that handshake.

Where it breaks: Okta's governance perimeter ends at the session boundary. Once the user is authenticated into the AI platform, what they do inside that session (which models they invoke, what data the model accesses, what the model returns) is not within Okta's current visibility or enforcement scope. The identity layer Okta provides is necessary but not sufficient for AI governance. Saying otherwise in a customer conversation will cost you credibility with any CIO who has read the fine print.

When a CIO asks about AI attribution, they're usually asking two questions at once: can we prove who used the system, and can we prove what the system did for them. SSO integration answers the first. The second requires controls that sit inside the AI platform itself. Knowing the difference, and being able to name it precisely, is what separates a governance conversation from a sales pitch.