The Gateway Knows Your App. Does It Know Your User?

By Carey Whitten, May 5, 2026

What the Identity Layer Actually Is

The identity layer of the AI control plane is the set of mechanisms that authenticate callers to AI services and attach that authentication to a record that governance can act on. It operates at two levels: the application level, where a credential identifies which system is calling, and the user level, where a federated identity identifies which human initiated the request. Most enterprise AI platforms support both. Most enterprise deployments default to the application level and stop there.

Authorization and auditability are downstream of authentication. You can't enforce least privilege on a principal you haven't identified. You can't produce an audit trail that names a user if the request carried only an application credential. The gateway can route, log, and rate-limit all day. None of that resolves the question of who.

How the Two Models Work

Per-application key model. A developer generates an API key from the AI platform's console. The key is associated with a project, a workspace, or an organization account — depending on the platform's terminology. Every request from that application carries the key as a bearer credential, typically in an Authorization header. The platform authenticates the key, applies whatever rate limits or model access policies are attached to it, and logs the request against the application identity.

The result is a log that tells you GPT-4o was called 14,000 times yesterday by your procurement application. Which of the 47 users of that application made which calls is not in the record. The application is the principal. The human is invisible.
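To make the gap concrete, here is a minimal sketch of the log entry a gateway can build when the request carries only an application key. The `KEY_OWNERS` table, the `AuditRecord` fields, and the `audit_record` helper are all hypothetical names for illustration, not any platform's schema.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup from an API key to the application it was issued to.
KEY_OWNERS = {"sk-procurement-123": "procurement-app"}


@dataclass(frozen=True)
class AuditRecord:
    application: str
    model: str
    user: Optional[str]  # stays None when only an application key is presented


def audit_record(headers: dict, model: str) -> AuditRecord:
    """Build a log entry from the only identity the request carries."""
    scheme, _, key = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or key not in KEY_OWNERS:
        raise PermissionError("unknown or missing API key")
    # The key resolves to an application; nothing in the request names a human.
    return AuditRecord(application=KEY_OWNERS[key], model=model, user=None)


record = audit_record({"Authorization": "Bearer sk-procurement-123"}, "gpt-4o")
print(record)
```

However the record is shaped, the `user` field has nothing to draw from: the key is the only identity on the wire.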

Per-user attribution via SSO. The AI platform is configured as a SAML service provider or an OIDC relying party. Users authenticate through your identity provider: Okta, Entra, whatever is federating your workforce. The platform issues a session token bound to the user's identity. API calls made within that session carry the user's identity in the request context, and the platform's audit log records it.
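As a rough sketch of the federated case, the function below turns ID-token claims into an audit entry. It assumes the token's signature and expiry have already been validated by an OIDC library, which is not shown. `iss`, `sub`, and `sid` are standard OIDC claim names; the `attribute_request` helper and the entry's field names are hypothetical.

```python
from datetime import datetime, timezone


def attribute_request(claims: dict, model: str) -> dict:
    """Turn already-validated ID-token claims into an audit entry."""
    # 'iss' and 'sub' together identify the user; refuse to log without them.
    missing = [name for name in ("iss", "sub") if name not in claims]
    if missing:
        raise ValueError(f"token is missing required claims: {missing}")
    return {
        "user": claims["sub"],
        "issuer": claims["iss"],
        "session": claims.get("sid"),  # optional; present only if the IdP sends it
        "model": model,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical claims, as they might look after signature and expiry checks.
claims = {"iss": "https://idp.example.com", "sub": "00u1abcd", "sid": "sess-42"}
entry = attribute_request(claims, "gpt-4o")
print(entry["user"], entry["session"])
```

The difference from the application-key case is the input: the request context now contains a subject that the log can name.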

Azure OpenAI Service does this natively through Entra RBAC. You assign the Cognitive Services OpenAI User role to individuals or Entra groups. The model call carries the caller's Entra identity. The audit trail names the user. This is the closest thing to true per-user attribution that a major AI platform has shipped at general availability.

Workspace and RBAC as the organizational layer. Both OpenAI and Anthropic have introduced workspace or project constructs that scope API keys to subsets of models and usage limits. These resemble cloud IAM resource groups: a logical boundary that contains resources and policies. RBAC within the workspace controls who can administer it, which models are accessible, and what the spend limits are.
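A toy model of that container, assuming only the three controls named above (administrators, accessible models, spend limit). The `Workspace` class and its fields are hypothetical and do not mirror any vendor's API.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    name: str
    allowed_models: set
    monthly_spend_limit_usd: float
    admins: set = field(default_factory=set)
    spent_usd: float = 0.0

    def can_call(self, model: str, estimated_cost_usd: float) -> bool:
        """Allow a call only for an in-scope model and within the spend cap."""
        within_budget = self.spent_usd + estimated_cost_usd <= self.monthly_spend_limit_usd
        return model in self.allowed_models and within_budget

    def can_administer(self, principal: str) -> bool:
        """Only listed administrators may change the workspace."""
        return principal in self.admins


ws = Workspace("procurement", {"model-a"}, monthly_spend_limit_usd=500.0, admins={"alice"})
print(ws.can_call("model-a", 10.0), ws.can_call("model-b", 10.0))
```

Note what the checks take as input: a model name, a cost, an administrator. None of them takes the end user who initiated the request.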

Okta Concept Mapping

The per-application API key maps cleanly to the OAuth client credentials grant — a non-human principal authenticating with a client ID and secret to obtain an access token. Your team has managed this pattern for years in cloud IAM: service accounts, client registrations, secret rotation policies. The analogy holds for the authentication mechanism itself.
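For readers who want the grant in front of them, here is a minimal sketch that builds, but does not send, a client credentials token request as described in RFC 6749 section 4.4. The token URL, client ID, secret, and scope are placeholders, and the sketch assumes credentials that contain no characters needing escaping.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(token_url: str, client_id: str, client_secret: str, scope: str) -> Request:
    """Build (but do not send) an OAuth 2.0 client credentials token request."""
    # The body is form-encoded and names the grant type; no user is involved.
    body = urlencode({"grant_type": "client_credentials", "scope": scope}).encode()
    # The client authenticates with HTTP Basic, one of the methods the RFC allows.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Placeholder values for illustration only.
req = build_token_request(
    "https://auth.example.com/oauth2/token", "my-client", "s3cret", "ai.invoke"
)
print(req.get_method(), req.full_url)
```

Everything in the request identifies the client. That is the same shape as an AI platform API key: a secret that stands for a system, not a person.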

Where it breaks: cloud IAM service accounts typically live in your directory, get provisioned through SCIM or IGA workflows, and can be governed by PAM tooling with automated rotation and access reviews. AI platform API keys are usually issued out-of-band through a web console, stored wherever the developer puts them, and have no lifecycle management unless you build it. The credential exists outside your identity governance perimeter. Okta's API Access Management can issue scoped OAuth tokens to AI gateway endpoints for platforms that accept them, which keeps the credential inside your governed token infrastructure — but this requires the AI platform to accept bearer tokens from an external authorization server, and not all of them do.
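If you do have to build that lifecycle management yourself, the starting point is usually an inventory and a review rule. The sketch below assumes a hand-maintained inventory of keys and a 90-day rotation policy; both the inventory format and the policy are assumptions, not platform defaults.

```python
from datetime import date, timedelta

MAX_KEY_AGE = timedelta(days=90)  # assumed rotation policy, not a platform default


def keys_needing_review(inventory: list, today: date) -> list:
    """Return key IDs that are past the rotation window or have no named owner."""
    flagged = []
    for key in inventory:
        too_old = today - key["created"] > MAX_KEY_AGE
        unowned = not key.get("owner")
        if too_old or unowned:
            flagged.append(key["id"])
    return flagged


# Hypothetical inventory entries.
inventory = [
    {"id": "key-1", "owner": "procurement-team", "created": date(2026, 4, 1)},
    {"id": "key-2", "owner": "procurement-team", "created": date(2025, 11, 1)},
    {"id": "key-3", "owner": None, "created": date(2026, 4, 20)},
]
flagged = keys_needing_review(inventory, today=date(2026, 5, 5))
print(flagged)
```

This is the kind of check an IGA or PAM tool would run for a directory-managed service account. For console-issued keys, someone has to write it and keep the inventory honest.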

The Conversation You're About to Have

Federal agencies procuring AI platforms are starting to ask the per-user attribution question in RFIs. The framing varies: "audit trail," "user-level logging," "identity propagation." The underlying requirement is consistent: if a model produces a harmful or incorrect output, the agency needs to know who asked for it.

When a CAIO raises this in a discovery call, the answer depends entirely on which deployment model the agency chose. Per-application keys: the audit trail stops at the application boundary, and correlation to a human requires application-level logging that may or may not exist. SSO-federated per-user attribution: the audit trail names the user, the session, and the timestamp. This gets set at onboarding. There's no configuration toggle that changes it retroactively.
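The correlation step in the per-application case can be sketched as a join between the gateway's log and the application's own log. This assumes the application records a user against a request ID that the gateway also sees, which is exactly the logging that may or may not exist. The field names and the `attribute_calls` helper are hypothetical.

```python
def attribute_calls(gateway_log: list, app_log: list) -> list:
    """Join gateway entries to application entries on a shared request ID."""
    users_by_request = {row["request_id"]: row["user"] for row in app_log}
    return [
        # Entries the application never logged stay unattributed (user is None).
        {**entry, "user": users_by_request.get(entry["request_id"])}
        for entry in gateway_log
    ]


gateway_log = [
    {"request_id": "r1", "application": "procurement-app", "model": "gpt-4o"},
    {"request_id": "r2", "application": "procurement-app", "model": "gpt-4o"},
]
app_log = [{"request_id": "r1", "user": "jdoe"}]  # r2 was never logged by the app

attributed = attribute_calls(gateway_log, app_log)
print(attributed)
```

Any request the application failed to log comes back with no user, and nothing on the gateway side can fill it in afterward.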

The workspace and RBAC patterns that AI platforms are building do mirror cloud IAM in structure. But the scope of what they govern is narrower than it looks. Cloud IAM RBAC controls what resources a principal can access. AI platform RBAC typically controls which models a workspace can call and who can administer the workspace. What the model generates, and what a downstream agent does with it, sits outside that perimeter. The authorization boundary is drawn around access to the service, not around what the service produces. That's a meaningful difference when the capability is generating outputs that affect real decisions.

The workspace is a container. Buyers who reach for the phrase "trust boundary" mean something with considerably more reach.

Know that distinction before Tuesday.