Federal AI Pilots Keep Dying at the Authorization Boundary. Here's the Architectural Reason.

Federal AI pilots stall at the authorization boundary because they were built outside the identity and compliance envelope. Here's the discovery angle.

By Leigh Garrity, May 8, 2026

You already have an explanation for why federal AI pilots stall before production. Organizational inertia. Risk-averse leadership. Procurement friction. The general gravitational pull of the status quo. That explanation is real. It describes forces you've watched slow every modernization effort you've ever sold into.

It also stops at the surface. The deeper cause is architectural, and it maps directly to a gap your buyer probably hasn't named yet.

The shape of the stall

Brookings research shows nearly 60% of federal AI use cases remain in pilot or pre-deployment status. Read that carefully: it reflects agency-reported use cases, not independently verified deployment outcomes. What it tells you reliably is the shape of the problem. A majority of AI efforts that got funded, staffed, and started have not crossed into production. Something is killing them in the middle.

Speaking at Federal News Network's AI & Data Exchange 2026, SAP's Zamanzada called this the "pilot trap." The label is useful because it names the dynamic precisely: agencies can start pilots, but the path from pilot to production has a structural break in it. Conventional wisdom blames that break on politics or organizational friction. The architectural cause sits underneath those forces.

The pattern you already recognize

An agency team gets funding and top-cover to run an AI pilot. They stand up an environment: a cloud sandbox, a vendor trial instance, something a contractor built on a separate infrastructure stack. The pilot works. Leadership is impressed. Someone says "let's take this to production."

Then the pilot hits the authorization boundary.

No enterprise SSO integration. No SCIM-based provisioning tied to the agency's identity store. No audit logging that satisfies ATO requirements. The pilot was built as a standalone environment, outside the agency's identity and compliance envelope. Moving it inside that envelope isn't a configuration change. It's a re-architecture.
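
To make the provisioning gap concrete, here is a minimal sketch of the two SCIM 2.0 messages an agency identity store would typically push to a connected application: a user resource at onboarding, and a patch that deactivates the account when the person leaves. The schema URNs come from the SCIM 2.0 specifications (RFC 7643 and RFC 7644). The user values and the helper function names are hypothetical, and the authenticated HTTPS calls a real integration would make are not shown.

```python
import json

# Schema URNs defined by SCIM 2.0 (RFC 7643 / RFC 7644).
USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"
PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def build_scim_user(user_name: str, given: str, family: str, email: str) -> dict:
    """Body for POST /Users when the identity store provisions an account."""
    return {
        "schemas": [USER_SCHEMA],
        "userName": user_name,
        "name": {"givenName": given, "familyName": family},
        "emails": [{"value": email, "primary": True}],
        "active": True,
    }


def build_scim_deactivation() -> dict:
    """Body for PATCH /Users/{id} when the person leaves the agency."""
    return {
        "schemas": [PATCH_SCHEMA],
        "Operations": [{"op": "replace", "path": "active", "value": False}],
    }


if __name__ == "__main__":
    # Hypothetical user, for illustration only.
    user = build_scim_user("jdoe", "Jane", "Doe", "jane.doe@agency.example")
    print(json.dumps(user, indent=2))
    print(json.dumps(build_scim_deactivation(), indent=2))
```

A standalone pilot has nothing on the receiving end of either message, which is why accounts in it get created and removed by hand.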

You've seen this exact trajectory before. Every shadow IT tool that ever gained traction inside an agency before someone asked whether it had an ATO followed the same arc. A useful capability gets adopted outside the authorization boundary, builds momentum, and dies when the cost of bringing it inside becomes visible.

AI pilots are shadow IT with executive sponsorship. The sponsorship gets them started. The gate review is a different conversation entirely.

The gate review is the mechanism

HUD's M-25-21 compliance plan makes the mechanism concrete. High-impact AI components must pass a gate review prior to deployment. If a system doesn't meet minimum risk management requirements, it won't be authorized. HUD is also updating its ATO process to incorporate AI-specific considerations. The language is forward-looking, which tells you something important: existing ATO processes weren't built for AI, and agencies are retrofitting them now.

A pilot built without identity controls, without compliant audit logging, without the access governance infrastructure the ATO process requires, will be stopped at the gate. The use case may still have support. What fails is the system, which can't demonstrate that it meets the controls.

Alpha Omega, a federal IT consulting firm that works these implementations (and has a commercial interest in the "build it right from the start" approach, so weight accordingly), describes the failure mode from the practitioner side: security reviews stretch, momentum fades, the pilot stalls. Their diagnosis: architecture determines survivability. Their counter-example is instructive. They built an AI capability inside an existing enterprise application that already had identity controls, logging, and compliance enforcement inherited from the tenant. The pilot became the production system because there was no re-architecture required.

The architectural test

If the pilot inherits the compliance envelope from day one, the path to production is a policy decision. If it doesn't, the path to production is a construction project. In federal environments, where obtaining an ATO can take six to eighteen months, that construction project often outlasts the pilot's political window.
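
The test above can be written down as a small checklist. The sketch below is illustrative only: the `PilotEnvelope` record, its three fields, and the function names are hypothetical, chosen to mirror the SSO, SCIM, and audit logging gaps described earlier, and a real ATO package covers far more than three controls.

```python
from dataclasses import dataclass


@dataclass
class PilotEnvelope:
    """Hypothetical record of which envelope controls a pilot inherits."""
    sso_integrated: bool
    scim_provisioning: bool
    compliant_audit_logging: bool


def missing_controls(pilot: PilotEnvelope) -> list[str]:
    """Names of the envelope controls the pilot does not yet have."""
    checks = {
        "enterprise SSO": pilot.sso_integrated,
        "SCIM provisioning": pilot.scim_provisioning,
        "ATO-grade audit logging": pilot.compliant_audit_logging,
    }
    return [name for name, present in checks.items() if not present]


def path_to_production(pilot: PilotEnvelope) -> str:
    """Full inheritance means a policy decision; any gap means construction."""
    return "policy decision" if not missing_controls(pilot) else "construction project"


if __name__ == "__main__":
    sandbox = PilotEnvelope(False, False, False)  # standalone vendor trial
    in_tenant = PilotEnvelope(True, True, True)   # built inside an existing tenant
    print(path_to_production(sandbox), missing_controls(sandbox))
    print(path_to_production(in_tenant), missing_controls(in_tenant))
```

The point of returning the list of missing controls, not just the verdict, is that each entry is a separate piece of re-architecture the buyer has to schedule and fund.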

The shadow IT analogy, and where it runs out of road

Traditional shadow IT hits the ATO wall once. You bring the tool inside the boundary, it gets assessed, it gets authorized or it doesn't, and you move on. The assessment is a point-in-time snapshot of a relatively static system.

AI systems aren't static. Models get updated. Agents chain tools together across multiple systems, each with its own permissions model. Training data changes. The risk profile shifts over the system's lifecycle in ways a point-in-time ATO wasn't designed to capture.

The U.S. Chamber of Commerce made this argument directly to OSTP: traditional federal ATOs rely on static assessments incompatible with the dynamic nature of AI systems. Their recommendation: move from point-in-time compliance to continuous monitoring. (A policy advocacy document, not regulation. But the logic is sound, and the direction is consistent with what agencies are already doing.)

Continuous authorization requires continuous identity visibility. You can't continuously monitor access controls you never instrumented. A pilot built without SSO integration, without SCIM provisioning, without centralized audit logging doesn't just need those things bolted on for a one-time ATO. It needs them operating persistently, feeding a continuous compliance posture. The re-architecture isn't a one-time cost. It's an ongoing operational requirement, and the pilot was never designed to support it.
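
A rough way to picture the difference between a point-in-time assessment and continuous monitoring is a drift check against the authorized baseline. The sketch below is an assumption-laden illustration, not any agency's or vendor's method: the snapshot fields (`model_version`, `agent_tools`, `access_grants`) and their values are hypothetical, and real monitoring would pull this state from instrumented identity and logging systems.

```python
import hashlib
import json


def fingerprint(state: dict) -> str:
    """Stable hash of a system snapshot (dict key order does not matter)."""
    canonical = json.dumps(state, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drifted_fields(baseline: dict, current: dict) -> list[str]:
    """Top-level fields whose value differs from the authorized baseline."""
    keys = sorted(set(baseline) | set(current))
    return [k for k in keys if baseline.get(k) != current.get(k)]


# Hypothetical snapshot taken at the point-in-time assessment.
baseline = {
    "model_version": "v1",
    "agent_tools": ["search", "ticketing"],
    "access_grants": ["case-data:read"],
}
# Hypothetical state some months later: model updated, a tool added.
current = {
    "model_version": "v2",
    "agent_tools": ["search", "ticketing", "email"],
    "access_grants": ["case-data:read"],
}

if fingerprint(current) != fingerprint(baseline):
    print("Drift since authorization:", drifted_fields(baseline, current))
```

The check only works if the current state can be collected at all, which is the practical meaning of "you can't monitor what you never instrumented."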

The shadow IT analogy breaks right here. Bringing shadow IT inside the fence was a one-time construction project. AI systems require continuous proof of residency, because the system itself keeps changing shape.

DHS is already moving this direction

In June 2025, DHS prohibited commercial generative AI tools like ChatGPT and directed staff to use internal alternatives. CIO Antoine McCord instructed component technology offices to restrict access and retired earlier guidance that had conditionally permitted commercial tools. The motivations were likely multiple: data leakage, classification risk, supply chain trust. But the action is consistent with the architectural pattern. Tools outside the DHS identity and compliance envelope are no longer permitted. The internal alternative operates within governed channels.

DHS's broader AI strategy goes further. The agency is shifting to continuous authorization for AI, with use cases assessed and classified early in the lifecycle and monitored continuously. Ogletree's analysis of federal AI strategy plans is worth reading: their core point is that AI must operate within existing security and compliance boundaries, and contractors who build outside those boundaries will find their solutions stuck at the gate. (Ogletree is a labor and employment law firm with a government contracts practice. Their analysis is legally informed and practically grounded, though written for a contractor audience.)

One honest caveat: no published DHS or CISA guidance explicitly names real-time identity monitoring, SSO session tracking, or SCIM sync as specific requirements for AI systems under continuous authorization. The connection is logically direct. Continuous authorization requires continuous visibility into who and what has access. But the named-control language a seller would want to quote isn't in the public documents yet. Treat this as directional, not quotable.

What FedRAMP 20x leaves on the table

OpenAI received FedRAMP 20x Moderate authorization for ChatGPT Enterprise and API Platform in late April 2026. Significant milestone. But deployment remains "subject to each agency's policies and authorization decisions." FedRAMP authorization gives agencies reusable security evidence. Agency-level ATO is a separate gate entirely.

A pilot built on a FedRAMP-authorized platform still needs to demonstrate it meets the agency's identity, access, and audit requirements. Without those integrations baked in, FedRAMP doesn't save it from the re-architecture cost. Your buyer lives inside a two-layer authorization structure, and the layer they control is the one where pilots die.

Where Okta fits

Okta's Identity Governance is FedRAMP High authorized for Okta for Government High and eligible Moderate customers. The SSO, SCIM provisioning, and governance capabilities a pilot needs to inherit the compliance envelope are available at the authorization level federal agencies require.

Okta for AI Agents, generally available April 30, 2026, extends this to treat agents as first-class identities: discovery of unsanctioned agents, centralized audit logging, governance workflows, and an Agent Gateway that enforces least-privilege access across tool and API interactions. This matters for a specific reason: AI agent workflows chain multiple systems together, and without centralized identity control, audit visibility breaks across the chain.

One thing I can't confirm from public sources: whether Okta for AI Agents carries a separate FedRAMP listing or falls within the existing Okta for Government High authorization envelope. Your Okta contacts can clarify for specific accounts.

The discovery questions this gives you

The central question for your next agency conversation: Did your AI pilot go through your authorization process, or was it built as a standalone environment?

If standalone, the follow-ons are mechanical. What's the ATO timeline, and who's the authorizing official? Does the pilot have SSO integration with the agency IdP, or are users authenticating separately? Is provisioning SCIM-connected to HR, or manual? What audit logging exists, and does it feed the agency's SIEM? If agents are involved, who owns the agent's credentials, and what happens when a human owner leaves the agency?
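
The last of those questions, about agent credentials outliving their human owner, is the kind of check that becomes trivial once agents and people live in the same identity store. The sketch below assumes a hypothetical inventory of agent credentials and a set of currently active usernames; the `AgentCredential` record, the function name, and the sample data are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class AgentCredential:
    """Hypothetical inventory row: an agent credential and its human owner."""
    agent_id: str
    owner: str


def orphaned_agents(credentials: list[AgentCredential], active_users: set[str]) -> list[str]:
    """Agent IDs whose human owner is no longer active in the identity store."""
    return [c.agent_id for c in credentials if c.owner not in active_users]


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    inventory = [
        AgentCredential("triage-bot", "jdoe"),
        AgentCredential("summary-bot", "asmith"),
    ]
    active = {"asmith"}  # jdoe has left the agency
    print(orphaned_agents(inventory, active))
```

In a standalone pilot neither input exists in one place, so the honest answer to the discovery question is usually "nobody knows."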

If the pilot did go through authorization, the questions shift toward durability. Was it authorized under a traditional point-in-time ATO, or is it under continuous authorization? When the model gets updated or an agent's toolchain changes, does that trigger a new security review or is the monitoring infrastructure already capturing it? Can the agency's identity platform see what the AI system is accessing in real time, or are they relying on periodic access reviews?

Every one of these connects the buyer's AI ambition to the identity infrastructure that determines whether the pilot survives. The authorization boundary is where federal AI pilots actually die. Your buyer may not have named the cause yet.

Now you can name it for them.

Things to follow up on...

OMB's new CIO mandate: M-26-10 requires agency CIOs to submit contract data for all IT purchases, including AI, to OMB starting May 2026 — essentially FITARA enforcement applied to AI spend, which will surface pilots that bypassed CIO oversight.
85% missing risk documentation: Brookings found that more than 85% of high-impact deployed federal AI use cases in 2025 lack required information about risk mitigation measures, despite explicit OMB requirements — a governance gap that directly implies incomplete access control and audit documentation.
Shadow AI authentication data: The Verizon 2025 DBIR found that 72% of employees using generative AI at work authenticate with personal, non-corporate email accounts, with only 11% going through governed corporate channels — the same authentication bypass pattern that kills pilots at the ATO boundary.
FedRAMP 20x going default: The FedRAMP 20x program is projected to become the default authorization path for new cloud offerings starting Q3 2026, which will accelerate platform-level authorization but won't resolve the agency-level identity integration gap this piece describes.