The Stack You Build After Something Goes Wrong

By Carey Whitten, May 5, 2026

Between an employee and a frontier model, six operational layers need to exist. Most organisations build them in the wrong order — after something breaks.

The sequence is structural, not negligent. The layers are not obvious until you need them, and you don't need them until you have traffic. You don't have traffic until you have a use case. You don't have a use case until someone runs a proof of concept. The proof of concept runs on a credit card and a single API key, and it works, and then fourteen teams are using the same key and nobody knows whose budget is paying for it.

That is where most enterprise AI deployments sit right now: somewhere between the credit card and the reckoning.

How the Reckoning Arrives

The reckoning, when it comes, tends to arrive as one of a small set of failure modes. Shadow spend — you receive a bill you cannot attribute to any cost center. PII exposure — you discover that the HR team's resume screener has been sending candidate data to a model with a 30-day retention policy, and your legal team would like to discuss that at their earliest convenience. An audit gap — something went wrong with a model-generated output and you cannot reconstruct what the model was asked, what context it had, or who approved the use case.

Each failure mode generates a layer. The layer gets scoped, procured, and deployed. Then the next failure mode arrives.

The six layers that a mature enterprise AI stack requires are: a gateway, an identity plane, a policy engine, a FinOps framework, a data governance layer, and an observability stack. Each one exists because some organisation learned it needed to exist.

The Six Layers, Named Once

The gateway is the first thing that gets added after the credit card phase, because it is the first thing that breaks. Without a gateway, you have no rate limiting, no routing control, no single point at which to enforce anything else. The gateway is where traffic management begins.
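
As a rough illustration of what "rate limiting at a single point" can mean, here is a minimal sketch of a per-caller token bucket. The `Gateway` and `TokenBucket` names, the capacity, and the refill rate are all assumptions made for the example, not features of any particular gateway product. The clock is passed in explicitly so the behaviour is easy to reason about.

```python
class TokenBucket:
    """Permits bursts of up to `capacity` requests, refilled at `refill_per_sec`."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity      # start full
        self.last_refill = 0.0      # timestamp of the previous call, in seconds

    def allow(self, now: float) -> bool:
        # Credit the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class Gateway:
    """Hypothetical gateway that keeps one bucket per caller."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.buckets: dict[str, TokenBucket] = {}

    def admit(self, caller: str, now: float) -> bool:
        # Create the caller's bucket on first sight, then ask it for a token.
        if caller not in self.buckets:
            self.buckets[caller] = TokenBucket(self.capacity, self.refill_per_sec)
        return self.buckets[caller].allow(now)
```

With a capacity of two and a refill of one token per second, a caller's third request in the same instant is refused while a different caller is unaffected. The limit only means something if `caller` is a real identity, which is why a shared key defeats it.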

Identity comes next, because once you have a gateway, you immediately discover that you cannot enforce anything without knowing who is calling. Shared API keys are not identities. They are the absence of identity. The identity plane is where you establish that the principal making a request — human or system — is a known, authenticated entity whose permissions can be evaluated.

Policy follows identity, because permissions require an engine to evaluate them. The policy engine is where you define what a given identity is allowed to do: which models, which data sources, which output types, under what conditions. Without policy, identity is just a label.
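
To make the relationship between identity and policy concrete, the following is a minimal default-deny sketch. The `Principal` and `Rule` shapes, the group names, and the model and data source labels are all hypothetical; a real policy engine would also evaluate output types and conditions, which are omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    subject: str               # an authenticated user or service, never a shared key
    groups: frozenset[str]     # group memberships asserted by the identity plane


@dataclass(frozen=True)
class Rule:
    group: str                 # the group this rule grants access to
    models: frozenset[str]     # models that group may call
    data_sources: frozenset[str]  # data sources that group may attach


def is_allowed(principal: Principal, model: str, data_source: str,
               rules: list[Rule]) -> bool:
    """Default-deny: permit only if one rule for one of the principal's
    groups covers both the requested model and the requested data source."""
    return any(
        rule.group in principal.groups
        and model in rule.models
        and data_source in rule.data_sources
        for rule in rules
    )
```

A principal with no group memberships matches no rule and is refused, which is the sense in which identity without policy is only a label and policy without identity has nothing to evaluate.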

FinOps, financial operations applied to AI spend, is where you establish cost attribution, chargeback, and budget governance. Zero Data Retention, or ZDR, is a contractual arrangement under which a model provider does not store your prompts and outputs once a request has been served; it is distinct from an agreement not to train on your data. FinOps is the layer that tracks whether you are actually paying for ZDR on the workloads that require it, and whether the teams consuming the most tokens have budget authority to do so.
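
Cost attribution can be sketched as a simple roll-up of usage records by cost center. Everything specific here is an assumption for illustration: the `UsageRecord` fields, the cost center codes, and especially the price table, which in practice would come from the provider contract.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

# Hypothetical prices in USD per 1,000 tokens; not real provider pricing.
PRICE_PER_1K_TOKENS = {
    "model-small": Decimal("0.50"),
    "model-large": Decimal("5.00"),
}


@dataclass(frozen=True)
class UsageRecord:
    cost_center: Optional[str]  # None means the call could not be attributed
    model: str
    tokens: int


def spend_by_cost_center(records: list[UsageRecord]) -> dict[str, Decimal]:
    """Sum spend per cost center; calls with no owner land in UNATTRIBUTED."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for record in records:
        cost = PRICE_PER_1K_TOKENS[record.model] * record.tokens / 1000
        totals[record.cost_center or "UNATTRIBUTED"] += cost
    return dict(totals)
```

The `UNATTRIBUTED` bucket is the shadow spend failure mode expressed as a number: whatever ends up there is a bill nobody has budget authority for.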

Data governance determines what data can reach the model at all. This is where Data Loss Prevention, or DLP, controls live: the rules that prevent a user from pasting a customer record into a prompt, or that strip personally identifiable information before a document reaches the model's context window. Data governance and policy overlap at the edges, but they operate on different objects: policy governs what an identity may do, while data governance governs what data may reach the model.
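
Stripping personally identifiable information before a prompt leaves your side of the boundary might look something like the sketch below. The two patterns (email addresses and US Social Security numbers) are illustrative assumptions only; production DLP tooling relies on far broader detection than a pair of regular expressions.

```python
import re

# Illustrative patterns only; real DLP coverage is much wider than this.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace each match with a placeholder such as [EMAIL] and
    return the redacted text plus a count of matches per type."""
    counts: dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        # subn returns the new string and the number of substitutions made.
        text, found = pattern.subn(f"[{label}]", text)
        counts[label] = found
    return text, counts
```

Returning the counts alongside the redacted text is a deliberate choice in this sketch: the counts can be logged without logging the sensitive values themselves.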

Observability is the last layer most organisations add and the first one they wish they had built earlier. It is the logging, tracing, and alerting infrastructure that lets you reconstruct any model interaction after the fact. When the audit gap failure mode arrives — and it will — observability is the only answer.
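
One way to picture the minimum an audit trail has to capture is a single structured record per model call. The field names below are assumptions chosen for the example, and the sketch stores hashes rather than raw text on the assumption that full prompts and responses are kept in a separate, access-controlled store keyed by `request_id`.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def audit_record(principal: str, use_case_id: str, model: str,
                 prompt: str, context_ids: list[str], response: str) -> str:
    """Return one JSON line describing who asked what, with which context,
    under which approved use case."""
    record = {
        "request_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "principal": principal,        # who made the call
        "use_case_id": use_case_id,    # which approved use case it ran under
        "model": model,
        "context_ids": context_ids,    # which documents were in the context
        "prompt_sha256": _sha256(prompt),
        "response_sha256": _sha256(response),
    }
    return json.dumps(record, sort_keys=True)
```

The three questions from the audit gap failure mode map directly onto fields: what the model was asked (`prompt_sha256`), what context it had (`context_ids`), and who approved the use case (`use_case_id`).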

IDAM Bridge — In identity, a trust boundary marks the edge of a domain you control: you authenticate principals, you define resources, you enforce policy at the crossing point. The closest AI equivalent is the boundary between your enterprise stack and the frontier model. It diverges here: in a traditional trust boundary, both sides are systems you can inspect and attest. The frontier model is a system you cannot inspect — you are trusting a vendor's published data handling terms, not a technical control you operate. That changes the nature of every enforcement decision you make on your side of the line.

What This Chapter Does

These six layers are not a checklist that a thoughtful CIO designs before the first API call. They are the architecture that emerges from six rounds of "we should have had this before that happened." The goal of this chapter is to compress that learning curve: to give you the mental model before the failure modes, rather than after.

Each subsequent lesson takes one layer and opens it up — what it does, how it gets implemented, what the common failure patterns look like at scale, and what a mature deployment looks like versus a reactive one. By the end of the chapter, the block diagram should be populated.

Right now, the names are enough.