The Accountability Inversion

In 1973, banks could already move money across borders. They'd been doing it for centuries. What they couldn't do was delegate payment instructions at scale, because telex messaging had no standardized way to verify who authorized what, in what format, to which institution. SWIFT didn't introduce new financial capability when it went live in 1977. It introduced a shared record layer — message validation, audit trails, standardized formats — that made delegation trustworthy enough to become institutional. The capability preceded the infrastructure by decades. Scale waited for accountability to catch up.

Something structurally similar is playing out with AI agents right now, and it's producing a pattern that looks counterintuitive until you sit with it.

McKinsey's 2025 survey found fewer than 10% of organizations scaling agents in any individual function. The functions breaking through tend to be compliance monitoring, clinical documentation, and financial back-office operations. Heavily regulated, accountability-dense environments. The places you'd expect to move slowest are moving first.

The prevailing assumption runs the other direction: better models unlock broader autonomy, the most technically adventurous domains lead, capability drives adoption. Watch what actually blocks an agent from reaching production, though. Identity gaps, authorization failures, auditability breakdowns. Production deployment requires confirming who the entity is, what it's permitted to do, and whether anyone can later inspect what happened. These are accountability requirements, orthogonal to how smart the model is.

A compliance workflow already has defined authority, audit expectations, escalation paths, and record-keeping obligations. These exist because regulators and counterparties demanded them long before agents arrived. Adding an autonomous actor into that environment is genuinely hard, but the organizational scaffolding has pre-existing attachment points where accountability can take hold. A marketing team experimenting with an agent to manage campaign assets across six platforms has no comparable rails. The agent might be more capable in the second scenario. It remains far less deployable.

Organizations that launched agent pilots in 2025 without audit trail architecture are now discovering they need to rebuild permission and logging systems before they can scale. The work itself is within reach. What's missing is the ability to prove the work was legitimately done. Enterprise procurement committees require complete, queryable records of every agent action. Deployments that cannot produce this documentation cannot pass security review. Capable agents sit idle while the accountability layer gets built underneath them.
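To make "complete, queryable records of every agent action" concrete, here is a minimal sketch of what such a record layer might look like. Everything in it is an assumption made for illustration: the `AuditLog` class, the `agent_actions` table, its columns, and the sample agent and user names are hypothetical, not drawn from any particular product or review standard.

```python
import json
import sqlite3
from datetime import datetime, timezone


class AuditLog:
    """Hypothetical append-only record of agent actions, kept in SQLite so it can be queried."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            """CREATE TABLE IF NOT EXISTS agent_actions (
                   id INTEGER PRIMARY KEY AUTOINCREMENT,
                   recorded_at TEXT NOT NULL,
                   agent_id TEXT NOT NULL,
                   authorized_by TEXT NOT NULL,
                   action TEXT NOT NULL,
                   details TEXT NOT NULL
               )"""
        )

    def record(self, agent_id, authorized_by, action, details):
        """Append one action. This class exposes no update or delete method."""
        self.db.execute(
            "INSERT INTO agent_actions "
            "(recorded_at, agent_id, authorized_by, action, details) "
            "VALUES (?, ?, ?, ?, ?)",
            (
                datetime.now(timezone.utc).isoformat(),
                agent_id,
                authorized_by,
                action,
                json.dumps(details, sort_keys=True),
            ),
        )
        self.db.commit()

    def actions_by_agent(self, agent_id):
        """Return (action, authorized_by, details) for one agent, oldest first."""
        rows = self.db.execute(
            "SELECT action, authorized_by, details FROM agent_actions "
            "WHERE agent_id = ? ORDER BY id",
            (agent_id,),
        ).fetchall()
        return [(action, who, json.loads(details)) for action, who, details in rows]


# Illustrative data only: agent ids, approvers, and actions are made up.
log = AuditLog()
log.record("agent-7", "alice@example.com", "flag_transaction", {"txn": "T-1001"})
log.record("agent-7", "alice@example.com", "file_report", {"report": "R-12"})
log.record("agent-9", "bob@example.com", "flag_transaction", {"txn": "T-2002"})
history = log.actions_by_agent("agent-7")
```

The code is trivial, which is the point. The hard part is deciding before the pilot that every action passes through something like `record`, so that a reviewer's question becomes a query instead of a reconstruction project.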

Agency law has been working through this for centuries. Delegated authority rests on three things: scope definition, record-keeping, and dispute resolution. Without all three, delegation stays informal, personal, dependent on trust between specific people. With them, it becomes institutional. It scales. SWIFT solved this for interbank messaging. Payment networks are solving it now for agentic commerce, building agent-specific identity tokens and session-scoped authorization before letting software transact on a cardholder's behalf.
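The three elements of delegated authority map fairly directly onto data structures. The sketch below is one hypothetical way to express them, assuming a per-action spend limit and a fixed expiry. The names `Grant`, `Ledger`, `attempt`, and `evidence_for` are invented for illustration and do not describe how any payment network actually encodes its tokens.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """Scope definition: who delegated what to which agent, and until when."""
    grant_id: str
    principal: str
    agent_id: str
    allowed_actions: frozenset
    spend_limit: float  # assumed to apply per action, not cumulatively
    expires_at: datetime


@dataclass
class Ledger:
    """Record-keeping: every attempt is kept, whether allowed or refused."""
    entries: list = field(default_factory=list)

    def attempt(self, grant, agent_id, action, amount, now):
        """Check an action against the grant's scope and log the decision."""
        allowed = (
            agent_id == grant.agent_id
            and action in grant.allowed_actions
            and amount <= grant.spend_limit
            and now < grant.expires_at
        )
        self.entries.append({
            "grant_id": grant.grant_id,
            "principal": grant.principal,
            "agent_id": agent_id,
            "action": action,
            "amount": amount,
            "at": now.isoformat(),
            "allowed": allowed,
        })
        return allowed

    def evidence_for(self, grant_id):
        """Dispute resolution: everything attempted under one grant."""
        return [e for e in self.entries if e["grant_id"] == grant_id]


# Illustrative session: a 30-minute grant allowing purchases up to 50.0 each.
start = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
grant = Grant("g-1", "cardholder-42", "agent-7", frozenset({"purchase"}),
              50.0, start + timedelta(minutes=30))
ledger = Ledger()
ok = ledger.attempt(grant, "agent-7", "purchase", 20.0, start + timedelta(minutes=5))
over_limit = ledger.attempt(grant, "agent-7", "purchase", 80.0, start + timedelta(minutes=6))
expired = ledger.attempt(grant, "agent-7", "purchase", 20.0, start + timedelta(hours=1))
evidence = ledger.evidence_for("g-1")
```

Refused attempts are logged alongside allowed ones. In a dispute, what the agent tried to do outside its scope can matter as much as what it was permitted to do.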

The agents are getting capable enough to act. Whether they reach production depends on the environments they act within, and whether those environments can answer a simple set of questions afterward: who authorized this, within what limits, and where's the evidence?

Things to follow up on...

OWASP's agentic risk taxonomy: The OWASP Top 10 for Agentic Applications, published in December 2025, is the first peer-reviewed framework to separate agent-specific security risks from general LLM vulnerabilities, with identity and privilege abuse among the top concerns.
Mastercard's delegation infrastructure: Mastercard's Agent Pay program encodes an agent identifier and session-scope object into every transaction token, making it possible to attribute disputes to a specific software agent rather than just the cardholder.
Authenticated delegation research: A January 2025 MIT/Oxford/Harvard Law preprint proposes extending OAuth 2.0 with agent-specific credentials and argues that natural language prompts are an incomplete tool for scoping, permissions, and security in production contexts.
The pilot-to-scale blocker: A mid-2026 enterprise deployment report found that organizations which launched agent pilots in 2025 without audit infrastructure are now rebuilding permission and logging architecture before they can pass enterprise security review.