The Plumbing
The Plumbing
The Model Doesn't Do Anything

When your buyer says "the AI reads production email," here's what's mechanically happening: a text-completion engine outputs a JSON object requesting the email, and a separate system picks up a credential the model never touches to actually retrieve it. The model proposes. Infrastructure disposes. Every agentic capability is an identity problem wearing an AI hat.
This section maps the components behind that execution layer — the harness, the tools, the credential surface — using Anthropic's four-part decomposition and the application ladder from single prompts through multi-agent systems. Your IDAM intuition is useful here, up to a point. The specific places where it breaks are the ones that matter.

The Model Doesn't Do Anything
When your buyer says "the AI reads production email," here's what's mechanically happening: a text-completion engine outputs a JSON object requesting the email, and a separate system picks up a credential the model never touches to actually retrieve it. The model proposes. Infrastructure disposes. Every agentic capability is an identity problem wearing an AI hat.
This section maps the components behind that execution layer — the harness, the tools, the credential surface — using Anthropic's four-part decomposition and the application ladder from single prompts through multi-agent systems. Your IDAM intuition is useful here, up to a point. The specific places where it breaks are the ones that matter.
When your buyer says "agentic AI," they almost always mean a model that calls tools. Two components do the work, and every identity question you'll face in that conversation depends on knowing which one does what. The model emits a structured JSON request — a tool name, arguments, nothing else — then stops. It executes nothing. It holds no credential. The harness, built by the buyer's engineering team, carries the credential, makes the actual API call, and deals with whatever comes back. This piece walks the single-turn loop phase by phase: who acts, who waits, where the failure modes land, and where the credential lives. Every "who authorized this action" question points to the harness side.

When an AI model "calls a tool," it emits structured text requesting an action. Your orchestration code does the rest. The two dominant formats for that request — native JSON function calling and XML tool calls — differ in portability, debuggability, and validation. They don't differ in what the model can do. If you've ever explained the relationship between SAML and OIDC, you already hold the right mental model. Your instinct that format choice carries security implications comes from identity, where it's well-earned. Here, the security lives entirely in the harness.

Connect five MCP servers to an agent and you've burned 55,000 tokens in tool definitions before anyone types a word. That's two-thirds of some models' working memory, gone, occupied by a catalog of things the agent might need. Anthropic's Skills take the opposite approach: load a one-line description per skill, pull full instructions only when the model decides they're relevant. This piece breaks down the token math, the accuracy data, and why production teams are using both as complementary layers.

Anthropic's Claude Code team built RAG, tested it against agentic search, and switched. That story is circulating in buyer conversations now, often without the context that makes it useful: the win was specific to codebases, the performance benchmark was self-described as "mostly vibes," and for large stable corpora the math reverses entirely. This piece profiles both retrieval strategies with parallel structure, maps them against three data environments your buyers actually operate in, and gives you ten rows of verbatim-usable field language. The practical 2026 answer depends on corpus shape: which approach fits which data, and where identity governance enters the architecture.

Your buyer says "we're building RAG" and within ninety seconds the conversation turns to retrieval. Three mechanisms own that space: keyword search, vector search, and hybrid. Each finds different things well. Each drops different things silently. Vector search smears contract numbers into semantic neighborhoods. Keyword search misses documents that use different vocabulary for the same concept. Hybrid covers both failure modes, with Microsoft's production benchmarks showing roughly 10% higher relevance scores. And underneath all three sits a document-level authorization gap that OWASP already named.

Model Context Protocol is the wire connecting tools to your buyer's AI stack. Over 16,000 MCP servers indexed, every major AI client supporting it, the Linux Foundation hosting it as foundational infrastructure. Adoption is real. What MCP actually governs is narrower than most people assume. The original spec explicitly excluded authentication and authorization. Four revisions in thirteen months have tightened the auth story considerably, but 53% of deployed servers still rely on static API keys. Your OAuth intuition maps cleanly onto parts of this architecture. It breaks in specific, consequential places. Knowing exactly where that break happens is worth five minutes.

Amazon's AI coding assistant retrieved superseded documentation from an internal wiki. No freshness signal, no version check. An engineer followed the advice. Result: 120,000 lost orders. The model reasoned correctly. The infrastructure failed. When a buyer says their agent "started hallucinating," three distinct infrastructure failures are usually hiding behind that word: context rot, instruction centrifugation, and stale-world reasoning. Each has a different root cause, a different fix, and a better model solves zero of them. The diagnostic question that earns you the room, "which layer failed?", is one you can ask Tuesday once you know what the answers mean.

Recap — From Black Box to Plumbing
Nine articles on the mechanical reality behind every "AI-powered" feature your buyers describe. This is the reference version: four layers, the terms that live in each, the sales moment where they surface, and the adjacent concept that causes confusion. Organized for the parking lot before a call, not for a weekend read. One insight holds the whole thing together. An AI agent is a loop: the model emits requests, the harness executes them, credentials flow through infrastructure you can audit. Your IDAM instincts are correct — they just attach to the harness and protocol layers, not the model. Vocabulary mapping tables at the end translate buyer language into identity questions you already know how to ask.
