Most AI conversations happening in your accounts right now are about consuming models: which model, orchestrated how, pointed at what data, governed by whom. The deployment pattern your buyer picks will shape their security posture, their governance burden, and their vendor relationships for years. The model they choose will get swapped out twice before the contract renews.
This section covers those deployment patterns and the trade-offs that surface six months in, when the pilot has to survive an Authority to Operate (ATO) review or scale past a single team.
The complexity ladder
Anthropic published a framework in December 2024 that has quietly become the most stable reference point in a field that doesn't produce many. The core principle is blunt: start with the simplest solution possible, add complexity only when a named problem demands it. Their exact words included the suggestion that this "might mean not building agentic systems at all," which is a remarkable thing for a company selling AI capabilities to put in writing.
The ladder runs like this:
Single LLM call. Prompt in, response out. No tools, no retrieval, no orchestration. This handles more real-world tasks than vendor demos would have you believe.
Augmented LLM. Same single model, but now it can pull in context through retrieval-augmented generation (RAG) or call external tools. This is what most production deployments actually are, even the ones with "agent" in the press release.
Workflow. Multiple LLM calls, orchestrated through predefined code paths. A human decided the sequence. The model does the reasoning at each step; the code decides what step comes next. When your buyer describes their AI initiative and it sounds like a flowchart, you're here.
Agent. The LLM itself decides what to do next, which tools to call, and when it's done. The code doesn't prescribe the path. The model navigates. This is what buyers mean when they say "autonomous," and it's where the governance conversation gets genuinely hard.
Each rung exists because the one below it broke somewhere specific. A single call can't answer questions about your internal documents, so you add retrieval. Retrieval can't handle a multi-step approval chain, so you build a workflow. A workflow can't adapt when the task changes shape mid-execution, so you hand the model control over its own next action.
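To make the rungs concrete, here is a minimal Python sketch of all four. It is an illustration under stated assumptions, not anyone's reference implementation: `Model` is a hypothetical stand-in for a vendor API call, and the function names, prompts, and the "TOOL argument" / "DONE: answer" reply convention are invented for this example. The thing to notice is where control sits. In `workflow`, the code fixes the sequence. In `agent`, the model's reply picks the next action and decides when to stop.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a model API call: prompt text in, reply text out.
Model = Callable[[str], str]


def single_call(model: Model, prompt: str) -> str:
    """Rung 1: prompt in, response out."""
    return model(prompt)


def augmented_call(model: Model, prompt: str,
                   retrieve: Callable[[str], List[str]]) -> str:
    """Rung 2: same single call, with retrieved context added to the prompt."""
    context = "\n".join(retrieve(prompt))
    return model(f"Context:\n{context}\n\nQuestion: {prompt}")


def workflow(model: Model, request: str) -> str:
    """Rung 3: several model calls, but the code decides the order."""
    summary = model(f"Summarize this request: {request}")
    return model(f"Answer APPROVE or ESCALATE for: {summary}")


def agent(model: Model, task: str,
          tools: Dict[str, Callable[[str], str]], max_steps: int = 5) -> str:
    """Rung 4: the model's reply chooses the next tool and when to stop.

    Assumed reply convention (invented for this sketch):
    "TOOLNAME argument" to call a tool, or "DONE: answer" to finish.
    """
    history = [task]
    for _ in range(max_steps):
        choice = model("\n".join(history))
        if choice.startswith("DONE:"):
            return choice[len("DONE:"):].strip()
        name, _, arg = choice.partition(" ")
        if name not in tools:
            # Feed the error back so the model can choose again.
            history.append(f"error: unknown tool {name}")
            continue
        history.append(f"{name} -> {tools[name](arg)}")
    return "stopped: step limit reached"
```

The `max_steps` cap is a design choice in this sketch rather than a requirement of the pattern. It is there because a loop the model controls needs some stopping condition the code controls.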
The ladder is a diagnostic tool. The question it answers: which rung does the buyer's problem actually live on? That matters because the market is systematically getting this wrong.
The mispricing
Menlo Ventures surveyed enterprise AI deployments in late 2025 and found that only 16% qualified as true agents. The rest were workflows, routing logic, or single model calls dressed up in agent vocabulary. Their summary was pointed: strip away the hype and most "AI agents" are basic if-then logic around a model call.
Gartner forecast in a June 2025 press release that over 40% of agentic AI projects will be canceled by end of 2027, citing escalating costs, unclear business value, and inadequate risk controls. They identified a pattern they called "agent washing," where vendors rebrand chatbots and RPA tools as agentic without delivering actual agentic capability. Of thousands of vendors claiming agentic solutions, Gartner estimated roughly 130 actually offered genuine agentic features. (Note: the 40% figure is a forecast based partly on a webinar poll, not a measured cancellation rate. The qualitative finding on misapplication is independently supported.)
That's the mispricing. Organizations buy complexity they don't need, label workflows as agents because the market rewards the word, then discover that the governance overhead of a genuine agent has landed on a system that never needed any of it.
The word "agent" hasn't finished settling
Your buyers will use the word freely and mean different things by it. Anthropic's distinction between workflows (predefined code paths) and agents (LLM-directed dynamic processes) is the cleanest available framework. It is not settled law.
Simon Willison observed in late 2024:
"Most of the people who use it seem to assume that everyone else shares and understands the definition that they have chosen to use."
That was eighteen months ago. The situation has improved only slightly. Anthropic itself signaled the pattern's maturation in January 2026 by renaming its Claude Code SDK to the Claude Agent SDK, reflecting that agentic patterns had generalized beyond coding use cases. Google, OpenAI, and Microsoft all shipped their own agent frameworks in 2025. None displaced Anthropic's conceptual vocabulary. The implementation tools multiplied; the definitions stayed contested at the margins.
When a buyer says "we're evaluating agents," your first move is figuring out what they mean. Often they mean a workflow. Sometimes they mean a chatbot with a tool. Occasionally they mean the real thing. The diagnostic question: who decides what happens next, the code or the model?
Where this lands in your accounts
For public sector specifically, each step up the complexity ladder carries a heavier governance burden. A RAG system requires data classification before ingestion. A workflow has a bounded action surface you can document for an ATO package. An agent introduces dynamic tool selection that is genuinely harder to pre-authorize. Federal CIO Gregory Barbaccia described the CAIO Council's emerging framework in almost exactly these terms: the frontier model you choose, the orchestration of how you deploy it, and the data you deploy it against. Three variables, evaluated together.
The pattern choice carries procurement and compliance weight equal to its technical weight. GSA now requires CAIO coordination on any statement of work involving AI. Agencies that piloted AI outside sanctioned SSO, approved data environments, and governance processes are discovering that "pilot failed" is usually a control-plane story: missing SSO integration, ungoverned data access, no audit trail. Your buyer knows this friction even if they haven't framed it in these words yet. When they hesitate on agents, they're often doing the ATO math in their heads.
What this section covers
The pieces that follow walk through each deployment pattern: what it does, what breaks without it, what identity questions it raises, and what you need to hold in a buyer conversation. RAG. Workflows. Tool use. Agents. Multi-agent coordination. Evaluation and observability.
We'll also cover the protocols emerging to connect these systems: Anthropic's Model Context Protocol, Google's Agent-to-Agent protocol, and the open standards forming around them. These protocols amount to federation for AI systems: they are where trust boundaries between models and tools get negotiated, and where identity decisions get made. Or don't.
In identity, OAuth scopes constrain what a client application can request. Agent tool permissions work similarly: they bound what actions the system is allowed to take. The analogy stops bearing weight at one specific point. OAuth scopes don't decide what the client wants to do. An agent dynamically chooses its next tool based on model reasoning, potentially selecting actions the system designer never anticipated. The scope is static. The intent shifts with every reasoning step. When a buyer says "we're deploying agents," the question that earns you credibility is: how are you constraining tool access at runtime when the agent decides it needs a new action mid-task?
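One way to picture runtime constraint is a gate that sits between the agent and its tools and checks every requested call against a fixed grant, much as a resource server checks a token's scopes. The Python sketch below is illustrative only. `ToolGate`, `ToolDenied`, and the tool names are hypothetical and do not come from any particular framework. It assumes the grant is a static set of tool names decided before the task starts.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


class ToolDenied(Exception):
    """Raised when the agent asks for a tool outside its grant."""


@dataclass
class ToolGate:
    # Static grant, analogous to OAuth scopes: fixed before the task runs.
    granted: frozenset
    # Every request is recorded, allowed or not, to leave an audit trail.
    audit_log: list = field(default_factory=list)

    def call(self, tool_name: str,
             tools: Dict[str, Callable[[str], str]], arg: str) -> str:
        allowed = tool_name in self.granted
        self.audit_log.append((tool_name, "allowed" if allowed else "denied"))
        if not allowed:
            # The model chose this action; the gate, not the model, decides.
            raise ToolDenied(tool_name)
        return tools[tool_name](arg)


# Hypothetical tools: the agent may read records but was never granted delete.
tools = {
    "read_record": lambda record_id: f"record {record_id}",
    "delete_record": lambda record_id: f"deleted {record_id}",
}
gate = ToolGate(granted=frozenset({"read_record"}))
result = gate.call("read_record", tools, "42")
```

A gate like this answers only half of the buyer question. It blocks the unanticipated action, but the policy for what happens next (fail the task, escalate to a human, request a broader grant) still has to be decided by someone.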
Each pattern raises its own identity question. Single prompt: who can use the app? RAG: who can access which retrieved data? Workflow: who can trigger each step? Tool use: what credential executes the tool call? Agent: who constrains dynamic tool choice? Multi-agent: how does delegated authority propagate across systems, and how does it terminate? Your IDAM instincts are good through the first three. After that, you're in territory this section will map.
Some of what follows will age. We'll tell you which parts are stable and which parts are still being argued about, because the difference matters when you're across the table from a CAIO who reads the same specs you do.
Things to follow up on...
- Only 16% are agents: Menlo Ventures' 2025 State of Generative AI report found that most enterprise "agents" are actually workflows or single model calls, with only 16% of enterprise deployments qualifying as true agents by Anthropic's definition.
- Federal AI use cases doubled: GAO reported that agency-reported AI use cases nearly doubled from 571 to 1,110 between 2023 and 2024, while generative AI cases grew ninefold, a trajectory that makes the deployment pattern conversation increasingly urgent in public sector accounts.
- Agent washing at scale: Gartner's June 2025 analysis estimated that of thousands of vendors claiming agentic AI solutions, roughly 130 actually deliver genuine agentic capabilities, a ratio worth keeping in mind when evaluating buyer vendor shortlists.
- The Claude Agent SDK shift: Anthropic's January 2026 decision to rename the Claude Code SDK to the Claude Agent SDK signals that agentic patterns have generalized well beyond coding, which reshapes which buyer personas are entering these conversations.

