AI deployment patterns form a spectrum. Where you land on it should be a decision, not a default.
At one end: a single prompt, a model, a response. Stateless, bounded, auditable. At the other end: autonomous agents that plan, act, and loop — calling tools, spawning subagents, and making decisions without a human in the loop for each step. Between those poles sits every AI architecture your buyers are currently proposing, funding, or trying to govern.
Each pattern carries a different price tag. The price isn't always dollars — it's latency, failure surface, debugging complexity, and governance overhead. Every step up the complexity curve adds a new place where the system can fail in ways that are harder to predict and harder to explain after the fact.
Complexity is a tax. Sometimes the problem is worth paying it. Often it isn't, and the team proposing the complex solution hasn't done the math.
What the Tax Actually Costs
A single prompt costs almost nothing to reason about. Input goes in, output comes out. If it's wrong, you know immediately and you know why. The audit trail is trivial. The failure modes are bounded.
Add retrieval — pulling relevant documents into the context before the model responds — and you've bought accuracy on knowledge-intensive tasks. You've also introduced a retrieval layer that can fail, return stale content, or surface the wrong documents. The system is now two things that can break instead of one.
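To make the second point of failure concrete, here is a minimal Python sketch. It assumes a hypothetical in-memory document store and a naive keyword match; `Doc`, `retrieve`, `build_prompt`, and `max_age_days` are illustrative names, not part of any real library. It shows the two questions a retrieval layer adds that a single prompt never had to answer: did anything relevant come back, and is it fresh?

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Doc:
    text: str
    updated: date  # last time the source document was revised


def retrieve(query: str, docs: list[Doc], today: date, max_age_days: int = 365) -> list[Doc]:
    """Naive keyword retrieval that drops stale documents (illustrative only)."""
    terms = set(query.lower().split())
    cutoff = today - timedelta(days=max_age_days)
    hits = []
    for doc in docs:
        overlap = terms & set(doc.text.lower().split())
        # Keep a document only if it shares a term with the query AND is fresh.
        if overlap and doc.updated >= cutoff:
            hits.append(doc)
    return hits


def build_prompt(query: str, hits: list[Doc]) -> str:
    """Assemble context plus question; fail loudly when retrieval returns nothing."""
    if not hits:
        raise LookupError("retrieval returned no fresh, relevant documents")
    context = "\n".join(d.text for d in hits)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

A production retriever would use embeddings or a search index rather than keyword overlap. The point of the sketch is only that staleness and empty results are now failure modes someone has to handle explicitly.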
Add tool use, and the model can act on external systems. Now you have an authorization question: what is this model allowed to do, on whose behalf, and what happens when it does something unexpected? Authorization questions are governance questions, and governance questions need owners.
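One way to picture the authorization question is as a gate that sits between the model's requested tool call and the external system. The sketch below is a minimal illustration under stated assumptions: the `POLICY` table, the principal names, and the tool names are all hypothetical, and a real deployment would back this with an actual policy engine rather than a dictionary.

```python
from dataclasses import dataclass

# Hypothetical policy: which tools each principal may delegate to the model.
POLICY: dict[str, set[str]] = {
    "analyst@example.gov": {"search_tickets", "read_report"},
    "admin@example.gov": {"search_tickets", "read_report", "close_ticket"},
}


@dataclass(frozen=True)
class ToolCall:
    principal: str  # the human or service the model is acting for
    tool: str       # the tool the model asked to invoke


def authorize(call: ToolCall, audit_log: list[str]) -> bool:
    """Allow the call only if the principal's policy lists the tool; log either way."""
    allowed = call.tool in POLICY.get(call.principal, set())
    verdict = "ALLOW" if allowed else "DENY"
    audit_log.append(f"{verdict} {call.principal} -> {call.tool}")
    return allowed
```

The design choice worth noticing is that denials are logged as well as approvals. The record of what the model tried to do and was refused is usually the first thing a governance owner asks for.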
Add agents that plan and loop, and the failure surface expands further. The model is now making sequences of decisions, each of which depends on the last. Errors compound. Debugging requires reconstructing a chain of reasoning you didn't write and can't fully inspect.
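A back-of-the-envelope calculation shows how quickly errors compound. The sketch below assumes every step succeeds independently with the same probability, which is a simplification: real agent steps are neither independent nor equally reliable. The function name `chain_success` and the 95% figure are illustrative.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that every step in a chain succeeds, assuming independent steps."""
    if not 0.0 <= per_step <= 1.0:
        raise ValueError("per_step must be a probability between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step ** steps


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(f"{n:>2} steps at 95% each: {chain_success(0.95, n):.0%} end to end")
```

Under that assumption, a step that is right 95% of the time yields a ten-step chain that finishes cleanly only about 60% of the time, and a twenty-step chain about 36% of the time.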
Agents are expensive, and the expense needs a justification: a problem that simpler patterns genuinely can't solve.
What This Chapter Covers
Seven patterns, moving roughly from simple to complex. An eighth piece closes the chapter with a decision framework for choosing among them, because the patterns are only useful if you know when to reach for each one.
Prompting and Context Engineering. The foundation. What goes into the context window determines what comes out. More sophisticated than it sounds; less complex than everything that follows.
Retrieval-Augmented Generation (RAG). Connecting a model to a knowledge base at inference time. The standard answer to "the model doesn't know our data" — with its own set of tradeoffs that the standard answer usually omits.
Fine-Tuning vs. Prompting. When to bake knowledge into the model itself rather than providing it at runtime. A different cost structure, a different set of failure modes, and a procurement decision that's harder to reverse than it looks.
Workflows vs. Agents. The architectural fork that matters most in enterprise AI right now. Workflows are deterministic sequences; agents are autonomous planners. The distinction has governance implications that most architecture discussions skip entirely.
Tool Use, Function Calling, and MCP. How models interact with external systems. Model Context Protocol (MCP) is the emerging standard for that interface layer, and it's already showing up in federal procurement conversations.
Evals and Observability. How you know the system is working. Build this in from the start, or retrofit it at cost later. The teams skipping this step are the ones calling you six months later.
Multi-Agent Patterns. Multiple models coordinating on a task. The complexity ceiling of this chapter, and the pattern where identity and authorization questions get genuinely hard to answer with existing tooling.
Choosing Your Pattern. The closing framework. Given a problem, a risk tolerance, and a governance environment, which pattern fits?
IDAM Bridge: In identity, a session has a defined scope and a defined lifetime. You know what the authenticated principal can access and when that access expires. The closest AI equivalent is a prompt-response interaction: bounded, stateless, one principal, one request. The analogy breaks down when you introduce agents. An agent's "session" is open-ended: the scope expands as the task unfolds, the lifetime is determined by task completion rather than a timer, and the principal making decisions at step seven may be acting on context accumulated across six prior steps that no human reviewed. The governance model for a bounded session doesn't transfer cleanly. Agents need a different trust architecture, and most of the frameworks being sold right now don't have one yet.
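The contrast can be sketched as two data structures. This is one possible shape under stated assumptions, not a description of any existing framework: `BoundedSession`, `AgentRun`, the step budget, and the grant record are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class BoundedSession:
    """Identity-style session: scope and lifetime are fixed at issue time."""
    principal: str
    scopes: frozenset[str]
    expires_at: datetime

    def permits(self, scope: str, now: datetime) -> bool:
        return now < self.expires_at and scope in self.scopes


@dataclass
class AgentRun:
    """Agent-style run: scope may grow, but only through recorded grants."""
    principal: str
    scopes: set[str]
    max_steps: int  # a step budget stands in for the missing timer
    steps_taken: int = 0
    grants: list[tuple[int, str, str]] = field(default_factory=list)  # (step, scope, approver)

    def grant(self, scope: str, approver: str) -> None:
        """Widen the scope and record at which step, and by whom, it was approved."""
        self.scopes.add(scope)
        self.grants.append((self.steps_taken, scope, approver))

    def step(self, scope: str) -> bool:
        """Consume one step; refuse if the budget is spent or the scope was never granted."""
        if self.steps_taken >= self.max_steps or scope not in self.scopes:
            return False
        self.steps_taken += 1
        return True
```

The bounded session is immutable once issued. The agent run is not, so the sketch compensates with a step budget and a grant log that ties every scope expansion to a step number and an approver. Whether that is enough for a given governance environment is exactly the open question the paragraph above raises.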
What You'll Be Able to Do
After this chapter, you will be able to place any AI initiative on the spectrum and ask the question that actually matters: is the complexity justified by the problem, or is it complexity for its own sake?
That question lands differently in a buyer conversation than asking whether the architecture is "sophisticated" or "enterprise-ready." It signals that you understand the tradeoffs. Technical buyers notice.
The patterns follow. Start with the simplest one.

