Seven lessons. Every AI deployment pattern from a single prompt to multi-agent coordination. This recap doesn't re-teach any of it. It organizes what you read into a decision structure and two mapping tables worth pulling up before a call.
Use the simplest pattern that works. Add complexity only when you can name what broke.
The Decision Tree
Four questions. Each one resolves your pattern choice or pushes you to the next.
Q1: Is the knowledge already in the model? YES → A single LLM call may be enough. Optimize the prompt, add in-context examples, test before adding infrastructure. NO → Add retrieval. This is RAG.
Q2: Is the path predictable? YES → Build a workflow. The five sub-patterns (prompt chaining, routing, parallelization, orchestrator-workers, evaluator-optimizer) give you options within this branch. Lesson 4 covers sub-pattern selection. NO → Move to Q4.
Q3 isn't sequential. It applies at every level of the tree.
Q3: Does it need tools? YES → Add function calling or MCP servers. A single LLM call can use tools. A workflow step can use tools. An agent uses tools by definition. Think of this as a modifier on your pattern, not a separate branch. NO → The model generates text only. No external actions.
Q4: Does it need to adapt mid-task? YES → You need an agent loop: plan, execute, observe, replan. If a single agent isn't enough, measure the failure mode before adding multi-agent coordination. NO → Stay with a workflow. Fixed paths, predictable outputs, lower operational cost.
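The four questions can be written down as a small function. This is a minimal sketch, not something from the lessons: the `Requirements` fields and the returned labels are hypothetical names chosen for illustration, and it leaves out the judgment call of whether a single LLM call is already enough, which only testing can answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Answers to the four questions. Field names are illustrative."""
    knowledge_in_model: bool  # Q1
    path_predictable: bool    # Q2
    needs_tools: bool         # Q3
    adapts_mid_task: bool     # Q4


def choose_pattern(req: Requirements) -> dict:
    # Q1: missing knowledge means retrieval; otherwise work on the prompt first.
    knowledge = "prompt only" if req.knowledge_in_model else "retrieval (RAG)"

    # Q2 then Q4: only an unpredictable path that must adapt mid-task
    # justifies an agent loop; everything else stays a workflow.
    if not req.path_predictable and req.adapts_mid_task:
        control = "agent loop"
    else:
        control = "workflow"

    # Q3 is a modifier on whatever was chosen above, not a separate branch.
    # An agent uses tools by definition, so the flag is forced on for agents.
    tools = req.needs_tools or control == "agent loop"
    return {"knowledge": knowledge, "control": control, "tools": tools}
```

Note that the function can only ever move away from the workflow default when two conditions hold at once. That is the complexity principle expressed as control flow.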
Straddle case: Agentic RAG spans Q1 and Q4. Retrieval addresses missing knowledge; the agent loop lets the system judge results, reformulate queries, and retrieve again. As Lesson 2 covered, this is where RAG stops being a simple pipeline and starts requiring agent-level governance.
Where fine-tuning fits: It lives outside this tree entirely. Fine-tuning answers "which model?" — a separate question from "which pattern?" It changes the model's baseline behavior before deployment. The practitioner hierarchy from Lesson 3: optimize prompts first, add RAG second, fine-tune only when style or cost demands it.
If you remember nothing else: Start with the simplest pattern. Escalate only when you can point to the specific thing that failed.
Vocabulary Collisions
Five terms that mean one thing in your IDAM experience and something different in AI conversations. Your fluency with these words is an asset right up until it becomes a liability. The last column marks the moment.
| AI Term | What It Means in AI | IDAM Equivalent | Key Divergence |
|---|---|---|---|
| Token | Chunk of text (roughly 4 characters of English on average). Counted for cost and context limits. | Access token, ID token, refresh token. A security artifact conveying authorization. | An LLM token carries zero authorization. When a buyer says "token budget," clarify: cost/context tokens, or bearer tokens? |
| Context | The model's working memory for one request. Assembled dynamically; can be truncated, summarized, or reinjected by the application. | Security context: authenticated session state, token-bound identity environment. | Ephemeral. Gone when the context window closes. Authorization should never assume the model "remembers" prior constraints unless the system re-supplies them outside the model. |
| Session | Conversation thread or context carried across turns. May be rehydrated from summaries or vector stores. | Authenticated session with defined lifetime, cookie/token binding, policy controls. | An AI "session" has no guaranteed auth state. Continuity can be simulated by context engineering without any underlying auth continuity. The word sounds the same; the guarantees are absent. |
| Agent | LLM-centered system that dynamically selects tools and actions to pursue a goal. | Identity agent (endpoint software), service account, non-human identity. | LLM agents are non-deterministic, tool-selecting, and prompt-sensitive. A service-account model covers ownership and lifecycle but misses intent, tool chaining, and runtime behavior entirely. |
| Scope | Task boundary, agent authority, tool permission, or blast radius. Varies by speaker. | OAuth scope: coarse authorization string requested by client, enforced by resource server. | OAuth scopes say nothing about whether a tool invocation is appropriate for the user's intent or the agent's current plan. They are necessary for agent governance but not sufficient. |
If you remember nothing else: These five words are where your IDAM fluency can actively mislead you. Know where the meaning forks.
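The "session" collision is the easiest one to show in code. The sketch below keeps two separate objects: an authenticated session with an expiry and a conversation history with none. All class and function names are hypothetical, and the returned strings stand in for a real model call.

```python
from dataclasses import dataclass, field


@dataclass
class AuthSession:
    """IDAM sense of "session": bound to a user, with a defined lifetime."""
    user: str
    expires_at: float  # epoch seconds

    def is_valid(self, now: float) -> bool:
        return now < self.expires_at


@dataclass
class Conversation:
    """AI sense of "session": accumulated turns, no auth state at all."""
    history: list = field(default_factory=list)


def handle_turn(auth: AuthSession, convo: Conversation, message: str, now: float) -> str:
    # The conversation may survive indefinitely; authorization may not.
    # The check happens in application code on every turn, outside the model.
    if not auth.is_valid(now):
        return "re-authenticate"
    convo.history.append(message)
    return f"model call with {len(convo.history)} turn(s) of context"
```

After the auth session expires, the conversation history is still sitting there, ready to be replayed into a new context window. Continuity of context is not continuity of authorization.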
Pattern-to-IDAM Mapping
Each AI pattern has a closest IDAM analogy. The analogy orients you. The last column marks the cliff where it stops bearing weight.
| AI Pattern | What It Does | IDAM Analogy | Where the Analogy Breaks |
|---|---|---|---|
| Single LLM call | One prompt in, one completion out. | Deterministic API call. | Outputs are probabilistic. The same input can produce different outputs. Authorization logic built for deterministic responses doesn't account for that variation. |
| System prompt | High-priority instructions shaping model behavior. | Policy configuration. | Instructional, with no enforcement mechanism. The spec calls this "system" level priority, which is a generous word for something that prompt injection can override. |
| RAG | Retrieves external knowledge at query time, injects it into context. | Read-only data query with access control at the resource boundary. | Access controls must apply per-document at retrieval time. Without document-level enforcement, retrieval can blend sources across access classifications. |
| Workflow | Predefined code paths with LLM calls at specific steps. | Orchestrated application process. | The pipeline follows a fixed path; the model improvises the words within each step. Action surface is bounded, but output variance is real. |
| Tool call | Model requests execution of an external function. | Application-initiated API call. | In IDAM, the application decides which API to call. In AI, the model chooses which tool to invoke. The authorization question shifts: what validates the tool call before execution? |
| Agent | LLM dynamically directs its own tool use and process flow. | Non-human identity / service account. | Non-deterministic, tool-choosing, prompt-sensitive. An agent may invoke a tool the original policy didn't anticipate, based on content it retrieved at runtime. |
| MCP server | Standardized protocol connecting models to tools and data sources. | Integration endpoint / app connector. | MCP standardizes how to call a tool. Who is allowed, under whose authority, with what scope, and how the action gets revoked — all separate problems, all unaddressed by the protocol itself. |
| Multi-agent handoff | One agent delegates a subtask to another agent. | Multiple service accounts with delegation. | Delegation chains are assembled at runtime, based on the task, rather than defined in advance. The receiving agent may invoke tools the originating agent couldn't. |
If you remember nothing else: Every analogy in this table is useful for orientation and dangerous for architecture. Use the middle columns to get your bearings. Read the last column before you open your mouth.
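The tool-call row asks what validates the call before execution. One possible answer, sketched under assumptions: the application treats the model's tool request as a proposal and checks it against a policy it owns. The `ToolCall` shape, the role names, and the tool names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """A tool invocation proposed by the model (hypothetical shape)."""
    name: str
    arguments: dict


# Hypothetical policy: which tools each user role may trigger through the model.
ALLOWED_TOOLS = {
    "analyst": {"search_tickets"},
    "admin": {"search_tickets", "disable_account"},
}


def authorize_tool_call(role: str, call: ToolCall) -> bool:
    """Decide in application code, outside the model, whether the call may run."""
    return call.name in ALLOWED_TOOLS.get(role, set())


def execute(role: str, call: ToolCall, registry: dict) -> str:
    # The model chose the tool; the application still holds the veto.
    if not authorize_tool_call(role, call):
        return f"denied: {call.name}"
    return registry[call.name](**call.arguments)
```

A role-based allowlist like this is the coarse check, comparable to an OAuth scope. It does not judge whether the call fits the user's intent or the agent's current plan, which is the gap the table points at.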
Key Terms for the Call
RAG (Retrieval-Augmented Generation) — Retrieves external documents at query time and injects them into the model's context so it can answer with knowledge it wasn't trained on.
- When it comes up: Buyer asks how the AI system stays current, or how it answers questions about their proprietary data without retraining the model.
- Don't confuse with: Fine-tuning. RAG adds knowledge at runtime. Fine-tuning changes the model's weights before deployment. Different mechanisms, different timing, different governance surface.
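A toy version of the retrieve-then-inject step, with access enforced per document at retrieval time. This is a sketch under assumptions: keyword overlap stands in for the vector similarity a real system would use, and the `Document` class with its `allowed_groups` field is a hypothetical ACL model.

```python
from dataclasses import dataclass


@dataclass
class Document:
    text: str
    allowed_groups: frozenset  # hypothetical per-document ACL


def retrieve(query: str, corpus: list, user_groups: set, k: int = 3) -> list:
    """Return up to k documents the user may read, ranked by keyword overlap."""
    terms = set(query.lower().split())
    # Enforce access per document before ranking, so restricted text
    # never reaches the prompt.
    visible = [d for d in corpus if d.allowed_groups & user_groups]
    scored = [(len(terms & set(d.text.lower().split())), d) for d in visible]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for score, d in ranked[:k] if score > 0]


def build_prompt(query: str, docs: list) -> str:
    """Inject the retrieved text into the context for a single model call."""
    context = "\n".join(f"- {d.text}" for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The filter runs before ranking on purpose. Filtering after the prompt is built is too late: the model has already seen the text.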
Context Engineering — Designing what goes into the model's context window: system instructions, retrieved documents, tool results, conversation history, and how all of it gets assembled, truncated, and refreshed.
- When it comes up: Buyer says "the AI gave a wrong answer" or asks how the system uses their data correctly. Before anyone reaches for a more complex pattern, context engineering is usually the actual problem and the actual fix.
- Don't confuse with: Prompt engineering. As Lesson 1 covered, prompt engineering is writing good instructions. Context engineering is designing the full information environment the model operates in. Prompt engineering is a subset.
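What "assembled, truncated, and refreshed" can look like in practice, as a minimal sketch. The priority order (system instructions, then retrieved documents, then the newest history) and the 4-characters-per-token estimate are assumptions made for this example, not rules from the lessons.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic (~4 characters per token); a real tokenizer would differ.
    return max(1, len(text) // 4)


def assemble_context(system: str, retrieved: list, history: list, budget: int) -> list:
    """Build the context for one request within a token budget."""
    parts = [system]  # system instructions always go in
    used = estimate_tokens(system)

    # Retrieved documents next, in the order given, until the budget runs out.
    for doc in retrieved:
        cost = estimate_tokens(doc)
        if used + cost > budget:
            break
        parts.append(doc)
        used += cost

    # Then conversation history, newest turns first, with whatever room is left.
    kept_history = []
    for turn in reversed(history):
        cost = estimate_tokens(turn)
        if used + cost > budget:
            break
        kept_history.append(turn)
        used += cost

    # Restore chronological order for the turns that survived truncation.
    return parts + list(reversed(kept_history))
```

Whatever falls outside the budget is simply absent from the request. If an early turn carried a constraint, the model no longer has it unless the application supplies it again.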
Workflow vs. Agent — A workflow follows predefined code paths with LLM calls at fixed steps. An agent dynamically directs its own process and tool usage. Anthropic's published framework is the stable reference.
- When it comes up: Every AI architecture conversation. The single most important distinction in the curriculum.
- Don't confuse with: Each other. The industry used "agentic" loosely through 2024. Pin to the Anthropic definition: if the model decides what to do next, it's an agent. If the code decides, it's a workflow.
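The distinction fits in two short functions. In this sketch `Model` is a stand-in for any LLM client (prompt in, text out), and the tool-selection protocol in `run_agent` is a deliberately simplified assumption; real agent frameworks use structured tool calls.

```python
from typing import Callable

# Stand-in for any LLM client: takes a prompt, returns text.
Model = Callable[[str], str]


def run_workflow(model: Model, ticket: str) -> str:
    """Workflow: the code fixes the sequence of steps (prompt chaining)."""
    summary = model(f"Summarize: {ticket}")
    return model(f"Draft a reply to: {summary}")


def run_agent(model: Model, tools: dict, goal: str, max_steps: int = 5) -> str:
    """Agent: the model's output decides which tool runs next, or when to stop."""
    observation = goal
    for _ in range(max_steps):
        decision = model(f"Goal: {goal}\nLast observation: {observation}\n"
                         f"Reply with a tool name from {sorted(tools)} or DONE.")
        # Stop when the model says so, or when it names a tool that doesn't exist.
        if decision == "DONE" or decision not in tools:
            break
        observation = tools[decision](observation)
    return observation
```

In `run_workflow` you can read the execution path off the source code. In `run_agent` the path exists only at runtime, which is why `max_steps` is there: the code bounds the loop even though it doesn't direct it.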
MCP (Model Context Protocol) — Open protocol standardizing how models connect to external tools and data sources. Now under the Linux Foundation.
- When it comes up: Buyer is evaluating how AI systems connect to their existing infrastructure — databases, ticketing systems, document stores — without bespoke integration for each one. In public sector, this is the "how does it talk to our systems of record" question.
- Don't confuse with: Function calling. Function calling is the mechanism by which a model requests a tool invocation. MCP is the protocol that standardizes how that tool is described, discovered, and connected.
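To make the protocol side concrete: MCP messages are JSON-RPC 2.0, and tool discovery and invocation use the `tools/list` and `tools/call` methods. The sketch below only builds the request payloads; it does not open a connection, and the `search_tickets` tool and its argument are hypothetical. Check field details against the current specification before relying on them.

```python
import json
from typing import Optional


def jsonrpc_request(request_id: int, method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request, the message format MCP builds on."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Discovery: ask the server which tools it offers.
list_tools = jsonrpc_request(1, "tools/list")

# Invocation: the model picked a tool via function calling; the client relays it.
# `search_tickets` and its argument are hypothetical.
call_tool = jsonrpc_request(
    2, "tools/call", {"name": "search_tickets", "arguments": {"query": "vpn"}}
)
```

Nothing in either payload says who the caller is or on whose authority the call is made. That is the gap noted in the mapping table: the protocol standardizes the call, not the authorization around it.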
Evals — Structured methods for measuring whether an AI system's outputs meet defined quality thresholds. Includes golden datasets, LLM-as-judge scoring, and retrieval-specific metrics like RAGAS.
- When it comes up: Buyer asks "how do you know it's working?" or "how do we measure accuracy?" In public sector, this connects directly to accountability and audit requirements. Evals are how teams answer those questions with data instead of demos.
- Don't confuse with: Traditional software testing. Evals measure probabilistic output quality against benchmarks — closer to grading an essay than running a unit test. A system can "pass" its evals on Tuesday and drift by Thursday.
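A golden-dataset eval reduced to its skeleton. This is a sketch under assumptions: `system` is any callable that maps a question to an answer, and exact string match is the crudest possible grader. An LLM-as-judge or a retrieval metric would replace `exact_match` in practice.

```python
def exact_match(expected: str, actual: str) -> bool:
    """Crude grader: case-insensitive string equality."""
    return expected.strip().lower() == actual.strip().lower()


def run_eval(system, golden: list, threshold: float) -> dict:
    """Score `system` (any callable str -> str) against a golden dataset.

    `golden` is a list of (question, expected_answer) pairs.
    """
    passed = sum(exact_match(expected, system(question)) for question, expected in golden)
    score = passed / len(golden)
    return {"score": score, "passed": score >= threshold}
```

The result is a rate measured against a threshold, not a pass/fail per test case. Running the same eval on a schedule is how drift between Tuesday and Thursday gets noticed.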
Hallucination — The model generates text that is fluent and confident but factually wrong, unsupported by its training data or provided context. A property of how language models generate text, baked into the architecture.
- When it comes up: Every conversation. Buyers will say the word; you need to handle it precisely. Hallucination is the reason RAG exists, the reason evals matter, and the reason context engineering is a discipline rather than an afterthought.
- Don't confuse with: A solvable bug. Hallucination can be reduced through better context engineering, retrieval, and evals. It cannot be eliminated. Any vendor claiming otherwise is selling you something.
Source Map
| Recap Section | Source |
|---|---|
| Decision tree Q1, RAG, agentic RAG | Lesson 2: Retrieval-Augmented Generation |
| Context engineering, context window, system prompt, token | Lesson 1: Prompting and Context Engineering |
| Fine-tuning, practitioner escalation hierarchy | Lesson 3: Fine-Tuning vs. Prompting |
| Decision tree Q2/Q4, workflow vs. agent, five sub-patterns | Lesson 4: Workflows vs. Agents |
| Decision tree Q3, tool call, function calling, MCP | Lesson 5: Tool Use, Function Calling, and MCP |
| Evals, observability, tracing, hallucination | Lesson 6: Evals and Observability |
| Multi-agent handoff, coordination overhead | Lesson 7: Multi-Agent Patterns |
| Vocabulary collisions (all five terms) | Cross-cutting; primary anchors in Lessons 1 and 4 |
| Complexity principle, spectrum framing | Section Opener: The Spectrum of AI Applications |
Things to follow up on...
- MCP security findings are accumulating fast: A Trend Micro scan found 492 MCP servers running without basic security controls like client authentication or traffic encryption, and OWASP now has an MCP-specific Top 10 in beta.
- Multi-agent failure rates are sobering: The MAST study analyzed 1,642 execution traces across seven open-source frameworks and found failure rates ranging from 41% to 86.7%, with coordination breakdowns as the largest category at 36.9% of all failures.
- LoRA isn't identical to full fine-tuning: A peer-reviewed paper found that LoRA and full fine-tuning produce structurally different solutions, with LoRA introducing "intruder dimensions" that affect out-of-distribution behavior in ways that matter for complex tasks like code generation.
- Agent Skills are worth watching: Anthropic released Agent Skills as an open standard in December 2025, enabling agents to discover and dynamically load modular capability packages rather than encoding all domain expertise in static prompts.

