The Diagnostic Ladder
Four questions, synthesized from the patterns you just worked through. Each one gates whether you need the next rung. The governing principle from the section opener: start at the simplest pattern that works. Escalate only when you can name the problem the current rung can't solve.
| # | Diagnostic Question | YES | NO |
|---|---|---|---|
| 1 | Is the knowledge already in the model? | Single LLM call may suffice | Add retrieval (RAG) |
| 2 | Is the path predictable? | Workflow — code controls the sequence | Agent — model directs its own path |
| 3 | Does the model choose which tools to call? | Agent-level authorization problem | Workflow-level; code governs tool calls |
| 4 | Does it need to coordinate with other agents? | Multi-agent — delegation chains compound | Single agent loop |
Q3 deserves a flag. Recall from the tool use piece: tools appear at multiple rungs. Workflows use tools. Agents use tools. The gate is who picks the tool: the code or the model. That single distinction reshapes the identity problem completely.
Start at the simplest rung. Every rung you climb adds authorization surface you have to govern. If you can't name the problem the current rung can't solve, you haven't earned the next one.
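As a memory aid, the table can be read as a simple lookup. The sketch below is only that: the `LADDER` list and the `read_ladder` function are hypothetical names, and the strings are copied from the table rather than taken from any tool or framework.

```python
# Each row mirrors the diagnostic table: (question, meaning of YES, meaning of NO).
LADDER = [
    ("Is the knowledge already in the model?",
     "Single LLM call may suffice", "Add retrieval (RAG)"),
    ("Is the path predictable?",
     "Workflow: code controls the sequence", "Agent: model directs its own path"),
    ("Does the model choose which tools to call?",
     "Agent-level authorization problem", "Workflow-level; code governs tool calls"),
    ("Does it need to coordinate with other agents?",
     "Multi-agent: delegation chains compound", "Single agent loop"),
]


def read_ladder(answers: list[bool]) -> list[str]:
    """Return the table cell each YES/NO answer points to, in question order."""
    if len(answers) != len(LADDER):
        raise ValueError(f"expected {len(LADDER)} answers, got {len(answers)}")
    return [yes if answer else no for (_, yes, no), answer in zip(LADDER, answers)]


if __name__ == "__main__":
    # Hypothetical buyer: needs their own docs, fixed pipeline,
    # code picks the tools, no other agents involved.
    for finding in read_ladder([False, True, False, False]):
        print(finding)
```

The lookup deliberately does no scoring. The point of the ladder is that each answer names a concrete problem, not that the answers add up to a number.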
The Patterns
Single LLM Call
One prompt in, one completion out. No retrieval, no tools, no loops.
- When it comes up: Buyer describes a chatbot, a summarizer, a classification endpoint. "We're using GPT for internal Q&A." Identity is simple here: who can call the model?
- Don't confuse with: RAG. If the buyer mentions "pulling in documents" or "searching our knowledge base," they've left this rung.
RAG (Retrieval-Augmented Generation)
External knowledge retrieved and injected into the model's context before generation. As you saw in the RAG lesson, the mechanism is embed-index-retrieve-generate.
- When it comes up: Buyer says "we're grounding the model on our data" or "it searches our docs first." Now the identity question shifts: what data can be retrieved, and do the original access controls survive the vector index?
- Don't confuse with: Fine-tuning. RAG adds knowledge at query time. Fine-tuning bakes behavior into model weights. A buyer who says "we trained it on our data" might mean either. Ask which.
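The question of whether access controls survive the vector index can be made concrete with a small sketch. It assumes each indexed chunk carries a copy of the source document's allowed groups, and that the filter runs at retrieval time before anything reaches the model's context. `Chunk`, `authorized_chunks`, and the group names are hypothetical; a real deployment would also need to keep the copied ACLs in sync with the source system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source_doc: str
    allowed_groups: frozenset  # ACL copied from the source system at index time (assumption)
    score: float               # similarity score from a hypothetical vector search


def authorized_chunks(candidates, user_groups, top_k=3):
    """Drop chunks the caller cannot see, then keep the best-scoring survivors."""
    visible = [c for c in candidates if c.allowed_groups & user_groups]
    return sorted(visible, key=lambda c: c.score, reverse=True)[:top_k]


if __name__ == "__main__":
    candidates = [
        Chunk("salary bands", "hr/comp.pdf", frozenset({"hr"}), 0.95),
        Chunk("vpn setup", "it/vpn.md", frozenset({"eng", "hr"}), 0.80),
        Chunk("deploy guide", "eng/deploy.md", frozenset({"eng"}), 0.70),
    ]
    for chunk in authorized_chunks(candidates, frozenset({"eng"})):
        print(chunk.source_doc)
```

Note that the highest-scoring chunk is the one the engineer is not allowed to see. Filtering after ranking but before generation is what keeps it out of the prompt.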
Fine-Tuning (not a rung — a model customization method)
Modifying model weights to change behavior, style, or format. The fine-tuning lesson made the point clearly: this shapes how the model responds, not what it knows. Knowledge injection through fine-tuning is unreliable; RAG is the tool for grounding in current data.
- When it comes up: Buyer says "we trained a custom model" or "we fine-tuned it on our corpus." Identity concern: who controls the training data pipeline, and does the fine-tuned model inherit or lose the base model's safety constraints?
- Don't confuse with: RAG. If the buyer needs current, document-level knowledge with access controls, fine-tuning is the wrong tool. This is the single most common confusion in discovery.
Workflow
Multiple steps, fixed orchestration. Code controls the sequence. Anthropic names five patterns: chaining, routing, parallelization, orchestrator-workers, evaluator-optimizer. Models may fill individual steps, but the path is predetermined.
- When it comes up: Buyer describes a pipeline. "First it classifies, then routes, then generates." What matters for identity: who can trigger each path, and what credentials execute at each step?
- Don't confuse with: Agent. If the buyer says "it decides what to do next based on what it finds," that's past workflow territory. Recall the core distinction from the workflows piece: workflows don't improvise.
Agent
Model dynamically directs its own process and tool usage. Plans, acts, observes, adjusts. The loop runs until the task completes or a stop condition fires. Anthropic's definition is the anchor worth holding onto when the buyer uses "agent" to mean six different things.
- When it comes up: Buyer says "it figures out the steps on its own" or "it can browse, search, and file tickets." The hard identity question surfaces here: who constrains dynamic tool choice, and can you revoke the agent without breaking the human user's access?
- Don't confuse with: Workflow with tools. The test is autonomy. A workflow calls five APIs in a fixed sequence. An agent calls the same five APIs in an order it determined at runtime. The difference is who chose the path.
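The "who chose the path" test is easier to see side by side. In this sketch the tools are stand-in lambdas and the model is replaced by a stub function, so every name here (`TOOLS`, `run_workflow`, `run_agent`, `stub_model`) is hypothetical. The only thing it is meant to show is where the ordering decision lives.

```python
from typing import Callable, Optional

# Stand-in tools; real ones would call external systems with real credentials.
TOOLS = {
    "search": lambda task: f"results for {task}",
    "file_ticket": lambda task: f"ticket filed: {task}",
}


def run_workflow(task: str) -> list:
    """Workflow: the code fixes the order of tool calls ahead of time."""
    calls = []
    for name in ("search", "file_ticket"):
        TOOLS[name](task)
        calls.append(name)
    return calls


def run_agent(task: str,
              choose_next: Callable[[str, list], Optional[str]],
              max_steps: int = 5) -> list:
    """Agent: choose_next (a stub for the model) picks each tool at runtime."""
    calls = []
    for _ in range(max_steps):           # stop condition: step budget
        name = choose_next(task, calls)
        if name is None:                 # stop condition: model reports it is done
            break
        if name not in TOOLS:
            raise PermissionError(f"tool not available: {name}")
        TOOLS[name](task)
        calls.append(name)
    return calls


def stub_model(task: str, history: list) -> Optional[str]:
    """Toy policy standing in for an LLM's decision."""
    if "urgent" in task and "file_ticket" not in history:
        return "file_ticket"
    if "search" not in history:
        return "search"
    return None


if __name__ == "__main__":
    print(run_workflow("urgent outage"))
    print(run_agent("urgent outage", stub_model))
    print(run_agent("routine question", stub_model))
```

The workflow produces the same call sequence for every task. The agent's sequence depends on what the chooser decides, which is why its permissions cannot be read off the code alone.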
Multi-Agent System
Multiple agents coordinate via supervisor-worker, debate, or handoff patterns. Each agent may carry its own tools, context, and credentials. The multi-agent lesson made one thing abundantly clear: coordination overhead compounds fast.
- When it comes up: Buyer describes specialized agents handing off tasks. "One agent researches, another drafts, a third reviews." Identity at this rung is about delegation: how does delegated authority propagate across agents, and where does it terminate?
- Don't confuse with: A workflow with parallel steps. Parallel workflows have independent steps with independent credentials. Multi-agent systems have delegation chains where authorization decisions compound.
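One way to picture "how does delegated authority propagate, and where does it terminate" is a grant that can only narrow as it moves down the chain and that stops after a fixed number of hops. This is an illustrative policy, not something the lessons prescribe; `Grant`, `delegate`, `max_hops`, and the scope strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    principal: str
    scopes: frozenset
    chain: tuple  # every principal the authority has passed through, in order


def delegate(parent: Grant, to: str, requested: set, max_hops: int = 2) -> Grant:
    """Issue a child grant that is never broader than its parent."""
    hops_so_far = len(parent.chain) - 1
    if hops_so_far >= max_hops:
        raise PermissionError(f"delegation chain ends at {parent.principal}")
    extra = set(requested) - parent.scopes
    if extra:
        raise PermissionError(f"{to} requested scopes the parent lacks: {sorted(extra)}")
    return Grant(to, frozenset(requested), parent.chain + (to,))


if __name__ == "__main__":
    user = Grant("user:alice", frozenset({"docs.read", "tickets.write"}), ("user:alice",))
    researcher = delegate(user, "agent:researcher", {"docs.read"})
    drafter = delegate(researcher, "agent:drafter", {"docs.read"})
    print(" -> ".join(drafter.chain))
```

Keeping the full chain on each grant is what lets an audit answer which principal acted at each hop, rather than stopping at "the AI did it".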
If you remember nothing else, ask the buyer: "Does the model decide what to do next, or does your code?" The answer places them on the ladder.
Vocabulary Collision Tables
Terms Where Your IDAM Intuition Misleads
| AI Term | What It Means in AI | IDAM Equivalent | Key Divergence |
|---|---|---|---|
| Token | Unit of text (roughly 4 characters of English on average). Billing and context-window unit. | Bearer token, ID token, refresh token | LLM tokens carry zero authorization semantics. "500K tokens/day" is a cost statement, never a credential count. |
| Agent | System where an LLM dynamically directs its own tool use | Endpoint software, daemon, background process | Buyers call everything an agent. Only systems where the model directs tool choice at runtime raise the authorization-at-runtime problem. |
| Scope | Task constraint, tool availability, often expressed in natural language | OAuth scope string — machine-enforceable, checked by authorization server | "Only summarize HR policy" in a system prompt is an instruction, nothing more. Natural-language scope carries zero enforcement weight. |
| Session | Conversation thread, retained chat state, memory store | Stateful authenticated interaction with expiration and revocation | Ending a chat session does not revoke downstream credentials, clear retained logs, or remove agent memory. |
| Context | Everything assembled for a model call: prompt, retrieved docs, chat history, tool outputs | Risk signals — device posture, network, location, behavior | AI context is simultaneously evidence and instruction. A retrieved document can carry facts and prompt injection in the same payload. |
| Identity | Human user, AI agent, app, model provider, MCP server, downstream API principal | Managed subject — human, service account, device, workload | "The AI did it" tells you nothing auditable. You need to know which principal acted at each hop. |
Identity Model Per Pattern
Your IDAM knowledge genuinely helps here, up to a point. The "Where It Breaks" column marks that point for each pattern.
| AI Pattern | IDAM Equivalent | Where It Holds | Where It Breaks |
|---|---|---|---|
| Single LLM call | API call to a service | Authentication, rate limiting, access logging all apply | The same input can produce different outputs. You can't acceptance-test it like a deterministic endpoint. Standard monitoring answers "is it running?" and stops there. Whether it's right requires evals. |
| RAG | Search with ACL-trimmed results | Per-document access control is the right instinct | Vector indexes often flatten ACLs at embed time. Access controls must be re-applied at retrieval, not assumed from the source system. |
| Workflow | Orchestrated application process | Step-level authorization, credential-per-step, audit trail | A workflow step can return 200 OK with a wrong answer. The model fills content probabilistically inside a deterministic frame. Evals exist for exactly this reason. |
| Agent (including MCP tool discovery) | Service account / non-human identity | Credential issuance, scoping, revocation all apply | Static service-account permissions miss dynamic tool choice. MCP standardizes tool discovery but does not enforce resource-level authorization. Credentials must be short-lived and scoped per task. |
| Multi-agent | Multiple service accounts with delegation | Each agent needs its own identity, credentials, audit trail | Delegation chains form at runtime and can't be fully pre-configured. Agent A can pass Agent B access to a tool, producing a combination of permissions that neither agent was explicitly authorized for. Blast radius is non-deterministic. |
When a buyer uses any of these terms, pause before assuming you know what they mean. Same word. Different mechanism entirely.
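The scope and session rows come down to one difference: an instruction in a prompt is text, while an enforced scope is a check that runs in code before the tool call happens. The sketch below assumes a per-task credential with an explicit tool allowlist, an expiry, and a revocation list. `TaskCredential`, `REVOKED`, and `is_call_allowed` are hypothetical names, not part of MCP or any identity product.

```python
import time
from dataclasses import dataclass

# A system-prompt instruction like this is only text; nothing below reads it.
SYSTEM_PROMPT = "Only summarize HR policy."

REVOKED = set()  # credential IDs revoked before their expiry (assumed store)


@dataclass(frozen=True)
class TaskCredential:
    credential_id: str
    principal: str
    allowed_tools: frozenset
    expires_at: float  # Unix seconds


def is_call_allowed(cred: TaskCredential, tool: str, now=None) -> bool:
    """Enforced check: revoked, expired, or out-of-scope calls are refused."""
    current = time.time() if now is None else now
    if cred.credential_id in REVOKED:
        return False
    if current >= cred.expires_at:
        return False
    return tool in cred.allowed_tools


if __name__ == "__main__":
    cred = TaskCredential("c1", "agent:summarizer",
                          frozenset({"read_hr_policy"}), expires_at=time.time() + 300)
    print(is_call_allowed(cred, "read_hr_policy"))
    print(is_call_allowed(cred, "send_email"))
```

Revoking `c1` here affects only the agent's task credential. The human user's own access is a separate principal, which is the property the agent rung asks for.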
Source Index
Every entry above traces to the Patterns & Practice section. Use this to navigate back when a concept needs more depth than the recap provides.
| Concept | Source Article |
|---|---|
| Complexity ladder, diagnostic questions, "start simple" principle | The Spectrum of AI Applications (Section Opener) |
| Context engineering, system prompt, context failure modes | Prompting and Context Engineering |
| RAG mechanics, access controls per-document, naive RAG failures | Retrieval-Augmented Generation |
| Fine-tuning vs. prompting, knowledge injection misconception | Fine-Tuning vs. Prompting |
| Workflows vs. agents, five workflow patterns, agent loop | Workflows vs. Agents |
| Tool use, function calling, MCP integration and authorization | Tool Use, Function Calling, and MCP |
| Evals, observability, tracing, tool-call failure modes | Evals and Observability |
| Multi-agent topologies, delegation chains, coordination overhead | Multi-Agent Patterns |
| Vocabulary collisions (token, agent, scope, session, context, identity) | Introduced across all lessons; consolidated in section glossary |
Things to follow up on...
- MCP's accumulating security surface: Trend Micro's scan found 492 MCP servers running without basic security controls, and the OWASP MCP Top 10 is now in beta — worth tracking as the auth story matures.
- Multi-agent failure rates in practice: The MAST study analyzed 1,642 execution traces across seven open-source frameworks and found failure rates ranging from 41% to 86.7%, with coordination breakdowns as the largest category — useful ammunition when a buyer's pitch outpaces their architecture.
- Anthropic's agent-building guidance: The canonical Building Effective Agents piece defines the workflow-vs-agent distinction this entire ladder rests on, and it's worth reading in full for the five workflow pattern descriptions and the complexity warnings.
- Context engineering as a discipline: Anthropic Engineering published a detailed guide on effective context engineering for AI agents that explains why context assembly is now the core production skill — and why prompt engineering alone stopped being sufficient.

