Function Calling, MCP, and Skills Are Layers, Not Choices

Function calling formats the request, MCP transports it, Skills decide if it should happen. Three layers, zero governance built in.

By Leigh Garrity, May 9, 2026

Function Calling

What it is: A model-level API feature that lets an LLM output structured requests to invoke external tools instead of generating plain text.

What it does: You define a set of functions with names, parameters, and descriptions, then pass them alongside a prompt. The model decides whether a function is relevant, generates a structured JSON call with the right arguments, and hands it back. Your code executes the function, returns the result, and the model incorporates it. The model never executes anything itself. It writes the request. Your application runs it.
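The division of labor described above can be sketched in a few lines. This is a minimal illustration, not any provider's API: the tool shape, the `get_weather` function, and the hard-coded `model_output` string are all assumptions standing in for a real model response.

```python
import json

# Hypothetical tool declaration in a provider-neutral shape
# (not any vendor's exact schema).
TOOLS = {
    "get_weather": {
        "description": "Return the current temperature for a city.",
        "parameters": {"city": "string"},
    }
}

def get_weather(city: str) -> dict:
    # Stand-in implementation; a real one would call a weather service.
    return {"city": city, "temp_c": 18}

# Registry mapping declared names to the code that actually runs.
IMPLEMENTATIONS = {"get_weather": get_weather}

def execute_tool_call(raw_call: str) -> str:
    """Run one model-emitted call and return the result as JSON text."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in IMPLEMENTATIONS:
        # The model asked for something the application never offered.
        return json.dumps({"error": f"unknown function: {name}"})
    result = IMPLEMENTATIONS[name](**call["arguments"])
    return json.dumps(result)

# Simulated model output: the model only writes this request.
model_output = '{"name": "get_weather", "arguments": {"city": "Oslo"}}'

# The application runs it and would hand this text back to the model.
tool_result = execute_tool_call(model_output)
```

The model's contribution ends at `model_output`. Everything after that line is ordinary application code, which is why every control discussed later has to live there.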

Who's behind it: Every major model provider ships its own version. OpenAI introduced it in June 2023. Anthropic calls it "tool use." Google calls it "function declarations." The core pattern is identical across all three. The JSON schemas differ just enough that code written for one provider doesn't run on another without adaptation.

What makes it distinct: Function calling is what lets a model reach tools instead of generating text about what it would do if it could do things. It handles the format of the ask. Full stop. Which tools are available, how to reach them, whether reaching them is wise — all outside its scope.

MCP (Model Context Protocol)

What it is: An open protocol that standardizes how AI applications discover and call tools across vendors, replacing per-integration connector code with a common wire format.

What it does: An MCP server wraps a tool, a database, or an API and exposes it through a standard interface. An MCP client (your AI application) connects to the server, discovers what tools are available, and invokes them using a consistent protocol regardless of what sits behind the server. Build the server once, and it works with Claude, GPT, Gemini, or any MCP-compatible client. Anthropic open-sourced MCP in November 2024. By early 2026, it had surpassed 97 million monthly SDK downloads.
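To make "discovers, then invokes" concrete, here is a sketch of the two message exchanges involved. MCP messages are JSON-RPC 2.0, and `tools/list` and `tools/call` are the method names the protocol defines. Everything else is an assumption for illustration: `FakeServer` is an in-memory stand-in with a made-up `lookup_user` tool, there is no real transport, and the initialization handshake is skipped.

```python
from itertools import count
from typing import Optional

_ids = count(1)

def rpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg

class FakeServer:
    """In-memory stand-in for an MCP server wrapping one hypothetical tool."""

    TOOL = {
        "name": "lookup_user",
        "description": "Look up a user record by ID.",
        "inputSchema": {
            "type": "object",
            "properties": {"user_id": {"type": "string"}},
            "required": ["user_id"],
        },
    }

    def handle(self, request: dict) -> dict:
        method = request["method"]
        if method == "tools/list":
            # Discovery: the client learns what is available.
            return self._ok(request, {"tools": [self.TOOL]})
        if method == "tools/call":
            # Invocation: same envelope regardless of what sits behind it.
            params = request.get("params", {})
            if params.get("name") != self.TOOL["name"]:
                return self._error(request, -32602, "Unknown tool")
            user_id = params["arguments"]["user_id"]
            content = [{"type": "text", "text": f"user {user_id}: active"}]
            return self._ok(request, {"content": content, "isError": False})
        return self._error(request, -32601, "Method not found")

    @staticmethod
    def _ok(request: dict, result: dict) -> dict:
        return {"jsonrpc": "2.0", "id": request["id"], "result": result}

    @staticmethod
    def _error(request: dict, code: int, message: str) -> dict:
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": code, "message": message}}

server = FakeServer()
listing = server.handle(rpc_request("tools/list"))
call = server.handle(rpc_request(
    "tools/call", {"name": "lookup_user", "arguments": {"user_id": "u-42"}}))
```

Note what is absent from both exchanges: nothing in either message says who is asking or whether they are allowed to.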

Who's behind it: Anthropic created it. In December 2025, Anthropic donated MCP to the Agentic AI Foundation (AAIF), a directed fund under the Linux Foundation, co-founded by Anthropic, Block, and OpenAI. No single vendor owns the protocol anymore. That matters for public sector conversations about vendor lock-in.

What makes it distinct: MCP solves connectivity. Here, connectivity means discovery, transport, invocation, and result formatting. Deciding which tools to use, in what order, or whether using them is appropriate for the task at hand belongs to a different layer entirely. MCP will faithfully connect your agent to every tool on the network. It has no opinion about whether that's a good idea. Most of the security problems now accumulating around MCP trace back to this exact boundary.

Skills

What it is: An open specification that packages procedural knowledge into a standard format agents can discover and load dynamically.

What it does: A Skill is a folder containing a SKILL.md file: a structured markdown document with a metadata header (name and description at minimum) and a body of task-specific instructions that tell an agent how to perform a particular job. At startup, the agent loads only each Skill's name and description into its context, roughly 30–50 tokens per Skill. When the agent encounters a relevant task, it loads the full instructions. Skills can bundle scripts, reference materials, templates, and workflow definitions alongside the core file. The median Skill body runs about 1,400 tokens, small enough to coexist with tool schemas and planning context without crowding the window.
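The two-stage loading described above is easy to sketch. This is an illustration under stated assumptions, not a reference loader: it assumes the metadata header is a block of simple `key: value` lines between two `---` lines, and the function names (`split_skill`, `index_skills`, `load_skill_body`) are hypothetical.

```python
from pathlib import Path

def split_skill(text: str) -> tuple:
    """Split SKILL.md text into (metadata dict, body string).

    Assumes a '---' delimited header of simple 'key: value' lines.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing metadata header")
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        raise ValueError("unterminated metadata header")
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body

def index_skills(root: Path) -> dict:
    """Startup pass: keep only each Skill's name and description."""
    index = {}
    for skill_file in sorted(root.glob("*/SKILL.md")):
        meta, _ = split_skill(skill_file.read_text(encoding="utf-8"))
        index[meta["name"]] = {
            "description": meta["description"],
            "path": skill_file,
        }
    return index

def load_skill_body(index: dict, name: str) -> str:
    """On-demand pass: read the full instructions for one Skill."""
    text = index[name]["path"].read_text(encoding="utf-8")
    _, body = split_skill(text)
    return body
```

An agent runtime built this way would put only the `index_skills` output in context at startup and call `load_skill_body` once a task matches a description.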

Who's behind it: Anthropic released the spec in December 2025. Within 48 hours, Microsoft integrated it into VS Code via Copilot and OpenAI added support to ChatGPT and Codex CLI. By March 2026, 32 tools from competing companies had adopted the same SKILL.md format, including Google's Gemini CLI, JetBrains' Junie, and AWS's Kiro.

What makes it distinct: Skills are the procedural knowledge layer. They encode what a task involves, what order to do things in, what tools to reach for, and what conditions to check before acting. Where function calling gives the model a voice and MCP gives it a phone line, Skills give it a playbook. The playbook can be as simple as a checklist or as detailed as a multi-step workflow with branching logic. The agent loads the relevant playbook when it recognizes the task, then follows it.

IDAM Anchor: Function Calling and OAuth Scopes

Defining available functions for a model works like defining OAuth scopes for an application: you declare what the caller can request, and the boundary is set at definition time. Where your OAuth intuition stops helping: scopes are enforced by the authorization server, while function definitions are suggestions to a probabilistic model. The model almost always respects them. Almost. That gap between "enforced" and "almost always respected" is where the security conversation lives.

Three Dimensions That Matter in a Buyer Conversation

The structure here is trait-led analysis across three dimensions: where judgment lives, what the security surface looks like, and how durable each layer is likely to be. I chose this over scenario mapping or clustering because these three traits map directly to the questions a buyer's architecture and security teams will raise, even if they phrase them differently. Each subject appears against each dimension.

Where Judgment Lives

Function calling contains no judgment. The model decides whether to call a function based on the prompt and its training. The function definition constrains what's callable but not when or why.

MCP contains no judgment either. The protocol connects tools and transports requests. An MCP server exposes capabilities. Whether the agent should use those capabilities at a given moment is someone else's problem.

Skills are where judgment currently sits. A Skill encodes procedural knowledge: when to use a tool, in what sequence, with what preconditions. A Skill can instruct an agent to "always verify with the user before calling this API" or "only use the production database tool after confirming the environment." These are soft controls, though. The agent follows the Skill the way a capable junior follows a runbook: usually correctly, not guaranteed. The instructions are read by the model, not enforced by the platform. Nothing in the Skills spec prevents the agent from deviating at the API level.
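The difference between a soft control and a hard one shows up clearly in code. The sketch below is a hypothetical application-side gate, not part of the Skills spec or any shipping product: the `REQUIRES_CONFIRMATION` table, the `guarded_call` function, and the `confirm` callback are all assumptions. The same "verify with the user first" rule that a Skill can only request is enforced here in the dispatch path, where the model cannot skip it.

```python
from typing import Callable

# Hypothetical policy table owned by the application, not by Skill text.
REQUIRES_CONFIRMATION = {"delete_record"}

class ConfirmationRequired(Exception):
    """Raised when a gated tool is called without user approval."""

def guarded_call(name: str, args: dict, tools: dict,
                 confirm: Callable[[str, dict], bool]):
    """Run a tool, asking `confirm` first if the tool is gated."""
    if name in REQUIRES_CONFIRMATION and not confirm(name, args):
        raise ConfirmationRequired(f"{name} blocked: user did not approve")
    return tools[name](**args)

# Stand-in tools for illustration.
TOOLS = {
    "read_record": lambda record_id: f"read {record_id}",
    "delete_record": lambda record_id: f"deleted {record_id}",
}
```

A gate like this covers only the calls that pass through it. It says nothing about adversarial input that steers the agent toward tools that were never gated.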

That space between instructional judgment and enforced judgment is exactly where the next wave of governance tooling will land. As of early 2026, no shipping product enforces tool-use policy at the agent layer in a way that survives adversarial input. That's the current state.

Security Surface

Function calling has a narrow surface. The primary risk is prompt injection: an attacker manipulates the model's input to trigger function calls the user didn't intend. The attack requires access to the prompt or the content the model processes. Mitigation is application-level: input validation, output filtering, limiting which functions are available per session.
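As one example of the application-level mitigation mentioned above, here is a sketch of argument validation applied to a model-emitted call before anything executes. The `DECLARED` table and its simplified schema shape are assumptions for illustration; a production system would more likely use a full JSON Schema validator.

```python
# Hypothetical declarations in a simplified schema shape.
DECLARED = {
    "get_weather": {
        "properties": {"city": "string", "days": "integer"},
        "required": ["city"],
    }
}

TYPE_MAP = {"string": str, "integer": int, "boolean": bool}

def _matches(value, type_name: str) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    if type_name == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, TYPE_MAP[type_name])

def validate_call(call: dict, declared: dict) -> list:
    """Return a list of problems; an empty list means the call may run."""
    name = call.get("name")
    if name not in declared:
        return [f"undeclared function: {name}"]
    schema = declared[name]
    args = call.get("arguments", {})
    problems = []
    for key in schema["required"]:
        if key not in args:
            problems.append(f"missing argument: {key}")
    for key, value in args.items():
        expected = schema["properties"].get(key)
        if expected is None:
            problems.append(f"unexpected argument: {key}")
        elif not _matches(value, expected):
            problems.append(f"wrong type for {key}: expected {expected}")
    return problems
```

Validation of this kind catches malformed or out-of-contract calls. It does not catch a well-formed call the user never intended, which is the harder half of the prompt injection problem.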

MCP has a wide and growing surface. Thirty CVEs were filed in January and February 2026 alone, ranging from path traversals to a CVSS 9.6 remote code execution flaw. The root causes are not exotic: missing input validation, absent authentication, blind trust in tool descriptions. Tool poisoning is a demonstrated attack class where malicious instructions injected into MCP tool descriptions hijack agent behavior. The WhatsApp MCP attack in April 2025 exfiltrated chat histories this way, with no code exploit required. OX Security disclosed a systemic vulnerability in Anthropic's MCP SDK affecting implementations across Python, TypeScript, Java, and Rust. Anthropic declined to modify the protocol's architecture, calling the behavior "expected." Independent scans found 38% of surveyed MCP servers running without authentication. The spec recommends OAuth 2.1 for remote servers, but authorization remains optional in the stable spec. The draft spec tightens this significantly, making RFC 9728 Protected Resource Metadata mandatory rather than recommended. That draft has not shipped as a stable release.

Skills have a social-engineering surface. A malicious Skill can direct an agent to invoke tools or execute code in ways that don't match its stated purpose. The official documentation warns that untrusted Skills should be audited before use, because a Skill with access to tools can exfiltrate data or escalate privileges through instruction alone. The attack vector is the Skill author, not the protocol. A supply-chain problem, not a wire-protocol problem.

IDAM Anchor: MCP Servers and Federation Connectors

An MCP server looks like a SCIM connector or a federation integration point: it standardizes how systems talk to each other. Where your connector intuition breaks: a SCIM connector operates under an enterprise's authorization policy. An MCP server, today, often operates under whatever authentication the developer remembered to configure. Thirty-eight percent of surveyed servers have no auth at all. Having a connector and having authorization are two different things. Your buyer's security team will find that gap first.

Durability

Function calling is durable in concept, fragile in implementation. The idea that models should output structured tool requests is settled. The specific JSON formats each provider uses are not. OpenAI, Anthropic, and Google all ship slightly different schemas, and code written for one doesn't run on another without adaptation. That fragmentation creates practical brittleness: teams building multi-model systems spend real engineering time on format translation, which is exactly what MCP was built to abstract away.
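The format translation work looks roughly like this. The sketch takes one provider-neutral definition (the `NEUTRAL_TOOL` shape is an assumption) and emits the tool shapes used by OpenAI's Chat Completions API and Anthropic's Messages API as generally documented at the time of writing. Treat the exact field names as something to confirm against each provider's current reference, since these formats change.

```python
# Hypothetical provider-neutral definition; "schema" is a JSON Schema object.
NEUTRAL_TOOL = {
    "name": "get_weather",
    "description": "Return the current temperature for a city.",
    "schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def to_openai_tool(tool: dict) -> dict:
    """Wrap the definition in the nested 'function' object OpenAI expects."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["schema"],
        },
    }

def to_anthropic_tool(tool: dict) -> dict:
    """Flat shape, with the schema under 'input_schema'."""
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["schema"],
    }

openai_tool = to_openai_tool(NEUTRAL_TOOL)
anthropic_tool = to_anthropic_tool(NEUTRAL_TOOL)
```

The same information lands in two different places under two different keys. Multiply that by every tool, every provider, and the response-parsing side, and the engineering cost becomes visible.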

MCP is at a critical juncture. Protocols that solve connectivity without governance tend to follow a recognizable historical pattern. XML-RPC gave way to SOAP, which gave way to REST. ChatGPT plugins launched in March 2023 and are already deprecated. The pattern: early connectivity standards that fail to grow a governance layer tend to get replaced by something that ships with one. MCP's donation to the Linux Foundation was an explicit move toward governance. The 2026 roadmap targets transport scalability, agent-to-agent communication, and enterprise readiness. Whether the governance matures faster than the security debt accumulates is genuinely uncertain. Anyone who tells you they know is selling something.

Skills are early. The spec is deliberately minimal. Adoption velocity is remarkable (32 implementations in three months), but the format is under-specified in areas like versioning and capability negotiation. Skills' durability depends on whether the "folder with a markdown file" pattern proves sufficient for enterprise use or whether organizations need something more structured. The bet: simplicity drives adoption, and adoption creates durability. That bet has worked before. It has also failed before.

IDAM Anchor: The Catalog-Entitlement Gap

MCP tool discovery lets an agent see what's available. Skills tell the agent when to use what it sees. Neither enforces authorization. In IDAM terms, this is the difference between an application appearing in a catalog and a user having an entitlement to use it. Your public sector buyers will recognize this gap immediately. It's the gap their compliance frameworks exist to close.
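In code, the catalog-entitlement gap is the difference between two lookups. The sketch below is purely illustrative: the `CATALOG` list, the `GRANTS` table, and both function names are hypothetical, and neither MCP nor Skills provides anything like them today. It shows the check an identity layer would have to add, once when deciding what the agent sees and again when a call actually arrives.

```python
# What discovery returns: everything the server exposes (the catalog).
CATALOG = [
    {"name": "read_case_file"},
    {"name": "update_case_file"},
    {"name": "purge_case_file"},
]

# Hypothetical entitlement store: which principal may use which tool.
GRANTS = {
    "analyst@agency.example": {"read_case_file"},
    "supervisor@agency.example": {"read_case_file", "update_case_file"},
}

def visible_tools(principal: str, catalog: list, grants: dict) -> list:
    """Filter the catalog down to what this principal is entitled to."""
    allowed = grants.get(principal, set())
    return [tool for tool in catalog if tool["name"] in allowed]

def is_entitled(principal: str, tool_name: str, grants: dict) -> bool:
    """Call-time check; an unknown principal is entitled to nothing."""
    return tool_name in grants.get(principal, set())
```

Filtering the view is not enough on its own. The call-time check matters because an agent can still attempt a tool name it was never shown.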

How to Say This in the Field

Don't say	Do say	Why it matters
"MCP is how AI agents call tools."	"MCP is the transport layer. It handles how the call travels, not whether the call should happen."	Buyers conflate connectivity with control; separating them shows you understand the architecture.
"Function calling and MCP are competing standards."	"Function calling is how the model formats a request. MCP is how that request reaches the tool. They're different layers of the same stack."	Framing them as layers matches how architects actually think about this.
"Skills are like plugins for AI agents."	"Skills package procedural knowledge. They tell the agent what a task involves and when to use which tools. Think of them as runbooks the agent loads on demand."	"Plugin" implies code execution; Skills are instructions, and the distinction matters for security conversations.
"MCP is secure because it supports OAuth."	"The MCP spec recommends OAuth 2.1 for remote servers, but auth is optional in the current stable spec and a lot of deployed servers don't implement it."	Overstating MCP's security posture will get corrected by the buyer's security team, and you lose the room.
"Anthropic controls MCP."	"Anthropic created MCP and donated it to the Linux Foundation in late 2025. It's governed by the same foundation that manages the A2A protocol."	Governance neutrality matters to public sector buyers evaluating vendor lock-in risk.
"Skills solve the security problem."	"Skills add a judgment layer, but it's instructional, not enforced. The agent follows the Skill the way someone follows a runbook. Usually correctly."	Honest framing of soft controls builds more trust than overclaiming.
"MCP is the future of AI integration."	"MCP has the adoption momentum. Whether it matures its governance faster than its security debt grows is the open question right now."	Shows you're tracking the real risk, not just the hype cycle.
"You need all three of these."	"These are layers, not choices. Function calling is in the model. MCP is in the transport. Skills are in the orchestration. The question is which layers your environment has governance around."	Reframes from product selection to governance posture, which is where the identity conversation starts.
"Tool poisoning is a theoretical risk."	"Tool poisoning is a demonstrated attack. Malicious instructions in MCP tool descriptions have been used to exfiltrate data without any code exploit."	"Theoretical" will get you corrected by anyone who's read the CVE reports.
"We should wait for MCP to mature."	"MCP is already in production across the ecosystem. The question is what governance you wrap around it."	Dismissing MCP loses credibility with buyers who are already using it.

The through-line: these three layers solve connectivity at different altitudes. Governance lives above all three, and nobody's shipped a durable answer for it yet. That's the design space your next conversation should be about.

Things to follow up on...

MCP's draft authorization spec: The stable MCP spec (2025-11-25) makes OAuth optional for remote servers, but the draft spec upgrades RFC 9728 Protected Resource Metadata to mandatory, which would close the biggest gap discussed in this piece if it ships.
OWASP MCP Top 10: OWASP has released a beta framework mapping the ten most critical MCP security risks, and every item on the list already has at least one confirmed CVE or documented exploit in the wild.
Skills token economics at scale: A Bosch Research and Carnegie Mellon study analyzing over 40,000 publicly listed Skills found the median Skill body is 1,414 tokens, which matters because tool definitions alone can consume 55K tokens across five MCP servers before a single Skill loads.
OX Security's architectural RCE disclosure: OX Security found a systemic vulnerability baked into Anthropic's MCP SDK across every supported language, affecting over 7,000 publicly accessible servers, and Anthropic has declined to patch it at the protocol level.