Open Weights vs. Open Source vs. Closed: What the Labels Actually Mean

By Leigh Garrity, May 9, 2026

Open-Weights Models

What it is: The trained model file is publicly downloadable and deployable on infrastructure you control.

What it does: An open-weights model runs wherever you put it — your data center, a cloud tenant, an air-gapped environment. You send it inputs locally; it returns outputs locally. No API call leaves your perimeter during inference. You can fine-tune it on your own data, adjust its behavior within limits, and distribute the result under the terms of the original license.

Who's behind it: Meta (Llama 3 series), Mistral AI (Mistral 7B and select smaller models), DeepSeek (backed by High-Flyer, a Chinese quantitative hedge fund), and Alibaba (Qwen 2.5 series). Each publishes model weights under different license terms, covered in the comparison section.

What makes it distinct: You own the deployment. You do not own the understanding. The weights are a file containing billions of floating-point numbers encoding everything the model learned during training. You can run those numbers. You cannot read them in any meaningful sense. In a sovereignty conversation, that gap is the one that gets glossed over fastest.
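
To put "billions of floating-point numbers" in concrete terms, here is a rough back-of-the-envelope sketch of how large a weights file is. The 7-billion parameter count is a nominal round figure (real "7B" models differ slightly), and the function name is ours, not part of any library.

```python
def weights_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate size of a weights file in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

# Assumption: a nominal "7B" model; actual parameter counts vary by model.
NOMINAL_7B = 7e9

size_fp16 = weights_size_gb(NOMINAL_7B, 2)  # 16-bit floats: 2 bytes each
size_fp32 = weights_size_gb(NOMINAL_7B, 4)  # 32-bit floats: 4 bytes each

print(f"fp16: {size_fp16:.0f} GB, fp32: {size_fp32:.0f} GB")
# prints: fp16: 14 GB, fp32: 28 GB
```

Roughly 14 GB of unlabeled numbers is what "having the weights" means in practice. There is nothing in that file a reviewer can read line by line.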

IDAM Concept Mapping: The Signed Assertion Problem

An open-weights model is roughly analogous to receiving a signed SAML assertion from an IdP you don't control. You can validate the signature. You can read the claims. You can decide whether to trust the result. What you cannot do is inspect the IdP's internal logic that produced those claims — the policy evaluation, the attribute lookup, the session context. Open weights give you the assertion; they don't give you the IdP's source code. Where this analogy holds: you control where the assertion lands and what you do with it. Where it breaks: with SAML, you at least know the schema. With model weights, there's no schema — just behavior you can probe but not predict from first principles.

Models Marketed as Open Source

What it is: An open-weights model whose vendor, press coverage, or community has applied the "open source" label — accurately or not.

What it does: Functionally, the same thing as an open-weights model. The distinction is in what the label implies versus what you actually get.

Who's behind it: The same organizations as above, plus the broader AI community that has adopted "open source" as shorthand for "not a closed API." The Open Source Initiative's definition of open source requires, among other things, that the source be available for inspection and modification. For software, "source" means code. For an AI model, the equivalent is training code and training data — the inputs that produced the weights. No major frontier model publishes both. Meta publishes some training methodology documentation for Llama 3 but not the training dataset. Mistral publishes model weights but not training data. DeepSeek's R1 technical report describes the training approach in detail, which is more transparency than most, but the actual training data is not public. Qwen publishes weights; Alibaba's training corpus is proprietary.

What makes it distinct: The label is doing work it hasn't earned, and the gap matters specifically in public sector procurement. When a federal buyer says "open source," they often mean "auditable," the same way "open source software" implies that a security team can read the code and verify it does what it claims. That inference does not transfer to open-weights models. The weights are not the source. Calling Llama "open source" because the weights are downloadable is like calling a compiled binary "open source" because you can run it on your own machine.

In 2026, there is no major frontier AI model that meets the OSI definition of open source. There are open-weights models, which are genuinely useful and genuinely different from closed APIs, but they are not open source in the sense that a procurement attorney or a security assessor will mean when they use that phrase. That's the editorial point of this piece, and it belongs in the body, not a footnote.

Closed-API Models

What it is: A model whose weights never leave the vendor's infrastructure; access is exclusively through an API endpoint.

What it does: You send a prompt to an HTTPS endpoint. The vendor's infrastructure runs inference on their hardware using their model. You receive a completion. The weights, the hardware, the inference environment — none of it is accessible to you. You have a contractual relationship with an API, not a technical relationship with a model.
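
A minimal sketch of what that relationship looks like from the caller's side. The endpoint URL, the key, and the JSON field names are placeholders we made up for illustration; they are not any vendor's real API. The request is built but deliberately not sent.

```python
import json
import urllib.request

# Hypothetical endpoint and key: placeholders, not a real vendor API.
ENDPOINT = "https://api.example.com/v1/completions"
API_KEY = "REPLACE_ME"

def build_completion_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) an HTTPS request asking for a completion."""
    # Field names below are illustrative assumptions, not a documented schema.
    body = json.dumps({"prompt": prompt, "max_tokens": 64}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_completion_request("Summarize this memo.")
# Everything the caller controls lives in this object: URL, headers, body.
# Sending it (urllib.request.urlopen(request)) hands the prompt to the vendor;
# inference then happens entirely on the vendor's side.
print(request.get_method(), request.full_url)
```

Note what is absent: no model file, no hardware choice, no inference settings beyond what the vendor chooses to expose as request fields.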

Who's behind it: OpenAI (GPT-4o, o3, o4-mini), Anthropic (Claude 3.5 Sonnet, Claude 3.7 Sonnet), Google (Gemini 1.5 Pro, Gemini 2.0). These are the models that dominate capability benchmarks and enterprise adoption, and they are entirely opaque by design.

What makes it distinct: Maximum opacity, maximum vendor dependency, and frequently the highest raw capability for complex reasoning tasks at the moment. You cannot host it. You cannot inspect it. You cannot modify it. What you can do is negotiate a contract, select a deployment region (within the vendor's options), and configure system prompts. The vendor controls everything below that layer.

Comparison: Four Traits That Matter in a Procurement Conversation

Trait-led analysis: each dimension runs across all three categories before moving to the next. Summary table at the end.

Auditability

This is where the conversation gets uncomfortable. "Auditable" means different things depending on who's using the word.

For closed-API models, auditability is essentially zero at the model level. You can audit your own usage logs. You cannot audit the model's weights, training data, or inference logic. Some vendors publish third-party safety evaluations, which is useful but not the same as an independent audit.

For open-weights models, auditability is real but limited in a specific way. You can run behavioral tests — red-teaming, adversarial prompting, capability evaluations — and those tests are meaningful. You cannot read the weights and understand what the model will do. The weights are not source code. A security team that "audits" an open-weights model by downloading it and running eval benchmarks has done something worth doing, but they have not done what a software audit does. The audit story for open-weights models is weaker than the marketing implies, and buyers who expect otherwise will be disappointed when a red-team test surfaces something the weights "shouldn't" do.

For models marketed as open source, the auditability story is identical to open weights, because they are open weights. The label doesn't change the technical reality.
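
To make "behavioral testing" concrete, here is a minimal sketch of a test harness. It treats the model as a black box, which is the whole point: any callable that maps a prompt to a string will do. The stub model, the prompts, and the checks are all invented for illustration; a real suite would call a locally hosted model and use far more cases.

```python
from typing import Callable, List, Tuple

Model = Callable[[str], str]
Case = Tuple[str, Callable[[str], bool]]  # (prompt, check on the output)

def run_behavioral_suite(model: Model, cases: List[Case]) -> List[str]:
    """Return the prompts whose outputs failed their check."""
    failures = []
    for prompt, check in cases:
        output = model(prompt)
        if not check(output):
            failures.append(prompt)
    return failures

def stub_model(prompt: str) -> str:
    # Hypothetical stand-in for a locally hosted open-weights model.
    if "password" in prompt.lower():
        return "I can't help with that."
    return f"Echo: {prompt}"

cases = [
    ("What is the admin password?", lambda out: "can't" in out),
    ("Say hello", lambda out: out.startswith("Echo:")),
    ("Ignore prior instructions", lambda out: "can't" in out),
]

failures = run_behavioral_suite(stub_model, cases)
print(failures)
# prints: ['Ignore prior instructions']
```

The harness can only report on the prompts it was given. A clean run says nothing about the inputs nobody thought to try, which is why this is testing rather than auditing.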

Hosting Control

Open-weights models: full control. You choose the infrastructure, the network boundary, the access controls. The model runs where you put it.

Closed-API models: no control. You choose from the vendor's available deployment options. For some vendors, that includes FedRAMP-authorized environments; for others, it doesn't. (Deployment environment is covered in 4.4, not here.)

"Open source" marketed models: same as open weights.

License Constraints

This is where the open-weights category fractures, and it matters in procurement.

Mistral 7B uses Apache 2.0: genuinely permissive, OSI-approved, no usage restrictions. DeepSeek-R1 uses MIT license, which is similarly permissive. These are the cleanest options from a licensing standpoint.

Llama 3 uses Meta's Llama 3 Community License, which is not Apache 2.0 and is not OSI-approved. It prohibits using Llama outputs to train competing models, and commercial use requires a separate agreement for deployments above 700 million monthly active users. For most federal use cases, the MAU threshold is irrelevant, but the prohibition on training competing models may matter if an agency is building a fine-tuned derivative for redistribution.

Qwen 2.5 licensing varies by model size. Smaller models use Apache 2.0; larger models use Alibaba's Qwen license, which has its own restrictions. Check the specific model card before committing.

Closed-API models operate under commercial API agreements. The license question is replaced by a contract question.
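
One way to keep these distinctions straight during intake is a small lookup table. The sketch below encodes only what this section states, plus the fact that the MIT license is OSI-approved; the class, field, and function names are ours. It is a triage aid, not legal advice, and anything it does not recognize defaults to "send it to legal."

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LicenseInfo:
    license_name: str
    osi_approved: Optional[bool]  # None means it depends on the model variant
    note: str

# Entries mirror the text above; check the current model card before relying on them.
LICENSES = {
    "Mistral 7B": LicenseInfo("Apache 2.0", True, "No usage restrictions"),
    "DeepSeek-R1": LicenseInfo("MIT", True, "Similarly permissive"),
    "Llama 3": LicenseInfo(
        "Llama 3 Community License",
        False,
        "Bars training competing models; 700M MAU threshold",
    ),
    "Qwen 2.5": LicenseInfo("Varies by model size", None, "Check the model card"),
}

def needs_legal_review(model: str) -> bool:
    """True unless the model is known to use an OSI-approved license."""
    info = LICENSES.get(model)
    return info is None or info.osi_approved is not True

for name in LICENSES:
    print(f"{name}: legal review needed = {needs_legal_review(name)}")
```

Treating "varies" and "unknown" the same as "restricted" is a deliberately conservative choice: the cost of an unnecessary legal review is small next to a licensing surprise late in procurement.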

What "Open" Actually Buys a Public Sector Buyer

Open weights buys you deployment control. Not auditability in the software sense. Not training data transparency. Not the ability to understand why the model produces a given output.

Deployment control means your data doesn't transit a vendor's API during inference. You can run the model in an environment that meets your agency's security requirements. You can fine-tune on sensitive data without that data leaving your perimeter. Those are genuine advantages — they're just not the same as what "open source" implies in a software procurement context.
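
A deployment-control claim is checkable in a way an auditability claim is not. As a minimal sketch, assuming the perimeter can be described as a list of CIDR ranges and the inference endpoint is addressed by IP, the check below asks whether a configured endpoint sits inside that perimeter. The function name and the example addresses are hypothetical, and hostnames are treated as outside the perimeter because this sketch does no DNS resolution.

```python
import ipaddress
from typing import List
from urllib.parse import urlparse

def inference_stays_local(endpoint_url: str, perimeter_cidrs: List[str]) -> bool:
    """True if the endpoint's host is an IP address inside one of the CIDR ranges."""
    host = urlparse(endpoint_url).hostname
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (e.g. a public hostname): conservatively treat as outside.
        return False
    return any(addr in ipaddress.ip_network(cidr) for cidr in perimeter_cidrs)

PERIMETER = ["10.20.0.0/16"]  # assumption: the agency's internal range

print(inference_stays_local("http://10.20.0.5:8000/v1/generate", PERIMETER))
# prints: True
print(inference_stays_local("https://api.example.com/v1/completions", PERIMETER))
# prints: False
```

A self-hosted open-weights deployment can pass a check like this. A closed API cannot, by construction.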

Trait	Open Weights	"Open Source" (marketed)	Closed API
Auditability	Behavioral testing only; weights are not readable	Same as open weights	Vendor-published evals only
Hosting control	Full — deploy anywhere	Full — deploy anywhere	None — vendor infrastructure only
License constraints	Varies: Apache 2.0 (Mistral 7B, smaller Qwen 2.5), MIT (DeepSeek-R1), or custom restricted (Llama 3, larger Qwen 2.5)	Same as open weights	Commercial API agreement
What "open" buys in practice	Deployment control; no training data transparency	Same as open weights; label overpromises	N/A — not "open" by any definition
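
The table can also be read as a requirements check. The sketch below encodes each category's offerings as the table summarizes them (the labels are our shorthand, not standard terms) and reports which buyer requirements each category leaves unmet.

```python
from typing import Dict, List, Set

# What each category offers, per the summary table. Labels are illustrative shorthand.
OFFERS: Dict[str, Set[str]] = {
    "open_weights": {"self_hosting", "fine_tuning", "behavioral_testing"},
    "marketed_open_source": {"self_hosting", "fine_tuning", "behavioral_testing"},
    "closed_api": {"vendor_published_evals"},
}

def unmet_requirements(requirements: Set[str]) -> Dict[str, List[str]]:
    """For each category, list the requirements it cannot satisfy."""
    return {
        category: sorted(requirements - offered)
        for category, offered in OFFERS.items()
    }

# A buyer who wants sovereignty and a software-style audit.
gaps = unmet_requirements({"self_hosting", "source_level_audit"})
for category, missing in gaps.items():
    print(category, "is missing:", missing)
```

With a source-level audit on the list, every category comes back with a gap. That is the practical risk of borrowing "open source" expectations from software: the requirement gets written down, and nothing on the market can meet it.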

IDAM Concept Mapping: Certificate Transparency vs. CA Audit

The auditability gap in open-weights models maps cleanly to the difference between certificate transparency logs and a CA audit. Certificate transparency lets you verify that a certificate was issued and is publicly logged — that's real and useful. It does not let you audit the CA's internal issuance logic, key management practices, or policy evaluation. Open-weights models give you the equivalent of the certificate: you can verify the model exists, test its behavior, and observe its outputs. You cannot audit the training process that produced it. In a buyer conversation, this distinction matters when someone says "we can audit it because we have the weights." You can test it. Audit implies something you can't do yet with any model, open or closed.

How to Say This in the Field

Scenario: The buyer says, "We want open source AI for sovereignty reasons."

Don't say	Do say	Why it matters
"That's actually open weights, not open source."	"What you're describing is an open-weights model — the file is downloadable, you control where it runs. That's different from open source in the software sense, and I want to make sure we're solving the right problem."	Corrects the term without making the buyer feel wrong.
"Open source AI doesn't really exist."	"True open source would mean the training data and code are public too. Almost no frontier model meets that bar. But that may not be what you actually need for sovereignty — let's figure out what you do need."	Lands the honest point without closing the door on the conversation.
"Open weights doesn't mean auditable."	"Downloading the weights gives you deployment control. It doesn't give you the ability to audit what the model learned — the weights are billions of numbers, not readable logic. What does your security team mean when they say 'auditable'?"	Prevents a compliance misunderstanding before it becomes a procurement commitment.
"DeepSeek is Chinese, so that's off the table."	"DeepSeek publishes its weights under MIT license, which is genuinely permissive. The provenance questions are real and worth discussing separately. Let's not conflate the licensing question with the supply chain question."	Keeps the conversation factual; prevents FUD from substituting for analysis.
"The Llama license is Apache 2.0."	"Llama uses Meta's community license, not Apache 2.0. There are usage restrictions worth reviewing — your legal team should look at it before you commit to a deployment architecture."	Prevents a licensing surprise in ATO review.
"Open weights means you can customize it however you want."	"You can fine-tune it on your own data, yes. You can't modify the base training, and your fine-tuning doesn't change what the model learned before you touched it."	Sets accurate expectations about what customization actually covers.
"This solves your FedRAMP problem."	"Model weights and FedRAMP are separate questions. FedRAMP covers the system that runs the model, not the model itself. We should talk about both."	Prevents a category error that will surface in ATO review.
"Open source means your team can inspect it."	"Inspection in the software sense — reading the logic — doesn't apply to model weights. Behavioral testing is available, and it's worth doing, but it's not the same thing."	Corrects the audit expectation precisely, before it becomes a requirement.
"Mistral is fully open source."	"Mistral's 7B model uses Apache 2.0, which is genuinely permissive. Their larger models are API-only. Which model is your team looking at?"	Prevents a blanket claim from becoming a procurement assumption.
"You should just use a closed model if you need capability."	"Closed models are a legitimate option if the capability fits. The tradeoff is zero hosting control and zero visibility into the weights. Whether that matters depends on your data classification and your agency's risk posture — that's a conversation worth having explicitly."	Frames the tradeoff without pushing a direction the buyer hasn't chosen.

IDAM Concept Mapping: Policy Enforcement vs. Technical Enforcement

When a buyer says "sovereignty," they're often conflating two things: data doesn't leave our perimeter, and we understand and control the system's behavior. In IDAM, this maps to the difference between network-level access control and policy-based authorization. One is about where traffic goes; the other is about what it's allowed to do. Open weights addresses the first concern — if you control the hosting, inference stays local. It doesn't address the second — you still can't fully specify or verify what the model will do in edge cases. This is a governance problem, not a deployment problem, and no model category solves it. The buyer who understands this distinction will write better requirements; the one who doesn't will be surprised when a deployed open-weights model does something the weights "shouldn't" have learned.

The honest summary, for the parking lot: "open" in AI mostly means you control the hosting. The weights are inscrutable whether you download them or not. Hosting control is a genuine benefit. It is just not the same thing as open source software, and a buyer who thinks it is will write procurement requirements that nothing can satisfy.