A model card is a structured disclosure artifact that documents a machine learning model's intended use, performance characteristics, evaluation results, and known limitations. A system card extends that scope to multi-component AI deployments — documenting how models, retrieval systems, and orchestration layers interact. A dataset datasheet documents the characteristics of training and evaluation data. All three are increasingly specified by regulation, not invented by vendors. None of them are marketing documents, and treating them as such is the fastest way to fail an audit conversation with a compliance team that has read the actual requirements.
This genre exists because regulators recognized something the enterprise compliance world is still absorbing: a system that generates outputs probabilistically requires different documentation than a system that processes data deterministically. The frameworks built for the second kind don't capture what you need to know about the first.
What Three Frameworks Require
The EU AI Act, NIST AI RMF, and ISO/IEC 42001 converge on a core set of documentation obligations while diverging significantly in legal weight and specificity.
The EU AI Act (Regulation (EU) 2024/1689, published in the Official Journal of the European Union in July 2024 and in force since August 2024) requires technical documentation for high-risk AI systems under Article 11 and Annex IV. That documentation must include the system's intended purpose, the design logic and development process, training methodologies, evaluation datasets and metrics, performance across relevant subgroups, known limitations, and a post-market monitoring plan. For high-risk systems, these are prerequisites for CE marking and market access, not aspirational targets. Implementation timelines for high-risk system requirements are phasing in through 2026 and 2027; verify current standing against the official text before citing specific dates in a compliance conversation.
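As a rough illustration, the documentation items listed above can be tracked as a simple completeness checklist. This is a minimal sketch, not a rendering of Annex IV: the class name `TechnicalDocumentation`, its field names, and the helper `missing_sections` are hypothetical and only paraphrase the list in the preceding paragraph.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class TechnicalDocumentation:
    """Hypothetical checklist; field names paraphrase the paragraph above, not the legal text."""
    intended_purpose: str = ""
    design_and_development_process: str = ""
    training_methodology: str = ""
    evaluation_datasets_and_metrics: str = ""
    subgroup_performance: str = ""
    known_limitations: str = ""
    post_market_monitoring_plan: str = ""


def missing_sections(doc: TechnicalDocumentation) -> List[str]:
    """Return the names of sections that are empty or whitespace-only."""
    return [f.name for f in fields(doc) if not getattr(doc, f.name).strip()]


# Example: a draft with only two sections filled in.
draft = TechnicalDocumentation(
    intended_purpose="Triage of inbound support tickets",
    known_limitations="Not evaluated on non-English tickets",
)
print(missing_sections(draft))
```

A check like this only shows that a section has some text in it. Whether that text would satisfy an assessor is a separate question that no field-presence test can answer.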
NIST AI RMF 1.0 (January 2023) is voluntary guidance, not binding regulation. Its MEASURE function is where documentation obligations concentrate: evaluation results, uncertainty quantification, performance metrics across demographic groups, and ongoing monitoring records. GOVERN covers accountability structures and documentation of who made which decisions about the AI system. The AI RMF Playbook provides specific suggested practices, but "suggested" is doing real work in that sentence. No enforcement mechanism exists at the federal level for private sector AI systems absent sector-specific regulation.
ISO/IEC 42001 (December 2023) is a certifiable management system standard, the AI counterpart to ISO/IEC 27001. It requires a documented AI policy, AI system impact assessments, records of AI-related risks and controls, and evidence of continual monitoring. Organizations seeking certification must demonstrate that their AI management system is operational, not just documented. Third-party certification bodies are still developing audit competency for this standard, and the certification market is at an early stage.
All three frameworks share a core: evaluation evidence, performance metrics, and post-deployment monitoring records. The divergence is in enforcement. The EU AI Act is prescriptive and legally binding for covered systems; NIST AI RMF is guidance that federal agencies are increasingly expected to follow under OMB M-24-10; ISO/IEC 42001 is a voluntary certifiable standard that some procurement processes are beginning to require as a qualifier.
The Structural Gap
The SOC 2 analogy, which your compliance team will reach for immediately, breaks at a specific point worth naming.
SOC 2 was designed for deterministic systems. A control either exists or it doesn't. Access to production requires MFA — evidence is a configuration screenshot and an access log showing MFA events. The auditor's question is: was the control in place? The evidence architecture answers that question cleanly because the system behaves the same way every time the control is applied correctly.
An AI system configured identically can produce different outputs for the same input. Explaining any single output therefore takes answers that a configuration snapshot cannot supply. Which model version processed the request? Which prompt was constructed from the user's input, and how? Which documents were retrieved as context, by which retrieval model, using which embedding model? What was the temperature setting? What was the output, and can you reconstruct the exact chain that produced it for any given decision?
The SOC 2 auditor asks whether the control was configured. The AI auditor, and increasingly the regulator, asks whether you can reconstruct the decision chain for any specific output after the fact. These are structurally different questions. A SOC 2 control narrative captures control state. An AI audit trail must capture decision provenance. Satisfying the second requirement means rethinking what an audit record contains, not adding more records to the first kind of report.
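The difference between control state and decision provenance can be made concrete by comparing the two record shapes. The sketch below is illustrative only: `ControlStateRecord`, `DecisionProvenanceRecord`, and `reconstruction_gaps` are hypothetical names, and the provenance fields simply mirror the questions raised above (model version, constructed prompt, retrieved documents, retrieval and embedding models, temperature).

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class ControlStateRecord:
    """What a SOC 2-style control narrative attests: was the control in place?"""
    control_id: str
    in_place: bool


@dataclass
class DecisionProvenanceRecord:
    """What reconstructing one AI output needs; None marks something never captured."""
    output: str
    model_version: Optional[str] = None
    constructed_prompt: Optional[str] = None
    retrieved_document_ids: Optional[List[str]] = None
    retrieval_model: Optional[str] = None
    embedding_model: Optional[str] = None
    temperature: Optional[float] = None


def reconstruction_gaps(record: DecisionProvenanceRecord) -> List[str]:
    """List the provenance fields that were not captured for this output."""
    return [name for name, value in asdict(record).items() if value is None]


# A control record answers a yes/no question.
control = ControlStateRecord(control_id="mfa-prod-access", in_place=True)

# A provenance record is only useful if every link in the chain was captured.
partial = DecisionProvenanceRecord(
    output="Claim denied",
    model_version="example-model-v1",  # placeholder version string
    temperature=0.0,
)
print(control.in_place)
print(reconstruction_gaps(partial))
```

The control record is complete with a single boolean. The provenance record is incomplete as long as any field is missing, because one missing link is enough to block reconstruction of that output.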
A CISO asking "show me evidence that the model behaved as configured" is asking a question that has no clean answer in a traditional audit framework, because "behaved as configured" means something different when the system's outputs are probabilistic. A compliance team trying to map AI system documentation to an existing SOC 2 Type II report will find that the control categories don't have slots for model version, prompt construction logic, or retrieval context — because those concepts didn't exist when the Trust Services Criteria were written.
What This Looks Like in Practice
In federal procurement, OMB M-24-10 (March 2024) directs agencies to implement AI governance practices including documentation of AI use cases and associated risks. Agencies are beginning to ask AI vendors for model cards as part of procurement qualification — not universally, and not yet with standardized requirements, but the pattern is emerging. A vendor who responds to that request with a product one-pager has misread what's being asked.
The more common scenario right now is a compliance team trying to map an existing vendor's AI documentation to an audit framework they already use. The question they're actually asking is: does this vendor's documentation give us enough information to answer a regulator's question about a specific AI-generated output? For most current AI vendor documentation, the answer is no. The documentation was designed to describe the system, not to support post-hoc decision reconstruction.
Enterprise-grade AI audit trail infrastructure captures model version at inference time, prompt content (with appropriate data handling), retrieved context and its sources, output, and timestamp — for every inference, queryable after the fact. Some organizations are building this; most are not. The logging infrastructure required is closer to distributed tracing than to traditional audit logging, and the data volumes are significant.
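A minimal sketch of such a per-inference record, assuming a single-process service and an in-memory SQLite store from the Python standard library, might look like the following. The class `InferenceAuditLog`, the table layout, and the sample values are all hypothetical. A production system would also need redaction or access controls on stored prompts, retention policies, and a storage backend sized for the volume, none of which is shown here.

```python
import hashlib
import json
import sqlite3
import uuid
from datetime import datetime, timezone
from typing import Dict, List, Optional


def sha256_hex(text: str) -> str:
    """Digest used to check later that a stored prompt has not been altered."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class InferenceAuditLog:
    """Hypothetical append-only log with one row per inference."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            """CREATE TABLE IF NOT EXISTS inference_audit (
                   inference_id TEXT PRIMARY KEY,
                   recorded_at TEXT NOT NULL,
                   model_version TEXT NOT NULL,
                   prompt TEXT NOT NULL,
                   prompt_sha256 TEXT NOT NULL,
                   retrieved_context TEXT NOT NULL,
                   output TEXT NOT NULL
               )"""
        )

    def record(self, model_version: str, prompt: str,
               retrieved_context: List[Dict[str, str]], output: str) -> str:
        """Store one inference and return the id used to look it up later."""
        inference_id = str(uuid.uuid4())
        self.conn.execute(
            "INSERT INTO inference_audit VALUES (?, ?, ?, ?, ?, ?, ?)",
            (inference_id, datetime.now(timezone.utc).isoformat(), model_version,
             prompt, sha256_hex(prompt), json.dumps(retrieved_context), output),
        )
        self.conn.commit()
        return inference_id

    def lookup(self, inference_id: str) -> Optional[Dict[str, object]]:
        """Return the full record for one inference, or None if it was never logged."""
        cur = self.conn.execute(
            "SELECT * FROM inference_audit WHERE inference_id = ?", (inference_id,)
        )
        row = cur.fetchone()
        if row is None:
            return None
        entry = dict(zip([col[0] for col in cur.description], row))
        entry["retrieved_context"] = json.loads(entry["retrieved_context"])
        return entry


log = InferenceAuditLog()
inference_id = log.record(
    model_version="example-model-v1",
    prompt="Summarize the attached leave policy.",
    retrieved_context=[{"source": "policies/leave.md", "text": "Leave accrues monthly."}],
    output="Employees accrue leave each month.",
)
entry = log.lookup(inference_id)
print(entry["model_version"], entry["retrieved_context"][0]["source"])
```

The point of the design is the lookup key: every output handed to a user carries an `inference_id`, so a later question about that specific output resolves to one row instead of a search through general application logs.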
Okta Concept Mapping
The SOC 2 audit log is the natural anchor, and it holds up to a point. SOC 2 audit logging captures who accessed what, when, and whether they were authorized. That maps cleanly to the existence requirement: did you log AI system activity? The analogy breaks at content. An access log entry answers a binary authorization question. An AI decision-chain record must answer a reconstruction question: given this output, can you reproduce the exact inputs, model state, and context that produced it? SOC 2 was never designed to answer that question, so the gap lies in what each record contains, not in how many records are kept.
EU AI Act implementation timelines are subject to change as member states publish national guidance. Verify current phase-in dates against the official EU AI Act text before citing them in compliance conversations. This piece reflects the regulatory landscape as of May 2026.

