Market Pulse

What Governance Governs

Companies with governance tooling deploy twelve times more AI projects to production. Only 4 of 13 frontier-autonomy agents disclose safety evaluations for agentic deployment. The first figure tells you whether a system has been authorized to run. The second tells you whether anyone has rigorously checked what it does once running. They share a vocabulary. They measure fundamentally different things. And the infrastructure accumulating around the first may be quietly relieving pressure to build the second.

Research Digest
The 2025 AI Agent Index: Documenting Technical and Safety Features of Deployed Agentic AI Systems
Cambridge, MIT, Harvard Law, Stanford, and five other institutions. Presented at ACM FAccT '26 in Montreal, June 25–28.
Enterprise agents designed for Level 1–2 autonomy are regularly deployed at Level 3–5. Prompt injection vulnerabilities appear in 2 of 5 browser agents studied.
Research Digest
TrajAD: Trajectory Anomaly Detection for Trustworthy LLM Agents
Agent safety moves from static input/output filtering to mid-execution trajectory auditing, catching failures as they unfold rather than after damage lands.
With 89% observability adoption but only 52% offline evaluation, confident-but-wrong agents slip through the gap that general monitoring leaves open.
Research Digest
JADE: Expert-Grounded Dynamic Evaluation for Open-Ended Professional Tasks
Evidence-dependency gating invalidates conclusions built on refuted claims, surfacing cascading errors that aggregate scores quietly absorb.
Legal analysis, medical reporting, complex underwriting. Domains where "correct" resists simple definition and evaluation itself requires domain expertise.
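One way to picture evidence-dependency gating is as invalidation propagating through a claim graph: a conclusion stands only if it is not refuted itself and everything it relies on also stands. The sketch below is an illustration under that assumption, not JADE's implementation; the function name, the `verdicts` and `depends_on` structures, and the claim ids are all hypothetical.

```python
def gate_conclusions(verdicts, depends_on):
    """Return the set of claim ids that survive gating.

    verdicts:   claim id -> True (supported) or False (refuted); unlisted claims
                are assumed supported unless a dependency fails (an assumption).
    depends_on: claim id -> list of claim ids the claim relies on.
    """
    cache = {}

    def valid(claim):
        if claim in cache:
            return cache[claim]
        # Pessimistic guard: a claim caught in a dependency cycle is treated as invalid.
        cache[claim] = False
        ok = verdicts.get(claim, True) and all(
            valid(dep) for dep in depends_on.get(claim, [])
        )
        cache[claim] = ok
        return ok

    claims = set(verdicts) | set(depends_on)
    for deps in depends_on.values():
        claims.update(deps)
    return {claim for claim in claims if valid(claim)}


# Hypothetical example: evidence e2 is refuted, so c2 falls, and c3 falls with it.
verdicts = {"e1": True, "e2": False}
depends_on = {"c1": ["e1"], "c2": ["e2"], "c3": ["c2", "e1"]}
surviving = gate_conclusions(verdicts, depends_on)
```

Here `c3` is never checked directly, yet it is invalidated because it rests on `c2`. A flat average over individually scored claims would not show that cascade.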
Research Digest
Stanford HAI 2026 AI Index Report: Foundation Model Transparency
Google, Anthropic, and OpenAI stopped disclosing dataset sizes and training duration. Eighty of 95 notable 2025 models shipped without training code.
The index penalizes closed-source models, and the most capable models are increasingly closed-source, which may inflate the apparent transparency retreat.
Pricing Signal

Anthropic announced a billing split in May separating agent workloads from subscription pools into metered credits. The change was paused before its June 15 go-live, but the logic is visible elsewhere: GitHub Copilot shifted to usage-based AI Credits on June 1, and one agentic coding session now costs $30–40.
Anthropic's proposed credits were per-user, non-pooled, stop-on-empty. Teams would need to classify workloads, attribute costs to specific accounts, and pre-authorize agent consumption rather than discovering it post-hoc. Flat-rate subscriptions let organizations defer all of those questions indefinitely. A meter makes deferral uncomfortable.
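The three properties (per-user, non-pooled, stop-on-empty) can be made concrete with a small ledger. This is a minimal sketch of the accounting logic only, not Anthropic's billing system; `CreditLedger`, `CreditsExhausted`, the workload labels, and the credit amounts are all hypothetical.

```python
from dataclasses import dataclass, field


class CreditsExhausted(Exception):
    """Raised when a user's own balance cannot cover a charge."""


@dataclass
class CreditLedger:
    # Balances are keyed by user; there is deliberately no shared pool.
    balances: dict = field(default_factory=dict)
    # (user, workload, credits) records kept for cost attribution.
    usage: list = field(default_factory=list)

    def grant(self, user, credits):
        self.balances[user] = self.balances.get(user, 0) + credits

    def charge(self, user, workload, credits):
        balance = self.balances.get(user, 0)
        # Stop-on-empty: refuse the work rather than borrow from a teammate.
        if balance < credits:
            raise CreditsExhausted(f"{user} cannot cover {credits} credits for {workload}")
        self.balances[user] = balance - credits
        self.usage.append((user, workload, credits))


ledger = CreditLedger()
ledger.grant("alice", 100)
ledger.grant("bob", 100)
ledger.charge("alice", "agentic-coding", 80)

try:
    # Alice has 20 credits left; Bob's untouched 100 do not help her.
    ledger.charge("alice", "agentic-coding", 40)
    stopped = False
except CreditsExhausted:
    stopped = True
```

Every call to `charge` needs a user and a workload label, which is the classification and attribution work a flat-rate subscription never asks for.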
Governance infrastructure asks whether an agent acted within scope. Billing forces something more granular: what counts as a unit of work, who pays for it, what happens when the budget runs out. That's specification work, and pricing imposes it whether organizations are ready or not. The competitive scramble that followed suggests nobody wants to force that clarity on customers yet. But an estimated 15–30x subsidy gap between subscription price and API-equivalent compute means the deferral has an expiration date.