A consumer tells their AI agent to buy running shoes under $150. The agent finds a pair, checks the spending limit, confirms the merchant category, and completes the purchase through a tokenized payment flow. The shoes arrive. They're the wrong shade of blue, or the right shade but the consumer swears they said trail runners.
A chargeback gets filed.
Today's dispute process already involves multiple parties reconstructing the same event from different angles. The consumer knows what they intended. The merchant knows what they shipped. The issuing bank knows what it approved. The system works because these fragments are complementary. Enough overlap exists to adjudicate.
Agent-initiated transactions break that complementarity. When Visa announced its partnership with OpenAI to bring payments into ChatGPT, the infrastructure they described was real: tokenized credentials, spending controls, merchant restrictions, pre-transaction consent. But Visa's public rulebook also specifies that the Agentic Payment Provider supplies the token and instruction context while playing no role in authorization, clearing, or settlement. The consumer bears responsibility for the agent's actions "as if" the consumer initiated the transaction directly. So the entity best positioned to explain what actually happened occupies no formal seat in the resolution process.
The legibility problem lives right there. You can log every step the agent took: the instruction, the search queries, the product comparison, the selection rationale. Visa's token enhancements already add data about transaction type and assurance levels based on provisioning history. Recording is arriving. But a log is a single narrative, and a dispute requires four.
Each party needs something different from the same purchase event. The consumer needs a natural-language account that maps back to their original instruction, something that lets them say "yes, that's what I meant" or "no, I said trail runners." The merchant needs machine-readable authorization evidence proving a legitimate buyer stood behind the transaction. Visa needs standardized risk signals for scoring, routing, and network rule enforcement. The agent platform needs auditable proof that it operated within the consumer's mandate.
These are four incompatible framings of the same moment. The consumer's version centers on intent, the merchant's on authorization. The network cares about classification. The platform cares about compliance with the consumer's mandate. Every record, however comprehensive, still has to be translated into each party's accountability language before it's useful.
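To make the translation step concrete, here is a minimal Python sketch of one purchase record projected into four party-specific views. Everything in it is an assumption made for illustration: the `PurchaseEvent` fields, the four view functions, and the `"agent_initiated"` label are hypothetical and do not reflect any schema published by Visa, OpenAI, or another party.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PurchaseEvent:
    """One agent purchase. All field names are hypothetical."""
    instruction: str                     # consumer's original request, verbatim
    selected_item: str                   # what the agent actually bought
    rationale: str                       # agent's stated reason for the choice
    price_cents: int
    spending_limit_cents: int
    merchant_category: str
    allowed_categories: Tuple[str, ...]
    token_id: str                        # identifier of the tokenized credential
    consent_recorded: bool               # pre-transaction consent was captured


def consumer_view(e: PurchaseEvent) -> str:
    # Intent framing: a plain-language account tied to the original instruction.
    return (f'You asked: "{e.instruction}". The agent bought {e.selected_item} '
            f'for ${e.price_cents / 100:.2f} because {e.rationale}.')


def merchant_view(e: PurchaseEvent) -> dict:
    # Authorization framing: machine-readable evidence a buyer stood behind it.
    return {"token_id": e.token_id,
            "consent_recorded": e.consent_recorded,
            "amount_cents": e.price_cents}


def network_view(e: PurchaseEvent) -> dict:
    # Classification framing: signals for scoring, routing, rule enforcement.
    return {"transaction_type": "agent_initiated",  # hypothetical label
            "merchant_category": e.merchant_category,
            "amount_cents": e.price_cents}


def platform_view(e: PurchaseEvent) -> dict:
    # Mandate framing: did the agent stay inside the consumer's stated limits?
    return {"within_limit": e.price_cents <= e.spending_limit_cents,
            "category_allowed": e.merchant_category in e.allowed_categories,
            "consent_recorded": e.consent_recorded}


event = PurchaseEvent(
    instruction="running shoes under $150",
    selected_item="blue road running shoes",
    rationale="they were the best-reviewed pair under the limit",
    price_cents=12999,
    spending_limit_cents=15000,
    merchant_category="footwear",
    allowed_categories=("footwear", "apparel"),
    token_id="tok_example",
    consent_recorded=True,
)

for view in (consumer_view, merchant_view, network_view, platform_view):
    print(view.__name__, "->", view(event))
```

In this sketch each view discards most of the record. The consumer's version carries no token data and the merchant's carries no rationale, which is why a single comprehensive log does not answer any one party's question without translation.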
Visa's Jack Forestell told the AP that the central dispute question remains whether the consumer intended the purchase and whether the merchant processed it correctly, with new ambiguity arising in the space between those two ends. That space is where the agent lives. What counts as evidence of intent when the person who held the intent delegated execution to software? Answering that takes design decisions about what gets captured, how it's structured, who can access which version, and what weight an issuer gives agent-platform evidence against a consumer's claim. Public materials don't yet specify what agent-platform evidence enters a chargeback file or how issuers weigh it against cardholder intent claims. Payment disputes will force the question, because they're the one moment in commerce where every party must independently produce a coherent account of the same event. "The agent did it" is a sentence that hides four different questions, and each one belongs to a different party.
- Visa's Trusted Agent Protocol: Axios reported that Visa launched a protocol with Cloudflare to help merchants distinguish legitimate AI shopping agents from malicious bots, with Forestell saying the protocol was intended to be open rather than Visa-specific.
- Agent identity as design gap: OWASP's Non-Human Identities Top 10 catalogs risks around application identities that use secrets as credentials, including overprivileged access, long-lived secrets, and improper offboarding, all of which sharpen when agents act on behalf of consumers.
- Consumer protection's open edge: Existing CFPB frameworks under Regulation Z define unauthorized use, billing error investigation, and liability limits for credit transactions, but no current public guidance specifically addresses purchases initiated by AI agents acting under delegated consumer authority.
- Action binding beyond payments: The emerging OpenTelemetry GenAI semantic conventions now define attributes for agent identity, tool calls, and input messages, which suggests the multi-audience evidence problem will extend well beyond commerce into any domain where agent actions need post-hoc reconstruction.

