Practitioner's Corner

Practitioner's Corner

The Work the Spreadsheet Can't See

A single agent step running at 95% reliability sounds fine. Chain twenty steps and you're below 36%. That gap has to be managed by someone: prompt maintenance, drift detection, failure triage across layers that didn't exist before deployment. None of it appears in the business case that funded the project. The accounting framework used to justify automation has no line item for work the automation itself generates. The costs are real, and they accumulate where no instrument exists to catch them.

The Work the Spreadsheet Can't See
Asingle agent step running at 95% reliability sounds fine. Chain twenty steps and you're below 36%. That gap has to be managed by someone: prompt maintenance, drift detection, failure triage across layers that didn't exist before deployment. None of it appears in the business case that funded the project. The accounting framework used to justify automation has no line item for work the automation itself generates. The costs are real, and they accumulate where no instrument exists to catch them.
Sumeet Vaidya and the Distance Between Writing Code and Shipping It

An AI agent writes a code change in seconds. It compiles. It passes the sandbox. It touches a database schema, a caching layer, an auth service, and nobody finds out whether it actually works until the cost of finding out has already multiplied. Sumeet Vaidya spent a decade at Facebook, Uber, and Discord watching that distance between "looks right" and "works in production" grow wider with every new service dependency. With Crafting, he's placed a very specific bet on where the wall is, and it lives in the space between generated code and the production environment that has to accept it.
Sumeet Vaidya and the Distance Between Writing Code and Shipping It
An AI agent writes a code change in seconds. It compiles. It passes the sandbox. It touches a database schema, a caching layer, an auth service, and nobody finds out whether it actually works until the cost of finding out has already multiplied. Sumeet Vaidya spent a decade at Facebook, Uber, and Discord watching that distance between "looks right" and "works in production" grow wider with every new service dependency. With Crafting, he's placed a very specific bet on where the wall is, and it lives in the space between generated code and the production environment that has to accept it.


The Professional Noticer Keeping AI Agents From Quietly Losing Their Minds
CONTINUE READINGThe Maintenance Curve

Gartner predicts over 40% of agentic AI projects face cancellation by end of 2027. Most will be narrated as technology failures. Look closer and the pattern is financial: teams that built fast discover they've inherited platform-scale obligations on a prototype-scale budget.
The trajectory is remarkably consistent. Ship an agent, wire up basic logging, call it supervised. Within months, evaluation suites, audit infrastructure, model migration cycles, and governance layers arrive uninvited. Engineering maintenance alone runs $3,000 to $6,000 monthly per mid-complexity agent. Development environments, with their clean data and cooperative inputs, never hinted at any of this.
By the time the true operating cost surfaces, the project is already under executive scrutiny with no clean exit.
Further Reading





