TinyFish | Practitioner's Corner

The Invisible Overhead

The Work the Spreadsheet Can't See

By Nora Kaplan, March 25, 2026

A single agent step running at 95% reliability sounds fine. Chain twenty steps and you're below 36%. That gap has to be managed by someone: prompt maintenance, drift detection, failure triage across layers that didn't exist before deployment. None of it appears in the business case that funded the project. The accounting framework used to justify automation has no line item for work the automation itself generates. The costs are real, and they accumulate where no instrument exists to catch them.
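
The compounding arithmetic behind that figure can be sketched in a few lines. This is a minimal illustration, not a model of any particular agent stack: it assumes every step has the same reliability and that steps fail independently of one another, and the function name `chain_reliability` is hypothetical.

```python
def chain_reliability(step_reliability: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumption (not from the article): steps fail independently and
    all share the same per-step reliability.
    """
    if not 0.0 <= step_reliability <= 1.0:
        raise ValueError("step_reliability must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return step_reliability ** steps


if __name__ == "__main__":
    # Show how quickly end-to-end reliability decays as the chain grows.
    for n in (1, 5, 10, 20):
        print(f"{n:>2} steps: {chain_reliability(0.95, n):.1%}")
    # The last line printed is "20 steps: 35.8%".
```

Real chains rarely satisfy the independence assumption: retries raise the effective figure, while correlated failures can lower it. The sketch only shows why a per-step number that sounds safe does not stay safe end to end.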

The Builder Profile

Sumeet Vaidya and the Distance Between Writing Code and Shipping It

By Rina Takahashi, March 25, 2026

An AI agent writes a code change in seconds. It compiles. It passes the sandbox. It touches a database schema, a caching layer, an auth service, and nobody finds out whether it actually works until the cost of finding out has already multiplied. Sumeet Vaidya spent a decade at Facebook, Uber, and Discord watching that distance between "looks right" and "works in production" grow wider with every new service dependency. With Crafting, he's placed a very specific bet on where the wall is, and it lives in the space between generated code and the production environment that has to accept it.

The Practitioner's Day

The Professional Noticer Keeping AI Agents From Quietly Losing Their Minds

The Maintenance Curve

The Agentic AI Cost Curve: Fast Builds, Slow Drowns

Gartner predicts that over 40% of agentic AI projects will be canceled by the end of 2027. Most will be narrated as technology failures. Look closer and the pattern is financial: teams that built fast discover they've inherited platform-scale obligations on a prototype-scale budget.

The trajectory is remarkably consistent. Ship an agent, wire up basic logging, call it supervised. Within months, evaluation suites, audit infrastructure, model migration cycles, and governance layers arrive uninvited. Engineering maintenance alone runs $3,000 to $6,000 monthly per mid-complexity agent. Development environments, with their clean data and cooperative inputs, never hinted at any of this.

By the time the true operating cost surfaces, the project is already under executive scrutiny with no clean exit.

TAKE NOTE

Eval gap: 89% of teams have agent observability running, but only 52% do systematic evaluations, so many can watch agents act without ever validating correctness

Testing costs: Non-deterministic behavior means every prompt change triggers thousands of simulation reruns, pushing per-agent evaluation costs into the tens of thousands of dollars

Model churn: Budget for one to two model migrations yearly, each consuming up to two weeks of engineering time and restarting the full evaluation cycle

Cognitive debt: Researchers now separate debt in code from debt in developers' minds, where fast-built agent systems quietly erode shared understanding of how anything works

Remediation market: Gartner anticipates specialized tools and consulting services emerging specifically to audit and refactor AI-generated technical debt at enterprise scale