Foundations

The Non-Determinism Budget

A system with 95% reliability at each model-directed step delivers about 36% reliability over twenty steps (0.95^20 ≈ 0.36). The math is just multiplication. But the agent ecosystem has no standard way to measure this decay, and new research shows why it's so insidious: failed runs are statistically indistinguishable from successful ones through most of their execution. The budget is spent long before anyone can tell. Experienced teams navigate this constraint well, but they do it by instinct, without a number, making consequential architecture decisions about a resource they've never quantified.
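The multiplication above can be written out as a minimal sketch. It assumes every step succeeds independently with the same probability, which real runs may not satisfy; the function names are illustrative and not taken from any library.

```python
def end_to_end_reliability(per_step: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    if not 0.0 <= per_step <= 1.0:
        raise ValueError("per_step must be a probability in [0, 1]")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step ** steps


def max_steps_within_budget(per_step: float, target: float) -> int:
    """Largest number of steps whose combined reliability stays >= target."""
    if not 0.0 < per_step < 1.0:
        raise ValueError("per_step must be strictly between 0 and 1")
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    steps = 0
    reliability = 1.0
    # Keep adding steps while one more step would still meet the target.
    while reliability * per_step >= target:
        reliability *= per_step
        steps += 1
    return steps


if __name__ == "__main__":
    print(f"{end_to_end_reliability(0.95, 20):.2%}")
    print(max_steps_within_budget(0.95, 0.50))
```

Read the second function as the budget itself: given a per-step reliability and an end-to-end target, it returns how many model-directed steps a workflow can afford under the independence assumption.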

Where Intelligence Lives in Each Step

A login form needs zero intelligence. A Playwright selector finds the username field, fills it, and clicks "Sign In" in milliseconds. Two steps later, the same workflow hits a CAPTCHA rendered as a bitmap inside a canvas element. Now you need a vision model looking at a screenshot. Same workflow, wildly different requirements per step. Browser automation has three layers of intelligence available at every step, from free and instant to expensive and flexible. How a system moves between them mid-run is the design choice that decides whether it holds up.
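One way to picture moving between layers mid-run is an escalation loop that tries the cheapest layer first and climbs only when a layer cannot handle the step. The sketch below is hypothetical throughout: the teaser names a selector and a vision model but not the middle layer, so the "dom_model" tier, the relative costs, and the stand-in handlers are all assumptions, and no real browser or model is called.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Tier:
    name: str
    cost: float  # hypothetical relative cost per attempt
    attempt: Callable[[str], Optional[str]]  # returns None when it cannot handle the step


def run_step(step: str, tiers: Sequence[Tier]) -> Tuple[str, str, float]:
    """Try tiers from cheapest to most expensive; return (tier name, result, cost spent)."""
    spent = 0.0
    for tier in tiers:
        spent += tier.cost
        result = tier.attempt(step)
        if result is not None:
            return tier.name, result, spent
    raise RuntimeError(f"no tier could handle step {step!r}")


# Stand-in handlers (assumptions): a real system would call a browser driver or a model here.
def selector(step: str) -> Optional[str]:
    return "filled and clicked" if step == "login_form" else None


def dom_model(step: str) -> Optional[str]:
    return "picked element from DOM text" if step == "renamed_button" else None


def vision_model(step: str) -> Optional[str]:
    return "read screenshot"


TIERS = [
    Tier("selector", 0.0, selector),
    Tier("dom_model", 1.0, dom_model),
    Tier("vision_model", 10.0, vision_model),
]

if __name__ == "__main__":
    for step in ("login_form", "renamed_button", "canvas_captcha"):
        print(step, run_step(step, TIERS))
```

Each fallback adds the cost of the failed attempts before it, so the order of the tiers and the rule for giving up on a tier are where the design trade-off shows up.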
