Foundations

Foundations

Where Agent Behavior Actually Gets Decided

A developer's coding agent worked perfectly for thirty minutes, then started contradicting its own decisions. The prompt was the same. The model was the same. Everything else in the context window had been accumulating. His fix was radical: stop carrying context forward entirely. It worked. Across production agent systems, the same asymmetry kept surfacing — a carefully engineered prompt drowning in accumulated tool outputs performed worse than a simple prompt in a clean, focused context. The field needed a word for the thing practitioners were actually optimizing, because "prompt" didn't describe it.
Where Agent Behavior Actually Gets Decided
Adeveloper's coding agent worked perfectly for thirty minutes, then started contradicting its own decisions. The prompt was the same. The model was the same. Everything else in the context window had been accumulating. His fix was radical: stop carrying context forward entirely. It worked. Across production agent systems, the same asymmetry kept surfacing — a carefully engineered prompt drowning in accumulated tool outputs performed worse than a simple prompt in a clean, focused context. The field needed a word for the thing practitioners were actually optimizing, because "prompt" didn't describe it.

What Browser Agents See Before They Think

Before a browser agent reasons about a task, it needs a representation of the page. Raw HTML, a screenshot, or the accessibility tree. That choice, made once and early, sets the token budget, failure modes, and reliability ceiling for everything downstream. OpenAI, Microsoft, and Perplexity built their agent systems independently and all three landed on the same primary perception layer: infrastructure originally designed for people who can't see screens. The convergence is worth understanding.

What Browser Agents See Before They Think
Before a browser agent reasons about a task, it needs a representation of the page. Raw HTML, a screenshot, or the accessibility tree. That choice, made once and early, sets the token budget, failure modes, and reliability ceiling for everything downstream. OpenAI, Microsoft, and Perplexity built their agent systems independently and all three landed on the same primary perception layer: infrastructure originally designed for people who can't see screens. The convergence is worth understanding.
Further Reading




Past Articles

A login form needs zero intelligence. A Playwright selector finds the username field, fills it, clicks "Sign In" in mill...

A system with 95% reliability at each model-directed step delivers 36% reliability over twenty steps. The math is just m...

Framework comparison tables show you what's available. The interesting part starts when a container restarts mid-workflo...

Claude Opus 4 scores 64.9% on GAIA in one scaffold and 57.6% in another. Same model, same benchmark, same questions. The...

