OpenAI launched AgentKit as "a complete set of tools for developers and enterprises to build, deploy, and optimize agents." The pitch: solve fragmented tooling, eliminate weeks of frontend work, cut development time by half.
You get visual workflow builders, connector registries, and evaluation hooks. These package existing capabilities into more accessible forms. Useful? Yes. Complete? Only if you're building demos.
What's missing from the announcement: pricing models for inference at scale, security frameworks for prompt injection vulnerabilities, governance structures for multi-agent systems running continuously. The cited success stories—Klarna handling two-thirds of support tickets, Clay's 10x growth—preceded AgentKit's launch. These companies succeeded despite tooling fragmentation, not because someone solved it.
Running agents at scale still requires technical teams to develop and tune systems, hybrid approaches mixing quick wins with long-term architecture, and governance frameworks that treat agents as production infrastructure.
AgentKit lowers the barrier to experimentation. But it shifts complexity from orchestration to operations. The hard parts remain hard, just differently packaged.
