The Gateway Layer: Portkey, LiteLLM, Kong, and Cloudflare Compared

By Carey Whitten— May 5, 2026

Platform engineering leads are naming Portkey, LiteLLM, Kong AI Gateway, and Cloudflare AI Gateway in the same breath as their AI infrastructure decisions — sometimes in the same sentence as their Okta deployment. You'll encounter these in RFIs, in architecture reviews, and in the occasional discovery call where a CISO wants to know how the organization is governing AI API access before they'll approve a rollout. The category name that travels well in those conversations is AI gateway. Gateway maps to infrastructure-class thinking, which is where these conversations need to live. A working comparison of the four products you're most likely to hear named follows, organized around the three dimensions that actually drive platform decisions: deployment model, enterprise readiness, and integration surface.

Portkey

What it is: A commercial AI gateway platform built for production deployments, offered primarily as SaaS with a self-hosted option.

What it does: Portkey sits between an application and its LLM providers, presenting a unified API endpoint that handles routing, fallback logic, semantic caching, and request-level observability. When a provider goes down or returns errors above a configured threshold, Portkey routes to a fallback without requiring application-side changes. Its guardrails layer lets platform teams define what content can enter and exit the gateway — input filtering, output validation, PII detection — as configurable policy rather than application code. Usage logs are structured and exportable, which is what makes cost attribution downstream possible.

Who's behind it: Portkey.ai, a venture-backed company founded in 2023. The product has moved quickly from developer tooling toward enterprise positioning, with SOC 2 Type II certification and a dedicated enterprise tier that includes dedicated infrastructure and SLA commitments.

What makes it distinct: Portkey is the most opinionated of the four about what happens to the request after it's intercepted. The guardrails and observability layer is baked into the product's core rather than bolted on as a plugin. For organizations that need to enforce content policy at the infrastructure layer rather than the application layer, that's a meaningful architectural difference.

LiteLLM

What it is: An open-source Python proxy that presents a unified, OpenAI-compatible API across more than 100 LLM providers, with a commercial managed offering available.

What it does: LiteLLM's core function is provider abstraction. Any application that can call the OpenAI API can call LiteLLM without modification; LiteLLM handles the translation to whatever provider is actually serving the request. The proxy server mode adds key management, per-team spend tracking, load balancing across model deployments, and configurable routing rules. Teams can set budget limits per virtual key, which is how LiteLLM addresses the cost visibility problem without requiring a separate FinOps tool for basic guardrails.

Who's behind it: BerriAI, the company that maintains the open-source project. The open-source core is MIT-licensed; the enterprise offering adds a management UI, SSO support, and commercial support contracts. The project has substantial community adoption. It consistently surfaces in practitioner discussions on Hacker News and r/LocalLLaMA as the default starting point for teams building multi-provider AI infrastructure.

What makes it distinct: Provider breadth and the open-source model. 100+ providers is not a marketing number. It reflects genuine community contribution to the translation layer. For organizations that need to hedge across providers, or that are running models on-premises alongside cloud providers, LiteLLM's coverage is difficult to match. The open-source core also means teams can inspect and modify the proxy behavior in ways that SaaS products don't permit.

Kong AI Gateway

What it is: Kong's AI-specific extension of their enterprise API gateway platform, delivered as a plugin layer on top of Kong Gateway.

What it does: Kong AI Gateway adds LLM-specific capabilities — provider routing, semantic caching, prompt templating, AI-specific rate limiting — to Kong's existing API management infrastructure. Organizations already running Kong Gateway get AI capabilities through plugin configuration rather than a separate product deployment. The AI plugins handle request transformation (injecting system prompts, sanitizing inputs), provider failover, and response caching based on semantic similarity rather than exact-match. Kong's existing traffic management, authentication plugins, and observability integrations remain available and apply to AI traffic the same way they apply to any other API traffic.

Who's behind it: Kong Inc., the API management company that has been in enterprise infrastructure since 2015. Kong Gateway is deployed in a substantial portion of large enterprise environments; the company's own figures cite tens of thousands of organizations. AI Gateway is an extension of that installed base, not a greenfield product.

What makes it distinct: The enterprise pedigree and the existing deployment footprint. Kong AI Gateway doesn't ask an enterprise to adopt new infrastructure. It asks them to extend infrastructure they already trust, already have support contracts for, and already have in their network architecture. For organizations where Kong is already the API management standard, AI Gateway is the path of least resistance to governing AI traffic.

Cloudflare AI Gateway

What it is: Cloudflare's edge-native AI traffic management layer, running on Cloudflare's global network rather than in customer infrastructure.

What it does: Cloudflare AI Gateway provides logging, caching, rate limiting, and analytics for AI API calls, with the gateway sitting at Cloudflare's network edge rather than inside the customer's environment. Requests to LLM providers route through Cloudflare's network, where the gateway applies policy and captures telemetry before the request exits to the provider. Semantic caching at the edge means repeated or similar queries can be served from cache without reaching the provider at all, a meaningful cost reduction for high-volume deployments. The product integrates natively with Cloudflare Workers, allowing teams to add custom logic in the request path using Cloudflare's serverless runtime.

Who's behind it: Cloudflare. The AI Gateway product is part of Cloudflare's developer platform, positioned alongside Workers AI and Vectorize as infrastructure for AI-native applications. Cloudflare's enterprise posture — SOC 2, ISO 27001, FedRAMP authorization for relevant products — applies to the platform broadly.

What makes it distinct: The deployment model is the differentiator. Cloudflare AI Gateway is the only product in this group that runs at the network layer rather than in application infrastructure. That means no deployment, no maintenance, and network-level visibility that self-hosted options can't replicate. The constraint is the inverse: organizations that cannot route AI traffic through a third-party network, due to data residency requirements, classification concerns, or existing network architecture, cannot use Cloudflare AI Gateway regardless of its other merits.

Comparison: How the Four Stack Up

Trait-led analysis across three dimensions. Each dimension covers all four products. Conclusions are circumstance-specific.

Deployment Model

This dimension produces the clearest separation among the four products.

Cloudflare AI Gateway is edge-hosted. It runs on Cloudflare's network, period. There is no self-hosted option. That's either a feature or a disqualifier depending on the organization's network posture and data handling requirements.

LiteLLM is self-hosted first. The open-source proxy runs in the customer's environment, which means the customer controls the deployment, the data path, and the configuration. The managed cloud offering exists, but the product's architecture and community assume self-hosted as the default.

Portkey is SaaS-first with a self-hosted option. The SaaS path is faster to deploy; the self-hosted path exists for organizations with data residency requirements or a preference for infrastructure they operate directly.

Kong AI Gateway deploys wherever Kong Gateway deploys — on-premises, in a customer-managed cloud environment, or through Kong's Konnect managed offering. For organizations already running Kong, the deployment model is already decided.

For public sector environments where data handling requirements constrain the network path, the practical field is Portkey (self-hosted), LiteLLM (self-hosted), and Kong (self-hosted or Konnect). Cloudflare AI Gateway requires a separate conversation about whether the organization's AI traffic can route through Cloudflare's network.

Enterprise Readiness

Enterprise readiness is a composite of compliance posture, support model, and organizational longevity — the factors a procurement office actually evaluates.

Kong carries the highest baseline. It has enterprise contracts, established support tiers, and a decade of deployment in large organizations. Procurement teams have processed Kong before; that familiarity has real value in procurement timelines.

Cloudflare's enterprise posture is strong, inherited from its broader platform. Organizations already using Cloudflare for CDN or security have an existing vendor relationship to extend.

Portkey has moved quickly toward enterprise readiness — SOC 2 Type II, dedicated infrastructure options, enterprise SLAs — but it's a younger company. Procurement teams doing vendor risk assessments will ask questions that Kong and Cloudflare answer with longer track records.

LiteLLM's enterprise readiness is the most variable. The open-source core is not a liability in itself, but enterprise readiness for an open-source product depends on how it's deployed and whether the organization has a support contract. LiteLLM's commercial offering addresses this; the open-source-only deployment does not.

Integration Surface

Integration surface determines how much friction the gateway adds to the existing environment and how much leverage it provides.

Kong's integration surface is the broadest for organizations already in the Kong ecosystem. Existing Kong plugins for authentication, rate limiting, logging, and service mesh integration apply to AI traffic. Kubernetes-native deployment through Kong's Ingress Controller means AI gateway policy can be expressed in the same configuration language as the rest of the organization's API management.

LiteLLM's integration surface is defined by its OpenAI-compatible endpoint. Any tool, framework, or application that speaks the OpenAI API speaks LiteLLM — essentially the entire LLM application ecosystem — without requiring any integration work.

Portkey offers SDK integrations for major languages, an OpenAI-compatible endpoint, and webhook support for routing events to external systems. The integration surface is narrower than Kong's but broader than Cloudflare's outside the Cloudflare ecosystem.

Cloudflare AI Gateway integrates tightly with Cloudflare Workers, R2, and D1. Outside that ecosystem, the integration surface is lighter. Organizations building on Cloudflare Workers get deep integration; organizations running workloads elsewhere get logging and caching without the same depth of programmability.

Field Language Guide

Don't say	Do say	Why it matters
"LLM proxy"	"AI gateway"	Proxy signals developer tooling; gateway signals infrastructure-class decision, which is the conversation you want to be in
"It handles identity"	"It centralizes API key management — identity integration is a separate layer"	The gateway governs API keys and traffic policy; who the end user is requires an identity layer above the gateway
"It tracks AI spend"	"It captures usage data that feeds cost attribution — chargeback is a downstream decision"	The gateway logs; what the organization does with those logs is a separate FinOps conversation
"Which one is most secure?"	"What's your deployment model — cloud-hosted, self-hosted, or edge?"	Security posture follows deployment model; the reframe opens discovery instead of closing it
"It replaces your API management"	"It extends your API management to cover AI traffic"	Kong customers especially will hear replaces as a threat to an existing investment
"It's AI security tooling"	"It's AI infrastructure — security is one of several functions it serves alongside routing, observability, and cost visibility"	Security-only framing undersells the product category and misaligns with how platform engineering leads think about it
"LiteLLM is free"	"LiteLLM has an open-source core; enterprise support and the managed offering are commercial"	Open source doesn't mean no cost in enterprise deployments, and that assumption will surface in procurement
"Cloudflare AI Gateway is just caching"	"Cloudflare's edge model adds network-layer visibility that self-hosted options can't replicate"	Underselling the architectural difference undersells the product and misrepresents the tradeoff
"We need to pick one"	"Some organizations run more than one gateway for different workloads — what's the primary deployment context?"	Multi-gateway environments are common; the question opens discovery rather than forcing a premature decision
"It's like a firewall for AI"	"It's a centralized control plane for AI API traffic — routing, policy, and observability in one layer"	The firewall analogy implies block/allow logic; the gateway does considerably more than that

“

Okta Concept Mapping

The AI gateway pattern rhymes with the Policy Enforcement Point in a zero trust architecture: traffic passes through a centralized layer that applies policy before the request reaches its destination, and no traffic bypasses that layer. The analogy holds for the interception model and the centralized policy application. Where it breaks is on identity. A PEP makes authorization decisions based on who is making the request, because it knows the authenticated principal. An AI gateway, in its base configuration, knows which API key or service account is calling, but it does not know the end user behind that request. The gateway enforces team-level or application-level policy; user-level attribution requires an identity layer that sits above the gateway. That distinction matters in buyer conversations because a CISO asking "how do we know who's using AI" is asking an identity question, and the gateway layer alone doesn't answer it.