What Is an Agent Gateway? The Data Plane for Agentic AI

Abubakar Siddiq Ango Senior Developer Advocate

Jun 5, 2026 4 min read Beginner

Prerequisites

Basic understanding of Kubernetes concepts (Services, Ingress/Gateway API)
Familiarity with the idea of an API gateway or reverse proxy
A high-level sense of what AI agents and the Model Context Protocol (MCP) are

Introduction

Building with AI agents looks deceptively simple at first. You give an agent a tool, it calls the tool, it returns an answer. But the moment you have more than one agent, more than one model provider, or more than a couple of tools, you are running a small distributed system — and it generates network traffic that nobody is governing.

This is the problem an agent gateway solves. If you have used an API gateway before, the idea will feel familiar: a single proxy that sits in front of your services and applies routing, authentication, and observability in one place. An agent gateway does the same job, but for the traffic patterns that agentic systems create.

What is an agent gateway?

An agent gateway is a unified data plane for agent traffic. It is a proxy that sits between your agents and everything they communicate with, and applies consistent routing, security, and observability across several kinds of traffic at once:

MCP: Model Context Protocol calls that an agent uses to discover and invoke tools.
Agent-to-Agent (A2A): messages between agents, often across different frameworks.
LLM inference: calls to hosted model providers or to your own self-hosted models.
Service traffic: the ordinary HTTP, gRPC, and REST APIs agents still depend on.

The gateway centralizes those concerns, so no agent re-implements authentication, retries, logging, and policy for each protocol. agentgateway, an open-source project (Apache-2.0) now hosted by the Linux Foundation’s Agentic AI Foundation, describes this as “one high-performance gateway for service, LLM, and MCP traffic.”

How is an agent gateway different from an API gateway?

Aspect	API gateway	Agent gateway
Primary traffic	North-south HTTP/REST	MCP, A2A, LLM inference, and HTTP/gRPC
Unit of work	A request	An agent action (tool call, inference, hand-off)
Routing	Path/host → backend	Tool discovery, model/provider routing, inference routing
Policy concern	Who may call this endpoint	Which agent may use which tool, with which arguments
Observability	Request logs, latency	Tool-call audit, token usage, cross-agent traces

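The old concerns remain. What changes is the unit of work: instead of a single request, it is an agent taking an action (a tool call, an inference, a hand-off). That raises the stakes, because routing and policy now have to account for which agent is acting and what it is trying to do.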
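The policy row of the comparison ("which agent may use which tool, with which arguments") can be made concrete with a small sketch. This is an illustration only: ToolRule, is_allowed, and the agent and tool names are hypothetical and do not reflect the policy schema of agentgateway or any other gateway.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


# Hypothetical policy model for illustration; not the schema of any real gateway.
@dataclass(frozen=True)
class ToolRule:
    agent: str
    tool: str
    # Optional predicate over the call's arguments; None means any arguments pass.
    argument_check: Optional[Callable[[dict[str, Any]], bool]] = None


def is_allowed(rules: list[ToolRule], agent: str, tool: str,
               arguments: dict[str, Any]) -> bool:
    """Return True only if some rule matches the agent, the tool, and the arguments."""
    for rule in rules:
        if rule.agent != agent or rule.tool != tool:
            continue
        if rule.argument_check is None or rule.argument_check(arguments):
            return True
    return False  # default deny: anything not explicitly granted is refused


RULES = [
    # support-bot may search docs with any arguments.
    ToolRule("support-bot", "search_docs"),
    # billing-bot may issue refunds, but only for amounts in (0, 100].
    ToolRule("billing-bot", "refund", lambda args: 0 < args.get("amount", 0) <= 100),
]
```

An API gateway rule would usually stop at "may billing-bot reach the refund endpoint". Here the decision also depends on the arguments of the call, so a refund of 50 passes and a refund of 500 is refused.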

Key concepts

The unified data plane

One proxy handles multiple protocols (MCP, A2A, LLM, HTTP/gRPC) so policy and observability stay consistent across them, and the “AI parts” run through the same stack as everything else.

MCP gateway

A governed front door to your tools: discovery, RBAC, and audit logging for every Model Context Protocol call.
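As a rough sketch of what "discovery, RBAC, and audit logging" could mean in practice, the example below filters tool discovery by role and records every call. The method names tools/list and tools/call come from MCP. Everything else (GRANTS, handle_mcp, the role names, and the simplified response shapes, which omit the JSON-RPC envelope) is assumed for illustration.

```python
import json
import time
from typing import Any, Optional

# Hypothetical role-to-tool grants; a real gateway would load these from policy config.
GRANTS: dict[str, set[str]] = {
    "reader": {"search_docs"},
    "operator": {"search_docs", "restart_service"},
}

AUDIT_LOG: list[str] = []  # stand-in for a durable audit sink


def audit(agent: str, role: str, method: str, tool: Optional[str], allowed: bool) -> None:
    """Append one JSON audit record per MCP request, allowed or not."""
    AUDIT_LOG.append(json.dumps({
        "ts": time.time(), "agent": agent, "role": role,
        "method": method, "tool": tool, "allowed": allowed,
    }))


def handle_mcp(agent: str, role: str, request: dict[str, Any],
               upstream_tools: list[dict[str, Any]]) -> dict[str, Any]:
    """Handle the two MCP methods this sketch covers: tools/list and tools/call."""
    granted = GRANTS.get(role, set())
    method = request["method"]
    if method == "tools/list":
        # Discovery: the agent only sees tools its role has been granted.
        audit(agent, role, method, None, True)
        return {"tools": [t for t in upstream_tools if t["name"] in granted]}
    if method == "tools/call":
        tool = request["params"]["name"]
        allowed = tool in granted
        audit(agent, role, method, tool, allowed)
        if not allowed:
            return {"error": f"role {role!r} may not call {tool!r}"}
        return {"forwarded": tool}  # a real gateway would proxy to the MCP server here
    audit(agent, role, method, None, False)
    return {"error": f"unsupported method {method!r}"}
```

Because denied calls are logged as well as allowed ones, the audit trail can answer both "what did the agent do?" and "what did it try to do?".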

LLM gateway

One endpoint in front of multiple model providers, with token budgeting, caching, and failover, so switching providers is a configuration change with no agent code to rewrite.
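A minimal sketch of failover and token budgeting follows, assuming each provider is just a callable that returns a completion and a token count. Provider, LLMGateway, and the two demo providers are hypothetical names, and caching is left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable


class BudgetExceeded(Exception):
    """Raised when the token budget has already been used up."""


@dataclass
class Provider:
    name: str
    call: Callable[[str], tuple[str, int]]  # returns (completion text, tokens used)


class LLMGateway:
    def __init__(self, providers: list[Provider], token_budget: int) -> None:
        self.providers = providers      # ordered by preference
        self.remaining = token_budget   # charged after each successful call

    def complete(self, prompt: str) -> tuple[str, str]:
        """Try providers in order; return (provider name, completion text)."""
        if self.remaining <= 0:
            raise BudgetExceeded("token budget exhausted")
        errors = []
        for provider in self.providers:
            try:
                text, tokens = provider.call(prompt)
            except Exception as exc:    # fail over on any provider error
                errors.append(f"{provider.name}: {exc}")
                continue
            self.remaining -= tokens
            return provider.name, text
        raise RuntimeError("all providers failed: " + "; ".join(errors))


# Demo providers (assumptions): one always times out, one echoes the prompt
# and reports one token per whitespace-separated word.
def flaky(prompt: str) -> tuple[str, int]:
    raise TimeoutError("upstream timed out")


def stable(prompt: str) -> tuple[str, int]:
    return f"echo: {prompt}", len(prompt.split())


gateway = LLMGateway([Provider("primary", flaky), Provider("fallback", stable)],
                     token_budget=5)
```

The agent only ever calls gateway.complete. Reordering, adding, or removing providers changes the list passed to the gateway and leaves the agent code alone, which is the "configuration change" described above.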

Inference routing

Latency- and cost-aware routing of inference across self-hosted model servers (e.g. vLLM, TGI, Triton) on your own GPUs.
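One way to picture latency- and cost-aware routing is a scoring function over the available backends. The sketch below is an assumption-heavy illustration: the Backend fields, the weights, and the scoring formula are invented for the example and are not how any particular gateway ranks model servers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    name: str
    p50_latency_ms: float       # recent median latency
    cost_per_1k_tokens: float   # any consistent currency unit
    queue_depth: int            # requests currently waiting
    healthy: bool = True


def score(b: Backend, latency_weight: float = 1.0, cost_weight: float = 100.0) -> float:
    """Lower is better. Queue depth inflates the expected latency."""
    expected_latency = b.p50_latency_ms * (1 + b.queue_depth)
    return latency_weight * expected_latency + cost_weight * b.cost_per_1k_tokens


def pick_backend(backends: list[Backend]) -> Backend:
    """Choose the healthy backend with the lowest combined score."""
    candidates = [b for b in backends if b.healthy]
    if not candidates:
        raise RuntimeError("no healthy inference backend")
    return min(candidates, key=score)


# Illustrative numbers only.
BACKENDS = [
    Backend("vllm-a", p50_latency_ms=120, cost_per_1k_tokens=0.4, queue_depth=3),
    Backend("vllm-b", p50_latency_ms=200, cost_per_1k_tokens=0.2, queue_depth=0),
    Backend("triton-c", p50_latency_ms=90, cost_per_1k_tokens=0.9, queue_depth=0,
            healthy=False),
]
```

With these numbers, pick_backend(BACKENDS) selects vllm-b: vllm-a is faster per request but has a queue, and triton-c is skipped because it is unhealthy. Tuning the two weights shifts the balance between latency and cost.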

Policy and observability

Security controls (mTLS, authn/authz, policy-as-code) and telemetry (OpenTelemetry) applied uniformly to agent and traditional traffic.

When should you use an agent gateway?

You have more than one model provider and want routing, budgeting, and failover without rewriting agents.
You expose tools to agents over MCP and need access control and an audit trail.
You run agents across frameworks and need them to communicate over A2A with consistent security.
You self-host inference and want latency- and cost-aware routing across your GPU capacity.
You need to govern or debug agent behavior — one place that can answer “what did the agent do?”

If your agentic system is a single agent calling a single tool, you do not need a gateway yet. The value shows up as soon as the system has more than one of anything.

Where Kubernetes fits

Agent gateways are typically built to run on Kubernetes. agentgateway, for instance, deploys via Helm and the Gateway API and is Envoy-compatible. The agentic data plane behaves like the rest of your platform. You deploy it declaratively, scale it horizontally, govern it with policy-as-code, and connect it to the networking you already run. Running it on Kubernetes keeps your agents’ most sensitive traffic on infrastructure you already operate and control, under one security and operational model.

Next steps

This is the first tutorial in the Agent Gateway on Kubernetes series. Next, we will deploy an agent gateway onto a cluster and route our first MCP tool call through it.

Upcoming in this series: Installing an Agent Gateway on Kubernetes → Your First MCP Gateway → Routing LLM Traffic Across Providers → Agent-to-Agent Communication → Securing Agent Traffic with Policy and Observability.

Summary

An agent gateway is a unified data plane for agent traffic: MCP, agent-to-agent, LLM inference, and ordinary service calls.
It centralizes routing, security, and observability so each agent does not reinvent them — and it is the natural enforcement point for which agent may do what.
The main capabilities are MCP tool governance, multi-provider LLM routing, GPU-aware inference routing, A2A bridging, and unified policy/observability.
It is typically built to run on Kubernetes, keeping the agentic data plane on infrastructure you operate.