How API Gateways Differ from Load Balancers
Modern platform architectures frequently conflate traffic distribution with application routing. While both components sit at the network edge, their operational boundaries are strictly defined by the OSI layer they inspect and the policies they enforce. Load balancers primarily manage connection distribution and health checking at L3/L4, with optional L7 passthrough capabilities. API gateways function as application-aware reverse proxies that parse payloads, translate protocols, and execute declarative routing policies. Understanding this divergence is foundational when designing resilient edge topologies, as detailed in API Gateway Fundamentals & Architecture.
Traffic Distribution vs. Application-Aware Routing
Load balancers select a backend using connection-level algorithms such as round-robin, least connections, or IP hash; L7-capable balancers add basic HTTP host or path matching. They forward traffic largely unmodified, preserving client headers and payloads. API gateways intercept the full HTTP request, parse headers, decode bodies, and route based on application semantics. This enables content-based routing: directing traffic by JWT claims, gRPC method names, GraphQL operation types, or custom header values.
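The balancing algorithms named above can be sketched in a few lines. This is a minimal illustration, not any particular load balancer's implementation; the class and function names (`RoundRobin`, `LeastConnections`, `ip_hash`) are hypothetical. Note that none of them looks at request content.

```python
import hashlib
import itertools


class RoundRobin:
    """Cycle through backends in order, ignoring request content."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the backend with the fewest in-flight connections."""

    def __init__(self, backends):
        self.active = {b: 0 for b in backends}

    def pick(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1  # caller must release() when the connection closes
        return backend

    def release(self, backend):
        self.active[backend] -= 1


def ip_hash(client_ip, backends):
    """Map a client IP to a stable backend via a hash of the address."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]
```

All three decisions depend only on connection-level state (position in a cycle, open connection counts, source address), which is why they can run without parsing the request.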
Routing Evaluation Workflow:
- Intercept & Buffer: Gateway terminates the client connection and buffers the request body.
- Extract Routing Keys: Parse Authorization (JWT claims), Content-Type (protocol type), or custom headers (X-Tenant-ID).
- Match Conditions: Evaluate extracted keys against declarative routing tables.
- Dispatch: Establish upstream connection only after policy validation passes.
Declarative Routing Syntax (Production Example):
routes:
  - match:
      headers:
        - name: "x-api-version"
          exact_match: "v2"
      jwt_claim:
        - name: "role"
          exact_match: "admin"
    route:
      cluster: "admin-service-v2"
      timeout: 500ms
The routing engine evaluates these conditions before establishing upstream connections, introducing measurable but necessary latency for policy enforcement.
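The evaluation order described above (extract keys, match conditions, then dispatch) can be sketched as follows. This is an illustrative model only: `ROUTES`, `DEFAULT_CLUSTER`, `jwt_claims`, and `select_cluster` are hypothetical names, and the JWT helper decodes claims without verifying the signature, on the assumption that verification already happened in an earlier authentication step.

```python
import base64
import json


def jwt_claims(token):
    """Decode the payload segment of a JWT WITHOUT verifying the signature.

    Assumption: signature verification already happened earlier in the
    pipeline; this helper only extracts routing keys.
    """
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


# Hypothetical routing table mirroring the declarative example above.
ROUTES = [
    {
        "headers": {"x-api-version": "v2"},
        "jwt_claims": {"role": "admin"},
        "cluster": "admin-service-v2",
    },
]
DEFAULT_CLUSTER = "default-service"  # hypothetical fallback cluster


def select_cluster(headers, routes=ROUTES):
    """Return the first cluster whose header and claim conditions all match."""
    headers = {k.lower(): v for k, v in headers.items()}
    auth = headers.get("authorization", "")
    claims = jwt_claims(auth[len("Bearer "):]) if auth.startswith("Bearer ") else {}
    for route in routes:
        headers_ok = all(headers.get(k) == v for k, v in route["headers"].items())
        claims_ok = all(claims.get(k) == v for k, v in route["jwt_claims"].items())
        if headers_ok and claims_ok:
            return route["cluster"]
    return DEFAULT_CLUSTER
```

Only after `select_cluster` returns would a gateway open (or reuse) an upstream connection, which is where the added latency comes from.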
Protocol Translation & Payload Mutation
A critical architectural differentiator is protocol bridging. Load balancers generally preserve the application protocol; an HTTP/1.1 request forwarded through an L7 LB typically remains HTTP/1.1. Gateways actively translate protocols: HTTP/1.1 to HTTP/2, REST to gRPC, or synchronous HTTP to asynchronous message queues.
During translation, gateways mutate headers (Host, X-Forwarded-Proto, Content-Length), manage connection pooling, and handle chunked encoding. Misconfigured translation often triggers upstream 400 Bad Request, 413 Payload Too Large, or 502 Bad Gateway responses when the backend expects unmodified payloads.
Header Mutation & Bridging Configuration:
# Gateway acts as a protocol adapter, not a transparent pipe
location /api/v1/grpc {
    grpc_pass grpc://upstream_grpc_cluster;
    grpc_set_header X-Forwarded-Proto $scheme;
    grpc_set_header Content-Length "";  # Drop the inbound length; gRPC messages carry their own length prefix
    grpc_set_header TE "trailers";      # gRPC expects "te: trailers"; stripping it can break strict servers
}
Exact syntax for bridging varies by implementation (e.g., NGINX grpc_pass, Envoy's gRPC-JSON transcoder filter, or Kong's grpc-gateway plugin), but the underlying behavior remains consistent. Validate that upstream services accept gateway-injected headers and that payload size limits align across the gateway and backend.
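To make the mutation concrete, here is a minimal sketch of what a bridge does to headers and body when forwarding to a gRPC upstream. It assumes the message has already been serialized to protobuf bytes (real transcoding also converts JSON to protobuf, which is omitted here), and the function names `grpc_frame` and `bridge_headers` are hypothetical. The framing itself follows the gRPC wire format: a 1-byte compression flag, a 4-byte big-endian length, then the message.

```python
import struct

# Headers that should not be copied verbatim onto the HTTP/2 upstream request.
DROPPED = {"connection", "keep-alive", "transfer-encoding", "upgrade", "content-length"}


def grpc_frame(message: bytes, compressed: bool = False) -> bytes:
    """Wrap a serialized message in gRPC's length-prefixed framing."""
    return struct.pack(">BI", int(compressed), len(message)) + message


def bridge_headers(headers: dict, scheme: str) -> dict:
    """Rewrite inbound HTTP/1.1 headers for a gRPC (HTTP/2) upstream."""
    out = {k.lower(): v for k, v in headers.items() if k.lower() not in DROPPED}
    out["x-forwarded-proto"] = scheme          # tell upstream the original scheme
    out["content-type"] = "application/grpc"   # gRPC requests use this media type
    out["te"] = "trailers"                     # gRPC over HTTP/2 expects te: trailers
    return out
```

Because the body length changes after framing, a stale Content-Length copied from the client would no longer match the forwarded payload, which is one way mismatched translation produces upstream 400 responses.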
Policy Enforcement & Security Boundaries
Load balancers rely on external WAFs, IAM proxies, or custom middleware for security. Gateways embed policy execution directly in the data plane. They terminate mTLS, validate OIDC tokens, enforce token-bucket rate limiting, and apply request/response transformation rules before traffic reaches upstream services. This shifts the security perimeter inward, enabling zero-trust architectures where identity verification precedes routing.
Policy Enforcement Workflow:
- Terminate TLS: Validate client certificate chain and extract SPIFFE/SAN identity.
- Authenticate: Verify OIDC/JWT signature against JWKS endpoint. Cache validation results to reduce latency.
- Authorize: Match extracted claims (sub, scope, roles) against route-level ACLs.
- Rate Limit: Apply sliding window or token bucket limits keyed by client_id or IP.
- Transform: Strip sensitive headers, inject tracing spans, and normalize payloads.
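As an illustration of the rate-limit step, the following is a minimal token-bucket sketch keyed by an arbitrary string such as client_id or IP. It is not a production limiter (no locking, no eviction of idle keys), and `TokenBucket`, `RateLimiter`, and the injectable `clock` parameter are hypothetical names introduced for this example.

```python
import time


class TokenBucket:
    """Refill at `rate` tokens per second up to `capacity`; each request spends one."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so bursts are allowed immediately
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # a gateway would answer 429 here


class RateLimiter:
    """Keep one bucket per key (for example client_id or client IP)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self._args = (rate, capacity, clock)
        self._buckets = {}

    def allow(self, key):
        if key not in self._buckets:
            self._buckets[key] = TokenBucket(*self._args)
        return self._buckets[key].allow()
```

The clock is injectable so the refill behavior can be checked deterministically; with the default `time.monotonic`, `RateLimiter(rate=10, capacity=20)` would permit bursts of 20 requests per key and a sustained 10 requests per second.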
When evaluating throughput vs. latency tradeoffs for policy-heavy deployments, consult Gateway Selection Criteria to align enforcement depth with SLO requirements.
Common Failure Modes & Troubleshooting
Routing misconfigurations manifest predictably. Use this diagnostic workflow to isolate and resolve upstream failures.
Step-by-Step Resolution:
- 502 Bad Gateway after mTLS termination:
  - Root Cause: Upstream certificate mismatch or missing X-Forwarded-Client-Cert headers.
  - Resolution: Verify upstream CA bundle. Ensure gateway injects X-Forwarded-Client-Cert (PEM-encoded) or configure upstream to skip client cert validation if handled at the edge.
- 429 Rate Limit False Positives:
  - Root Cause: Rate limiter reads X-Real-IP incorrectly behind a transparent LB.
  - Resolution: Configure gateway to trust X-Forwarded-For chain. Set trusted_proxies CIDR ranges to prevent IP spoofing.
- gRPC Deadline Exceeded:
  - Root Cause: Gateway connection pool exhaustion or missing grpc-timeout header propagation.
  - Resolution: Increase max_connections_per_host. Ensure grpc-timeout is forwarded or mapped to upstream deadline metadata.
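For the 429 case, the trusted-proxy logic can be sketched as below. This is one common approach, not a specific gateway's algorithm: walk X-Forwarded-For from right to left and take the first address that is not inside a trusted proxy range. The function name `client_ip` and its parameters are hypothetical, and malformed addresses raise `ValueError` rather than being handled.

```python
import ipaddress


def client_ip(peer_ip, x_forwarded_for, trusted_proxies):
    """Resolve the rate-limit key from the socket peer and X-Forwarded-For."""
    trusted = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def is_trusted(ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in trusted)

    # If the direct peer is not a known proxy, the header is untrustworthy.
    if not is_trusted(peer_ip):
        return peer_ip

    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # The rightmost entries were appended by our own proxies; skip them.
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    # Every hop was a trusted proxy (or the header was empty).
    return hops[0] if hops else peer_ip
```

Reading right to left matters: entries on the left are client-supplied and can be forged, so `client_ip("10.0.0.5", "1.2.3.4, 203.0.113.7, 10.0.0.9", ["10.0.0.0/8"])` resolves to `203.0.113.7`, not the spoofed `1.2.3.4`.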
Diagnostic Execution:
Enable structured (JSON) access logs that include request_id and the upstream attempt count (for example, Envoy's x-envoy-attempt-count header), then filter for failed upstream attempts:
# Assumes one JSON object per line with a numeric "status" field
jq -c 'select(.status >= 500)' /var/log/gateway/access.log
Verify circuit breaker thresholds against LB health check intervals to prevent cascading failures during partial outages.
Architectural Decision Matrix
Deploy load balancers for raw TCP/UDP distribution, high-throughput L7 passthrough, and multi-AZ failover with minimal inspection overhead. Deploy API gateways when you require protocol translation, fine-grained auth, developer portal integration, or declarative routing policies.
Production Topology (Layered Approach):
Client -> [L4/L7 Load Balancer] -> [API Gateway Cluster] -> [Upstream Services]
- L4/L7 LB: Terminates public traffic, handles DDoS mitigation, manages connection pooling, and performs basic health checks.
- API Gateway: Receives pre-filtered traffic, executes application routing, enforces security policies, and handles protocol translation.
In production, load balancers and API gateways are frequently chained. This layered approach isolates connection distribution from application logic, simplifying scaling and failure isolation.