API Gateway
An API Gateway is the front door to your microservices. It handles cross-cutting concerns that every service needs: authentication, rate limiting, SSL termination, request routing, and response transformation. Without it, every microservice implements auth independently.
§1Step 1 — Understand the Problem
§2Step 2 — High-Level Design
Client → Load Balancer → API Gateway (Auth Check via Redis, Rate Limit Counter via Redis) → [User Service | Product Service | Order Service]
Every client request hits the load balancer first, which distributes traffic across multiple gateway instances. Each gateway instance validates the Bearer token against Redis (sub-millisecond lookup), checks the rate limit counter, then forwards the request to the appropriate upstream service.
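The token check described above can be sketched as follows. This is a minimal illustration, not the gateway's actual code: `TokenCache` is a hypothetical in-memory stand-in for the Redis lookup, and `authenticate` is an assumed helper name. The cache entry expires at the same moment as the token, so an expired token simply stops resolving.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Hypothetical in-memory stand-in for the Redis token cache."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._entries: dict[str, tuple[str, float]] = {}
        self._clock = clock

    def put(self, token: str, user_id: str, expires_at: float) -> None:
        # Mirrors a Redis key whose TTL matches the token's expiry.
        self._entries[token] = (user_id, expires_at)

    def get(self, token: str) -> Optional[str]:
        entry = self._entries.get(token)
        if entry is None:
            return None
        user_id, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry, as a Redis TTL would.
            del self._entries[token]
            return None
        return user_id


def authenticate(authorization: Optional[str], cache: TokenCache) -> Optional[str]:
    """Return the user ID for a valid Bearer token, or None to reject."""
    if not authorization:
        return None
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return cache.get(token.strip())
```

A `None` result would map to a 401 response before the rate limit check or any upstream call happens.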
# All requests pass through the gateway on port 443
POST /v1/users/register → User Service
GET /v1/users/:id → User Service
GET /v1/products → Product Service
GET /v1/products/:id → Product Service
POST /v1/orders → Order Service
GET /v1/orders/:id → Order Service
# Headers added by gateway before forwarding:
X-User-ID: <validated-user-id>
X-Request-ID: <uuid>
X-Forwarded-For: <client-ip>

The gateway strips the Authorization header before forwarding, so services receive a trusted X-User-ID header instead. This keeps JWT validation logic in one place and prevents services from accidentally trusting unvalidated tokens.
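The routing and header handling above could look roughly like this in Python. The upstream names (`user-service` and so on) and the function names are assumptions for illustration. Dropping any client-supplied copies of the gateway-owned headers is also an assumption, added so a caller cannot spoof `X-User-ID`.

```python
import re
import uuid
from typing import Optional

# Route table from the listing above: (method, path pattern, upstream).
# Upstream names are hypothetical.
ROUTES = [
    ("POST", r"^/v1/users/register$", "user-service"),
    ("GET", r"^/v1/users/[^/]+$", "user-service"),
    ("GET", r"^/v1/products$", "product-service"),
    ("GET", r"^/v1/products/[^/]+$", "product-service"),
    ("POST", r"^/v1/orders$", "order-service"),
    ("GET", r"^/v1/orders/[^/]+$", "order-service"),
]

# Headers the gateway removes or sets itself; client copies are discarded.
GATEWAY_OWNED = {"authorization", "x-user-id", "x-request-id", "x-forwarded-for"}


def match_route(method: str, path: str) -> Optional[str]:
    """Return the upstream for a request, or None if no route matches."""
    for route_method, pattern, upstream in ROUTES:
        if route_method == method.upper() and re.match(pattern, path):
            return upstream
    return None


def build_upstream_headers(
    headers: dict[str, str], user_id: str, client_ip: str
) -> dict[str, str]:
    """Strip Authorization and attach the trusted gateway headers."""
    forwarded = {
        name: value
        for name, value in headers.items()
        if name.lower() not in GATEWAY_OWNED
    }
    forwarded["X-User-ID"] = user_id
    forwarded["X-Request-ID"] = str(uuid.uuid4())
    forwarded["X-Forwarded-For"] = client_ip
    return forwarded
```

Services behind the gateway should only be reachable through it; otherwise a caller could bypass this step and set `X-User-ID` directly.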
§3Step 3 — Deep Dive
Two decisions dominate gateway performance: how you validate tokens and which rate limiting algorithm you use. Both have hard latency requirements and must survive Redis being temporarily unavailable.
| Algorithm | Memory per client | Allows burst? | Accuracy | Winner? |
|---|---|---|---|---|
| Token Bucket | ~24 bytes | Yes (up to bucket size) | Exact | No: complex to implement in Redis atomically |
| Leaky Bucket | ~16 bytes | No (fixed output rate) | Exact | No: rejects legitimate burst traffic |
| Fixed Window Counter | ~8 bytes | 2× at boundary | Approximate | No: a client can send 2× the limit across a window reset |
| Sliding Window Counter | ~16 bytes | Partial (weighted) | ~99% accurate | Yes: two counters plus INCR + EXPIRE, no boundary exploit |
Rate limiting algorithm comparison — pick sliding window counter for this system.
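To make the "weighted" column concrete, here is a minimal in-memory sketch of a sliding window counter in Python. The class and method names are assumptions, and a plain dict stands in for Redis. The estimate is the previous window's count, scaled by how much of that window still overlaps the sliding window, plus the current window's count.

```python
import time
from typing import Callable


class SlidingWindowCounter:
    """In-memory sketch; stale counters are not expired here (Redis would use a TTL)."""

    def __init__(
        self, limit: int, window: int, clock: Callable[[], float] = time.time
    ) -> None:
        self.limit = limit
        self.window = window  # window length in seconds
        self._clock = clock
        self._counts: dict[tuple[str, int], int] = {}  # (client, window index) -> count

    def allow(self, client: str) -> bool:
        now = self._clock()
        index = int(now // self.window)
        elapsed = now - index * self.window  # seconds into the current window
        prev = self._counts.get((client, index - 1), 0)
        curr = self._counts.get((client, index), 0)
        # Weight the previous window by the fraction still inside the sliding window.
        estimate = prev * (self.window - elapsed) / self.window + curr
        if estimate >= self.limit:
            return False
        self._counts[(client, index)] = curr + 1
        return True
```

With a limit of 10 per 60 seconds, a client that used all 10 requests in one window is still fully blocked at the start of the next one and gets only 5 requests once half of it has elapsed. A fixed window counter would have allowed another 10 immediately.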
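-- Called on every request. Returns 1 if allowed, 0 if rate limited.
-- Sliding window counter: the previous window's count is weighted by how much
-- of it still overlaps the sliding window, then added to the current count.
local curr_key = KEYS[1] -- e.g. "rl:user123:28433334" (current window index)
local prev_key = KEYS[2] -- e.g. "rl:user123:28433333" (previous window index)
local limit = tonumber(ARGV[1]) -- e.g. 100
local window = tonumber(ARGV[2]) -- e.g. 60 (seconds)
local elapsed = tonumber(ARGV[3]) -- seconds since the current window started
local prev = tonumber(redis.call('GET', prev_key) or '0')
local curr = tonumber(redis.call('GET', curr_key) or '0')
local estimate = prev * (window - elapsed) / window + curr
if estimate >= limit then
  return 0 -- rate limited
end
redis.call('INCR', curr_key)
-- Keep the counter for two windows so it can serve as "previous" next time.
redis.call('EXPIRE', curr_key, window * 2)
return 1 -- allowed

| Option | Latency overhead | Ops complexity | Plugin ecosystem | Best for |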
|---|---|---|---|---|
| NGINX + Lua | <1ms | Low | Manual | Simple routing, low traffic (<1K RPS) |
| Kong (OSS) | 1–3ms | Medium | Rich (150+ plugins) | This system — 10K RPS, plugin-driven auth |
| Envoy Proxy | 1–2ms | High | gRPC-first | Service mesh, polyglot microservices |
| AWS API Gateway | 5–15ms | Very low | AWS-native | Serverless backends, AWS lock-in acceptable |
| Custom Go proxy | <0.5ms | High | None | Latency-critical, unique requirements |
Gateway implementation options — Kong wins for this scale.
§4Step 4 — Wrap Up
| Decision | Choice | Why |
|---|---|---|
| Auth strategy | JWT validated at gateway, X-User-ID forwarded | Centralized auth, services stay stateless |
| Token cache | Redis with TTL matching JWT expiry | Sub-ms validation, automatic invalidation on expiry |
| Rate limiting | Sliding window counter in Redis Lua script | Atomic, accurate, no boundary exploits |
| Gateway software | Kong OSS | Plugin ecosystem covers auth, rate limiting, logging out of the box |
| Availability | 3+ instances + Redis fallback to in-memory | No single point of failure on the critical path |
Key decisions summary.
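The "Redis fallback to in-memory" row can be sketched as a thin wrapper. This is one possible shape, not a prescribed design: `remote_check` is a hypothetical callable that runs the Redis-backed check, and the caller is assumed to surface Redis failures as the built-in `ConnectionError` or `TimeoutError`. The fallback here is a plain per-instance fixed window counter, chosen for simplicity.

```python
import time
from typing import Callable


class FallbackRateLimiter:
    """Use the shared limiter when reachable, a local counter when it is not."""

    def __init__(
        self,
        remote_check: Callable[[str], bool],  # hypothetical Redis-backed check
        limit: int,
        window: int,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._remote_check = remote_check
        self.limit = limit
        self.window = window  # window length in seconds
        self._clock = clock
        self._local: dict[tuple[str, int], int] = {}

    def allow(self, client: str) -> bool:
        try:
            return self._remote_check(client)
        except (ConnectionError, TimeoutError):
            # Assumption: Redis client errors are translated to these types.
            return self._allow_locally(client)

    def _allow_locally(self, client: str) -> bool:
        index = int(self._clock() // self.window)
        # Drop counters from earlier windows so memory stays bounded.
        self._local = {k: v for k, v in self._local.items() if k[1] == index}
        count = self._local.get((client, index), 0) + 1
        self._local[(client, index)] = count
        return count <= self.limit
```

Because each gateway instance counts on its own during an outage, the effective limit can rise to roughly the per-client limit times the number of instances. Dividing the local limit by the instance count is one way to compensate.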