URL Shortener
§1Step 2 — High-Level Design
Build a URL shortening service from a single server to a globally distributed system handling billions of redirects.
Place Redis between API servers and database to cache the mapping of short codes to original URLs.
Redis is an in-memory key-value store, commonly used as a cache, with sub-millisecond lookup times. For URL shorteners, it caches the mapping of short codes (e.g., 'abc123') to full URLs.
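URL expansion is a hot-path read, and traffic is skewed: roughly 80% of requests hit 20% of URLs. Caching these hot URLs saves a 10-50ms database round trip on each hit. A single Redis node can typically serve on the order of 100k reads/sec at sub-millisecond latency (more with pipelining), which is far above the 1000 RPS considered here.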
Cache invalidation is necessary if a URL is modified or deleted. Use TTL-based expiry (typically somewhere between 24 hours and 30 days) as a safety net, and invalidate the cache entry explicitly on update or delete.
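The two invalidation strategies can be sketched with a small in-memory stand-in for the cache. This is an illustration under stated assumptions, not the service's actual code: `TtlCache`, its method names, and the injectable clock are hypothetical. With real Redis, the equivalent operations would be a write with an expiry and a key deletion.

```typescript
// In-memory stand-in for a cache with TTL expiry and explicit invalidation.
// TtlCache and the injectable clock are illustrative names, not a library API.
type CacheEntry = { value: string; expiresAt: number }

class TtlCache {
  private entries = new Map<string, CacheEntry>()
  private now: () => number

  constructor(now: () => number = () => Date.now()) {
    this.now = now
  }

  set(key: string, value: string, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 })
  }

  get(key: string): string | null {
    const entry = this.entries.get(key)
    if (!entry) return null
    if (this.now() >= entry.expiresAt) {
      // TTL-based expiry: stale entries are dropped lazily on read.
      this.entries.delete(key)
      return null
    }
    return entry.value
  }

  // Explicit invalidation: call this when a URL is modified or deleted.
  invalidate(key: string): void {
    this.entries.delete(key)
  }
}
```

The injected clock only exists to make expiry easy to exercise in a test; a production cache would rely on the store's own expiry.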
Bit.ly caches 95% of URL expansions in Redis. TinyURL caches top 10k URLs. Twitter's t.co uses multi-level caching to serve 150k redirects/sec from memory.
A 16GB Redis node caches ~4M average-sized URL mappings. At 80/20 distribution, this covers the hot set for services up to 10M total URLs.
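The sizing estimate above can be checked with simple arithmetic. The sketch below assumes a budget of about 4 KB per cached mapping, which is the figure implied by 16 GB holding roughly 4M entries; it is not a measured Redis value, and the function names are hypothetical.

```typescript
// Back-of-envelope cache sizing. The per-entry budget is an assumption implied
// by the figures above (16 GB / ~4M entries), not a measured Redis value.
const GB = 1e9

function cacheableEntries(memoryBytes: number, bytesPerEntry: number): number {
  return Math.floor(memoryBytes / bytesPerEntry)
}

// With an 80/20 distribution, the hot set is 20% of all URLs.
function hotSetSize(totalUrls: number, hotPercent: number = 20): number {
  return Math.ceil((totalUrls * hotPercent) / 100)
}

const capacity = cacheableEntries(16 * GB, 4000) // 4,000,000 entries
const hotSet = hotSetSize(10000000)              // 2,000,000 entries
console.log({ capacity, hotSet, fits: hotSet <= capacity })
```

Under these assumptions the hot set of a 10M-URL service uses about half of the node, so the estimate leaves headroom.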
Distribute database read load across a primary and replica. Writes go to primary; reads go to replica.
A read replica is a database copy that receives updates from the primary and serves read-only queries. At a 100:1 read-to-write ratio, splitting reads across replicas increases read throughput roughly linearly with the number of replicas.
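A single Postgres instance tops out at ~3000 queries/sec depending on hardware. At 1000 RPS with a 100:1 read-to-write ratio, up to about 990 reads/sec can reach the database when the cache is cold or unavailable, so a 3x traffic spike is enough to saturate a single instance. Replicas multiply read capacity.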
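Replication lag (typically < 10ms) means replicas serve slightly stale data. For URL shorteners this is acceptable because URLs rarely change. The main edge case is a redirect for a link created moments earlier, which may not have reached the replica yet; retrying the lookup on the primary after a replica miss covers it.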
Bit.ly uses read replicas for user analytics queries while keeping redirects in cache. Large services often deploy 3-5 replicas per primary.
Each replica adds ~3000 read queries/sec of capacity. At ~990 reads/sec (1000 RPS at a 100:1 ratio), you need at least 1 replica; at peak, 2-3 replicas are recommended.
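The replica estimate can be reproduced with a rough calculation. This is a sketch under stated assumptions: the ~3000 queries/sec per replica comes from the text, while the 50% target utilization, the 80% cache hit ratio during a spike, and the function names are hypothetical choices for illustration.

```typescript
// Reads per second that reach the database, given total traffic, the
// read-to-write ratio (100 means 100:1), and the cache hit ratio.
function dbReadRps(totalRps: number, readToWriteRatio: number, cacheHitRatio: number): number {
  const reads = (totalRps * readToWriteRatio) / (readToWriteRatio + 1)
  return reads * (1 - cacheHitRatio)
}

// Replicas needed to keep each one below a target utilization.
// targetUtilization is a hypothetical headroom setting, not a figure from the text.
function replicasNeeded(readRps: number, perReplicaQps: number, targetUtilization: number): number {
  return Math.max(1, Math.ceil(readRps / (perReplicaQps * targetUtilization)))
}

const PER_REPLICA_QPS = 3000 // the per-replica estimate quoted above

// Steady state: 1000 RPS at 100:1 with a cold cache (hit ratio 0) as a worst case.
const steady = replicasNeeded(dbReadRps(1000, 100, 0), PER_REPLICA_QPS, 0.5)
// Spike: 10k RPS with an assumed 80% cache hit ratio.
const spike = replicasNeeded(dbReadRps(10000, 100, 0.8), PER_REPLICA_QPS, 0.5)
console.log({ steady, spike })
```

Changing the assumed hit ratio or utilization target shifts the result quickly, which is why the replica count is best treated as a range rather than a fixed number.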
Pre-generate unique short codes in a background worker to avoid contention on the database during URL creation.
A background worker service pre-generates unique short codes (e.g., using base-62 encoding of a counter) and stores them in a pool. API servers consume from this pool during URL creation.
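As an illustration of base-62 encoding of a counter, here is a minimal sketch. The alphabet ordering and the function names are assumptions; any fixed ordering of the 62 characters works as long as the encoder and decoder agree.

```typescript
// Alphabet ordering is an arbitrary choice for this sketch.
const BASE62 = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'

// Encode a non-negative integer counter as a base-62 string.
function encodeBase62(counter: number): string {
  if (!Number.isSafeInteger(counter) || counter < 0) {
    throw new RangeError('counter must be a non-negative safe integer')
  }
  if (counter === 0) return BASE62.charAt(0)
  let n = counter
  let out = ''
  while (n > 0) {
    out = BASE62.charAt(n % 62) + out // prepend the least significant digit
    n = Math.floor(n / 62)
  }
  return out
}

// Decode a base-62 string back to the counter value.
function decodeBase62(code: string): number {
  let n = 0
  for (const ch of code) {
    const digit = BASE62.indexOf(ch)
    if (digit < 0) throw new RangeError(`invalid base-62 character: ${ch}`)
    n = n * 62 + digit
  }
  return n
}

console.log(encodeBase62(62), decodeBase62('zz')) // prints "10 3843"
```

Because the codes follow the counter, they are sequential and guessable; whether that matters depends on the service's privacy requirements.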
Generating unique IDs requires coordination. A naive counter has a single point of contention; a worker-based pool reduces lock contention and enables offline generation.
Requires an extra service and coordination mechanism. If ID generation fails, the pool depletes. Implement backpressure to stop accepting new URLs when the pool is low.
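A minimal in-memory sketch of the pool and its backpressure signal follows. `IdPool`, `lowWatermark`, and the method names are hypothetical, and a real deployment would keep the pool in shared storage rather than in process memory.

```typescript
// In-memory sketch of a pre-generated ID pool with a low-watermark check.
// All names here are illustrative; this is not a specific library's API.
class IdPool {
  private ids: string[] = []
  private lowWatermark: number

  constructor(lowWatermark: number) {
    this.lowWatermark = lowWatermark
  }

  // Called by the background worker to top up the pool.
  refill(newIds: string[]): void {
    for (const id of newIds) this.ids.push(id)
  }

  // True when the pool is at or below the low watermark, signalling that
  // the worker should generate more IDs.
  needsRefill(): boolean {
    return this.ids.length <= this.lowWatermark
  }

  // Called by API servers during URL creation. Returns null when the pool is
  // empty so the caller can reject the request instead of blocking.
  take(): string | null {
    return this.ids.pop() ?? null
  }
}
```

An API server would call `take()` per creation request and return an error (for example HTTP 503) on `null`, while the worker polls `needsRefill()`.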
YouTube uses a distributed ID generation service (Snowflake-style). Uber generates IDs in batches to reduce coordination overhead.
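A worker can pre-generate 100k IDs/sec. A pool of 1M pre-generated IDs provides about 1000 seconds (roughly 16 minutes) of runway even if all 1000 RPS were URL creations; at a 100:1 read-to-write ratio (about 10 creations/sec) it lasts more than a day.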
§2Step 3 — Deep Dive
| Approach | Read latency | Write complexity | Collision risk | Best for | Cost | Ops burden |
|---|---|---|---|---|---|---|
| Random ID (6 chars) | < 5ms with cache | Low | Low at small scale | Simple shorteners ✓ | Low | Low |
| Base62 counter | < 5ms with cache | Low | None | Sequential, predictable IDs | Low | Low |
| MD5/SHA hash (truncated) | < 5ms with cache | Low | Yes, needs retry | Deduplication needed | Low | Low |
| Custom alias | < 5ms with cache | Medium | None (user-defined) | Branded short links | Low | Low |
| Snowflake ID | < 5ms with cache | High | None | Distributed, time-ordered | Low | Low |
URL Shortener: short-code generation trade-offs.
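To put a number on "low at small scale" for random 6-character IDs, the birthday-bound approximation gives the chance that at least two codes collide. The sketch assumes codes are drawn uniformly from a 62-character alphabet; the function name is hypothetical.

```typescript
// Birthday-bound approximation: probability that at least two of n random
// codes collide when drawn uniformly from `space` possible codes.
function collisionProbability(n: number, space: number): number {
  // 1 - exp(-n(n-1) / 2N), written with expm1 for accuracy at small values.
  return -Math.expm1(-(n * (n - 1)) / (2 * space))
}

const SPACE_62_6 = Math.pow(62, 6) // 56,800,235,584 possible 6-char base-62 codes

for (const n of [10000, 100000, 1000000]) {
  console.log(n, collisionProbability(n, SPACE_62_6).toFixed(4))
}
```

Under this approximation, a collision somewhere in the table becomes likely well before the code space is full, which is why random IDs need a uniqueness check and retry as the service grows.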
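```typescript
import { randomInt } from 'node:crypto'
import { createClient } from 'redis'
import { Pool } from 'pg'

const redis = createClient()
const pg = new Pool({ connectionString: process.env.DATABASE_URL })

// node-redis v4+ rejects commands until the client is connected.
// Top-level await requires an ES module.
await redis.connect()

const ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

export async function redirect(shortCode: string): Promise<string | null> {
  // 1. Check the Redis cache first (hot URLs are served without touching Postgres)
  const cached = await redis.get(`url:${shortCode}`)
  if (cached !== null) return cached

  // 2. Fall back to Postgres
  const { rows } = await pg.query(
    'SELECT original_url FROM urls WHERE short_code = $1',
    [shortCode]
  )
  if (!rows[0]) return null

  // 3. Populate the cache with a 24h TTL (86400 seconds)
  await redis.setEx(`url:${shortCode}`, 86400, rows[0].original_url)
  return rows[0].original_url
}

export async function shorten(originalUrl: string): Promise<string> {
  // Retry on collision; assumes a UNIQUE constraint on urls.short_code.
  for (let attempt = 0; attempt < 5; attempt++) {
    // Always 6 base-62 characters (Math.random().toString(36) can return fewer)
    const shortCode = Array.from({ length: 6 }, () => ALPHABET[randomInt(62)]).join('')
    const { rowCount } = await pg.query(
      'INSERT INTO urls (short_code, original_url) VALUES ($1, $2) ON CONFLICT (short_code) DO NOTHING',
      [shortCode, originalUrl]
    )
    if (rowCount === 1) return shortCode
  }
  throw new Error('Could not allocate a unique short code')
}
```

| Component | Why Add It | Tradeoff |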
|---|---|---|
| Redis Cache | URL expansion is a hot-path read. | Cache invalidation is necessary if a URL is modified or deleted. |
| Read Replica | A single Postgres instance tops out at ~3000 queries/sec depending on hardware. | Replication lag (typically < 10ms) means replicas serve slightly stale data. |
| Worker Service for ID Generation | Generating unique IDs requires coordination. | Requires an extra service and coordination mechanism. |
Design decision tradeoffs
Primary database goes down. Can users still access previously shortened URLs from cache?
Redis cache crashes. Thousands of concurrent redirect requests hit the database simultaneously. Does the system handle the load spike without cascading failure?
Viral URL creates 10x traffic spike (10k RPS). Load balancer is overwhelmed. Does the system gracefully degrade or queue requests? What's the max sustainable load?
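For the cache-crash scenario above, one common mitigation (not something the design here specifies) is request coalescing, sometimes called single-flight: concurrent misses for the same short code share one database lookup instead of each issuing their own. A minimal sketch, where `SingleFlight` and `loader` are hypothetical names and `loader` stands in for the Postgres query:

```typescript
// Single-flight loader: concurrent requests for the same key share one
// backend call. All names here are illustrative.
class SingleFlight<V> {
  private inFlight = new Map<string, Promise<V>>()
  private loader: (key: string) => Promise<V>

  constructor(loader: (key: string) => Promise<V>) {
    this.loader = loader
  }

  get(key: string): Promise<V> {
    const pending = this.inFlight.get(key)
    if (pending) return pending // join the lookup already in progress

    const lookup = this.loader(key).then(
      (value) => {
        this.inFlight.delete(key) // later requests trigger a fresh lookup
        return value
      },
      (error) => {
        this.inFlight.delete(key) // do not cache failures
        throw error
      }
    )
    this.inFlight.set(key, lookup)
    return lookup
  }
}
```

This bounds the database load to one query per distinct short code at a time, which helps most when a few hot URLs dominate traffic. It does not replace capacity planning for the database itself.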
§3Step 4 — Wrap Up
| Decision | Choice | Why |
|---|---|---|
| Redis Cache | Redis is an in-memory key-value store, commonly used as a cache, with sub-millisecond lookup times. | URL expansion is a hot-path read. |
| Read Replica | A read replica is a database copy that receives updates from the primary and serves read-only queries. | A single Postgres instance tops out at ~3000 queries/sec depending on hardware. |
| Worker Service for ID Generation | A background worker service pre-generates unique short codes (e.g., using base-62 encoding of a counter) and stores them in a pool. | Generating unique IDs requires coordination. |
Key design decisions