Unique ID Generator

Beginner · Algorithms · Distribution

Fundamentals·20 min read

1 Understand the Problem & Establish Design Scope → 2 High-Level Design → 3 Deep Dive → 4 Wrap Up

§1 Step 2: High-Level Design

Design a distributed ID generation system. Compare UUID, Snowflake, and ULID approaches.

System architecture overview (progressive build: add each component step by step)

Add a Load Balancer

Connect the API server to a load balancer that will distribute ID generation requests across the worker pool. Round-robin across healthy workers is the right default.

What it does

A load balancer sits between the API server and the ID generator worker pool, distributing requests with round-robin or least-connections routing and performing health checks on each worker.

Why it matters

At scale, a single generator is a ceiling on throughput and a single point of failure. The load balancer turns a fleet of independent workers into a single logical endpoint — clients don't need to know how many workers exist.

Trade-off

Round-robin can't guarantee perfectly even distribution if workers have different response times. Least-connections routing is more accurate under variable load, but adds state to the load balancer.
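
To make the trade-off concrete, here is a minimal sketch of the two routing strategies. The class names `RoundRobin` and `LeastConnections` and the worker labels are illustrative assumptions, not part of any real load balancer's API; the point is only that least-connections has to track per-worker state while round-robin does not.

```python
import itertools

class RoundRobin:
    """Cycles through workers in fixed order, ignoring their current load."""
    def __init__(self, workers):
        self._cycle = itertools.cycle(list(workers))

    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Picks the worker with the fewest in-flight requests (needs per-worker state)."""
    def __init__(self, workers):
        self.in_flight = {w: 0 for w in workers}

    def pick(self):
        worker = min(self.in_flight, key=self.in_flight.get)
        self.in_flight[worker] += 1
        return worker

    def release(self, worker):
        # Called when a request completes
        self.in_flight[worker] -= 1

rr = RoundRobin(["w0", "w1", "w2"])
rr_picks = [rr.pick() for _ in range(4)]

lc = LeastConnections(["w0", "w1", "w2"])
first = lc.pick()     # all idle, so the first worker in dict order wins
second = lc.pick()    # w0 is now busy, so w1 is chosen
lc.release(first)     # w0 finishes its request
third = lc.pick()     # w0 is idle again and comes before w2 in dict order
```

A slow worker keeps its in-flight count high under `LeastConnections`, so it naturally receives fewer new requests; `RoundRobin` would keep sending it the same share.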

Real world

Twitter's Snowflake service ran behind HAProxy. Discord routes ID generation through their internal load balancing layer. Instagram's ID system uses a DNS-based round-robin approach for the initial request routing.

Capacity math

A well-tuned L4 load balancer handles 1M+ RPS with sub-millisecond overhead. At our 10K RPS target, the load balancer consumes less than 1% of its capacity.

Add ID Generator Workers

Add three worker services implementing the Snowflake algorithm. Each worker is assigned a unique machine ID (0–1023) at startup and generates 64-bit IDs using only bitwise operations — no database, no coordination at runtime.

What it does

Snowflake workers implement a 64-bit ID format: 1 sign bit (always 0) + 41-bit millisecond timestamp + 10-bit worker ID + 12-bit per-millisecond sequence. Generation is pure in-memory bitwise ops — no I/O, no locks across workers.
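
The bit layout can be illustrated with a small pack/unpack pair. This is a sketch of the layout described above, not Twitter's actual code; the helper names `pack` and `unpack` are made up for illustration.

```python
WORKER_BITS = 10
SEQ_BITS = 12

def pack(timestamp_ms: int, worker_id: int, sequence: int) -> int:
    """Combine the three fields into one integer (the sign bit stays 0)."""
    return (timestamp_ms << (WORKER_BITS + SEQ_BITS)) | (worker_id << SEQ_BITS) | sequence

def unpack(snowflake: int) -> tuple[int, int, int]:
    """Recover (timestamp_ms, worker_id, sequence) using shifts and masks."""
    sequence = snowflake & ((1 << SEQ_BITS) - 1)
    worker_id = (snowflake >> SEQ_BITS) & ((1 << WORKER_BITS) - 1)
    timestamp_ms = snowflake >> (WORKER_BITS + SEQ_BITS)
    return timestamp_ms, worker_id, sequence

example = pack(timestamp_ms=1000, worker_id=3, sequence=7)
fields = unpack(example)   # (1000, 3, 7)
```

Because the timestamp occupies the highest bits, comparing two IDs as plain integers compares their creation times first.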

Why it matters

UUID v4 is random — you lose time-ordering, which destroys index locality on inserts (B-tree pages scatter randomly). Auto-increment requires a single authoritative database. Snowflake IDs are time-ordered (newer = higher), globally unique, and generated at the edge without coordination.

Trade-off

Snowflake IDs encode approximate creation time — an attacker can infer when an object was created from its ID. If privacy matters, add a reversible shuffle layer on top (Instagram's ID system did this). Also: worker IDs must be unique — collision produces silent data corruption.

Real world

Twitter Snowflake (open-sourced 2010), Discord's snowflakes (similar layout, different epoch), Instagram IDs (Postgres-based, with a Snowflake-style layout of 41-bit timestamp, 13-bit shard ID, and 10-bit sequence), TikTok video IDs, and Shopify order IDs are all Snowflake variants.

Capacity math

One worker: 4,096 IDs/ms = 4,096,000 IDs/sec. Three workers: 12M IDs/sec peak. At 10K RPS target, you're using 0.08% of capacity. Even at Twitter's peak of 150K tweets/second, three workers handle it comfortably.
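
The arithmetic behind these figures, written out as a quick check. The variable names are illustrative; the inputs are the numbers already used in this lesson.

```python
SEQ_BITS = 12

ids_per_ms = 1 << SEQ_BITS                   # 4,096 sequence values per millisecond
ids_per_sec_per_worker = ids_per_ms * 1000   # 4,096,000
workers = 3
peak_ids_per_sec = ids_per_sec_per_worker * workers   # 12,288,000

target_rps = 10_000
utilization_pct = 100 * target_rps / peak_ids_per_sec  # about 0.08 %

print(f"peak: {peak_ids_per_sec:,} IDs/sec, utilization: {utilization_pct:.3f}%")
```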

Add a Service Registry

Add a ZooKeeper or etcd node to act as the worker ID coordination service. Each Snowflake worker claims a unique worker ID lease on startup. Without this, two workers could be assigned the same ID — silently producing duplicate IDs.

What it does

A service registry (ZooKeeper, etcd, Consul) provides distributed coordination via leases. Each worker acquires a worker ID lease on startup using an atomic compare-and-swap operation — guaranteed exclusive across the cluster.
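
The lease idea can be sketched with an in-memory stand-in. `InMemoryLeaseRegistry` is a hypothetical class written for this example; it does not use the ZooKeeper, etcd, or Consul client APIs, and a lock plays the role of the registry's atomic compare-and-swap. The current time is passed in explicitly so the behaviour is easy to follow.

```python
import threading

class InMemoryLeaseRegistry:
    """Hypothetical stand-in for a registry: one lease per worker ID, with a TTL."""
    def __init__(self, max_workers: int = 1024, ttl_seconds: float = 10.0):
        self.max_workers = max_workers
        self.ttl = ttl_seconds
        self._leases = {}                 # worker_id -> (owner, expires_at)
        self._lock = threading.Lock()     # stands in for an atomic compare-and-swap

    def claim(self, owner: str, now: float) -> int:
        """Atomically claim the lowest free (or expired) worker ID."""
        with self._lock:
            for worker_id in range(self.max_workers):
                lease = self._leases.get(worker_id)
                if lease is None or lease[1] <= now:
                    self._leases[worker_id] = (owner, now + self.ttl)
                    return worker_id
        raise RuntimeError("no free worker IDs")

    def renew(self, owner: str, worker_id: int, now: float) -> bool:
        """Extend a lease; fails if it expired or belongs to someone else."""
        with self._lock:
            lease = self._leases.get(worker_id)
            if lease is None or lease[0] != owner or lease[1] <= now:
                return False
            self._leases[worker_id] = (owner, now + self.ttl)
            return True

registry = InMemoryLeaseRegistry(ttl_seconds=10.0)
id_a = registry.claim("host-a", now=0.0)              # gets worker ID 0
id_b = registry.claim("host-b", now=1.0)              # 0 is taken, gets 1
renewed_b = registry.renew("host-b", id_b, now=5.0)   # host-b keeps its lease alive
id_c = registry.claim("host-c", now=12.0)             # host-a never renewed, so 0 is reusable
```

A worker that loses its lease (for example, `renew` returns `False`) should stop issuing IDs, since another worker may now hold the same worker ID.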

Why it matters

Worker ID collision is the worst failure mode: two workers silently generating IDs with the same worker ID will produce colliding IDs that are impossible to detect until a uniqueness constraint fails in your database — potentially hours later.

Trade-off

ZooKeeper adds a startup dependency — workers can't start if the registry is unreachable. Mitigate with fallback to a pre-configured static worker ID (written to local disk after first successful lease claim) so restarts don't require the registry.

Real world

Twitter's Snowflake service used ZooKeeper for worker ID assignment. Discord uses an environment-variable-based approach for simplicity (manually assigned in deployment config). Sonyflake (Go implementation) uses the machine's IP address as the worker ID.

Capacity math

The registry is consulted only at startup/shutdown — at runtime, zero requests go to it. Even with 1,024 workers starting simultaneously, a healthy etcd cluster handles this trivially.

Add Monitoring

Add a monitoring node to track clock skew, ID generation rate per worker, sequence overflow events, and worker health. Clock skew, like worker ID collision, can cause silent ID collisions, so monitoring must alert on it immediately.

What it does

Monitoring collects: (1) IDs generated per second per worker, (2) sequence overflow events (clock waiting), (3) system clock offset vs NTP, (4) worker ID assignment events (lease claims/releases), (5) per-worker p99 generation latency.

Why it matters

Clock skew is silent. A worker generating IDs with a backward clock won't error: it will happily produce IDs that collide with IDs it has already issued. Without monitoring, you only discover this when a duplicate key exception surfaces in your primary database.

Trade-off

Monitoring adds a small network overhead per metric emission. Use async, non-blocking metrics emission (UDP to StatsD, or background goroutine pushing to Prometheus) so the hot path (ID generation) is never blocked by the metrics path.
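
One way to keep the hot path from blocking is a bounded in-memory buffer that drops metrics when full. The sketch below assumes a hypothetical `MetricsBuffer` class; in a real worker, a background thread would call `drain()` periodically and forward the batch to StatsD or Prometheus, which is left out here.

```python
import queue

class MetricsBuffer:
    """Bounded buffer: emit() never blocks, it drops the metric if the buffer is full."""
    def __init__(self, capacity: int = 1000):
        self._q = queue.Queue(maxsize=capacity)
        self.dropped = 0

    def emit(self, name: str, value: float) -> None:
        # Called from the ID generation path; must return immediately
        try:
            self._q.put_nowait((name, value))
        except queue.Full:
            self.dropped += 1

    def drain(self) -> list:
        # Called from a background flusher, never from the hot path
        items = []
        while True:
            try:
                items.append(self._q.get_nowait())
            except queue.Empty:
                return items

buf = MetricsBuffer(capacity=2)
buf.emit("ids_generated", 1)
buf.emit("sequence_overflow", 1)
buf.emit("clock_offset_ms", 0.4)   # buffer is full, so this one is dropped
batch = buf.drain()
```

Dropping a data point under pressure is usually a better failure mode than delaying ID generation; the `dropped` counter itself is worth exporting.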

Real world

Twitter monitored Snowflake workers for clock skew and sequence overflow. Discord's ID system monitors generation rate to detect worker failures early. Cloudflare monitors their ID systems for drift as part of their SLO stack.

Capacity math

Metrics emission at 10K IDs/sec generates roughly 10 data points/sec per worker — negligible load. Clock drift alerts fire within seconds of drift exceeding a threshold (typically 1ms for Snowflake systems).

Add a Load Balancer

At high ID generation volume, distribute requests across multiple ID generator nodes behind a load balancer.

What it does

A load balancer distributes ID generation requests across multiple generator nodes, each with a unique machine ID.

Why it matters

A single generator node is capped at 4,096 IDs per millisecond by the 12-bit sequence, and in practice may sustain closer to ~500K IDs/second. At high traffic with billions of IDs/day, you need multiple nodes.

Trade-off

Machine IDs must be unique and stable. Use node index assignment or coordinate via ZooKeeper for dynamic scaling.

Real world

Twitter's Snowflake runs multiple generator nodes, each with a unique datacenter+machine ID combination.

Capacity math

Each Snowflake-style node generates 4K IDs/ms (4M/second). Two nodes = 8M IDs/second.

Add a Cache for ID Batching

At peak, pre-generate batches of IDs into a Redis queue so generators can serve from cache without compute overhead.

What it does

A Redis cache holds pre-generated ID batches. Clients pop IDs from the list rather than triggering generation per-request.

Why it matters

At peak, even fast ID generation has compute overhead. Pre-batch generation smooths CPU usage and can lower P99 latency, provided popping from the cache is cheaper than generating on demand.

Trade-off

IDs in the batch may be skipped if the server crashes mid-batch (gaps in the sequence). This is usually acceptable.

Real world

Instagram's ID system pre-allocates ranges. Flickr used MySQL auto-increment with batching for their ID service.

Capacity math

A Redis list handles millions of pops/second. Pre-generating 10K IDs every 10ms gives headroom for 1M ID requests/second.
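
The batching idea can be sketched without Redis. Below, a `collections.deque` stands in for the Redis list and `itertools.count` stands in for a real ID generator; `IdBatchCache`, `batch_size`, and `low_water` are names assumed for this example only.

```python
import itertools
from collections import deque

class IdBatchCache:
    """Serves IDs from a pre-generated buffer and refills it when it runs low."""
    def __init__(self, generate, batch_size: int = 1000, low_water: int = 100):
        self._generate = generate        # callable returning one new ID
        self._batch_size = batch_size
        self._low_water = low_water
        self._buf = deque()              # stand-in for a Redis list

    def _refill(self) -> None:
        for _ in range(self._batch_size):
            self._buf.append(self._generate())

    def pop(self) -> int:
        if len(self._buf) <= self._low_water:
            self._refill()
        return self._buf.popleft()

    def pending(self) -> int:
        return len(self._buf)

counter = itertools.count(1)             # stand-in generator: 1, 2, 3, ...
cache = IdBatchCache(generate=lambda: next(counter), batch_size=5, low_water=1)
ids = [cache.pop() for _ in range(7)]
leftover = cache.pending()               # IDs that would be lost if the process crashed now
```

The `leftover` count is exactly the gap described in the trade-off: IDs that were generated but never handed out.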

§2 Step 3: Deep Dive

Approach	Throughput	Sortable?	Coordination needed?	Best for	Cost	Ops burden
Snowflake (timestamp+worker+seq)	4,096 IDs/ms/node	Yes (time-ordered)	Startup only (worker ID assignment)	Twitter, Discord, Uber ✓	Low	Medium
UUID v4 (random)	Unlimited	No	No	Simple, global uniqueness, not sortable	Low	Low
UUID v7 (time-ordered)	Unlimited	Yes	No	Modern replacement for v4	Low	Low
Database auto-increment	DB throughput limited	Yes	Yes (central DB)	Single-node, small scale	Low	Low
ULID	Unlimited	Yes	No	URL-safe, Snowflake alternative	Low	Low

Distributed ID generation — Snowflake is the industry standard.
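
For comparison with Snowflake, here is a rough sketch of how a ULID-style ID is put together: a 48-bit millisecond timestamp followed by 80 random bits, encoded as 26 Crockford base32 characters. The function name `ulid_like` is an assumption for this example, and the sketch skips details of the ULID spec such as monotonic generation within the same millisecond.

```python
import os
import time

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # base32 alphabet without I, L, O, U

def ulid_like(timestamp_ms=None, randomness=None) -> str:
    """Encode a 48-bit timestamp and 80 random bits as 26 base32 characters."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    if randomness is None:
        randomness = os.urandom(10)      # 80 bits, no coordination needed
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):                  # 26 * 5 bits covers the 128-bit value
        chars.append(CROCKFORD[value & 31])
        value >>= 5
    return "".join(reversed(chars))

earlier = ulid_like(timestamp_ms=1, randomness=b"\xff" * 10)
later = ulid_like(timestamp_ms=2, randomness=b"\x00" * 10)
fresh = ulid_like()
```

Uniqueness here comes from randomness rather than a worker ID, which is why no registry is needed; the cost is a 128-bit ID instead of a 64-bit one.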

Python: Snowflake ID generator (64-bit time-ordered distributed IDs)

import time
import threading

# Snowflake layout (64 bits total):
# [1 sign][41 timestamp ms][10 worker ID][12 sequence]
EPOCH       = 1704067200000  # 2024-01-01 00:00:00 UTC in ms
WORKER_BITS = 10
SEQ_BITS    = 12
MAX_SEQ     = (1 << SEQ_BITS) - 1   # 4095

class SnowflakeGenerator:
    def __init__(self, worker_id: int):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id must be in the range 0..1023")
        self.worker_id = worker_id
        self.sequence  = 0
        self.last_ms   = -1
        self.lock      = threading.Lock()

    def _now_ms(self) -> int:
        # Milliseconds elapsed since the custom epoch
        return int(time.time() * 1000) - EPOCH

    def next_id(self) -> int:
        with self.lock:
            now = self._now_ms()

            if now < self.last_ms:
                # Clock moved backward: refuse rather than risk reissuing an ID
                raise RuntimeError(
                    f"clock moved backward by {self.last_ms - now} ms")

            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & MAX_SEQ
                if self.sequence == 0:
                    # Sequence exhausted: spin until the next millisecond
                    while now <= self.last_ms:
                        now = self._now_ms()
            else:
                self.sequence = 0

            self.last_ms = now
            return ((now << (WORKER_BITS + SEQ_BITS))
                    | (self.worker_id << SEQ_BITS)
                    | self.sequence)

gen = SnowflakeGenerator(worker_id=3)
print(gen.next_id())  # a 64-bit integer; later IDs from this worker compare higher

Component	Why Add It	Tradeoff
Load Balancer	At scale, a single generator is a ceiling on throughput and a single point of failure.	Round-robin can't guarantee perfectly even distribution if workers have different response times.
ID Generator Workers	UUID v4 is random — you lose time-ordering, which destroys index locality on inserts (B-tree pages scatter randomly).	Snowflake IDs encode approximate creation time — an attacker can infer when an object was created from its ID.
Service Registry	Worker ID collision is the worst failure mode: two workers silently generating IDs with the same worker ID will produce colliding IDs that are impossible to detect until a uniqueness constraint fails in your database — potentially hours later.	ZooKeeper adds a startup dependency — workers can't start if the registry is unreachable.
Monitoring	Clock skew is silent.	Monitoring adds a small network overhead per metric emission.
Load Balancer	A single generator node can produce ~500K IDs/second.	Machine IDs must be unique and stable.
Cache for ID Batching	At peak, even fast ID generation has compute overhead.	IDs in the batch may be skipped if the server crashes mid-batch (gaps in the sequence).

Design decision tradeoffs

Clock Skew

A worker's system clock steps backward by 50ms. A naive Snowflake implementation would reuse timestamps it has already issued, so the same timestamp + sequence could collide with earlier IDs. The worker must refuse requests, or block, until the clock catches up to the last timestamp it used.

Worker ID Collision

Two workers are assigned the same worker ID — every ID they generate will collide. Without a coordination service, this happens silently. With ZooKeeper, worker ID leases prevent this: no two workers can hold the same ID lease simultaneously.

Single Worker Failure

One of three worker nodes crashes mid-traffic. The load balancer detects the unhealthy instance via health checks and stops routing to it. The remaining two workers absorb the full 10K RPS — each has 4M IDs/sec headroom, so capacity is not a concern.

Sequence Overflow

A single worker receives 4,097 ID requests in the same millisecond. The 12-bit sequence field (4,096 values, 0 to 4,095) overflows. A proper implementation must wait until the next millisecond tick before issuing the next ID, adding up to 1ms of latency for burst requests.

Epoch Exhaustion (~69 Years After the Epoch)

The 41-bit timestamp field stores milliseconds since a custom epoch (e.g. 2010-01-01). It exhausts after 2^41 ms, roughly 69.7 years: about 2079 for a 2010-01-01 epoch, and around 2080 for Twitter's Snowflake, whose epoch falls in November 2010. Systems should document their epoch and plan an ID format migration well in advance. It is the same problem as Y2K, but you get to see it coming.
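
The exhaustion date follows directly from the field width. A short calculation, using the 2010-01-01 epoch from the example above (the function name `exhaustion_date` is illustrative):

```python
from datetime import datetime, timedelta, timezone

TIMESTAMP_BITS = 41
lifetime = timedelta(milliseconds=1 << TIMESTAMP_BITS)   # 2^41 ms of usable range

def exhaustion_date(epoch: datetime) -> datetime:
    """Moment at which a 41-bit millisecond counter starting at `epoch` overflows."""
    return epoch + lifetime

example_epoch = datetime(2010, 1, 1, tzinfo=timezone.utc)
overflow_at = exhaustion_date(example_epoch)
years = lifetime.days / 365.25

print(f"{years:.1f} years, overflow at {overflow_at.isoformat()}")
```

Choosing a recent custom epoch, rather than the Unix epoch, is what buys the full range: every year between 1970 and the chosen epoch would otherwise be wasted.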

A single ID generator is both a bottleneck and a SPOF. Snowflake workers are stateless in terms of coordination — each generates IDs independently using its pre-assigned worker ID. Scale by adding more workers.

Snowflake bit layout: [ 1 sign bit (always 0) | 41-bit timestamp (ms since epoch) | 10-bit worker ID | 12-bit sequence ]. Each worker can generate 4,096 unique IDs per millisecond — 4M/sec — without talking to any other node.

Worker IDs must be unique across the cluster. Use ZooKeeper or etcd: a worker claims an ID lease on startup and holds it until it gracefully shuts down. If it crashes, the lease expires via TTL and another worker can claim that ID.

Clock skew is the silent killer. Workers must monitor their local clock. If the clock goes backward, the worker should either (a) wait until the clock catches up or (b) reject requests and alert — never generate IDs with a past timestamp.
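
Both options can live in one helper: wait out a small regression, reject a large one. This is a sketch under assumptions; `next_timestamp`, `ClockMovedBackwardError`, and the 5 ms `max_wait_ms` threshold are illustrative choices, and the clock and sleep functions are injected so the behaviour can be shown deterministically.

```python
import time

class ClockMovedBackwardError(RuntimeError):
    """Raised when the clock is too far behind the last issued timestamp."""

def next_timestamp(last_ms, read_clock_ms, max_wait_ms=5, sleep=time.sleep):
    """Return a timestamp that is never earlier than last_ms, or raise."""
    now = read_clock_ms()
    if now >= last_ms:
        return now
    drift = last_ms - now
    if drift > max_wait_ms:
        # Option (b): large regression, reject and let monitoring alert
        raise ClockMovedBackwardError(f"clock is {drift} ms behind last issued timestamp")
    # Option (a): small regression, wait it out and re-read the clock
    sleep(drift / 1000)
    now = read_clock_ms()
    if now < last_ms:
        raise ClockMovedBackwardError("clock still behind after waiting")
    return now

# Deterministic demo: a fake clock that reads 98, then 100, and a no-op sleep
readings = iter([98, 100])
recovered = next_timestamp(last_ms=100, read_clock_ms=lambda: next(readings),
                           sleep=lambda seconds: None)
```

With a 50ms regression, as in the clock skew scenario, the same call raises instead of waiting, and the load balancer's health checks can take the worker out of rotation.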

At 10K RPS across 3 workers, each handles ~3,333 RPS — using less than 0.1% of its 4M IDs/sec capacity. The bottleneck will never be ID generation itself; it will be network I/O or the load balancer.

§3 Step 4: Wrap Up

Decision	Choice	Why
Load Balancer	A load balancer sits between the API server and the ID generator worker pool, distributing requests with round-robin or least-connections routing and performing health checks on each worker.	At scale, a single generator is a ceiling on throughput and a single point of failure.
ID Generator Workers	Snowflake workers implement a 64-bit ID format: 1 sign bit (always 0) + 41-bit millisecond timestamp + 10-bit worker ID + 12-bit per-millisecond sequence.	UUID v4 is random — you lose time-ordering, which destroys index locality on inserts (B-tree pages scatter randomly).
Service Registry	A service registry (ZooKeeper, etcd, Consul) provides distributed coordination via leases.	Worker ID collision is the worst failure mode: two workers silently generating IDs with the same worker ID will produce colliding IDs that are impossible to detect until a uniqueness constraint fails in your database — potentially hours later.
Monitoring	Monitoring collects: (1) IDs generated per second per worker, (2) sequence overflow events (clock waiting), (3) system clock offset vs NTP, (4) worker ID assignment events (lease claims/releases), (5) per-worker p99 generation latency.	Clock skew is silent.
Load Balancer	A load balancer distributes ID generation requests across multiple generator nodes, each with a unique machine ID.	A single generator node can produce ~500K IDs/second.
Cache for ID Batching	A Redis cache holds pre-generated ID batches.	At peak, even fast ID generation has compute overhead.

Key design decisions

If the interviewer asks to scale 10×: From prototype to planet-scale. Introduce consistent hashing to redistribute load as you add nodes — minimize cache/shard remapping.

10× target: 100K RPS, where your architecture must hold.
