Ride-Sharing Backend
§1Step 2 — High-Level Design
Match riders and drivers in real-time. Geospatial indexing, supply/demand, and driver location streaming.
Place an API Gateway to handle both driver and passenger app traffic, routing to the appropriate backend services.
The API Gateway routes traffic from driver and passenger mobile apps to the appropriate backend services, handling authentication and protocol routing.
Driver and passenger apps make different types of requests: location updates (high frequency, small payload), ride requests (low frequency, needs matching), and ride status (polling). The gateway routes each type to the optimized backend.
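As a rough illustration of routing by request type, the sketch below maps path prefixes to backend services with longest-prefix matching. The paths and service names are hypothetical and chosen only for this example; a real gateway would express the same idea in its own configuration format.

```python
# Hypothetical route table: prefixes and service names are illustrative only.
ROUTES = {
    "/v1/locations": "location-service",      # high-frequency driver pings
    "/v1/rides/request": "matching-service",  # low-frequency ride requests
    "/v1/rides/status": "trip-service",       # status polling
}


def route(path: str) -> str:
    """Return the backend for a request path using longest-prefix matching."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    if not matches:
        raise LookupError(f"no route for {path}")
    return ROUTES[max(matches, key=len)]


print(route("/v1/locations/driver-42"))  # location-service
print(route("/v1/rides/status/abc"))     # trip-service
```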
The gateway becomes critical infrastructure because all app traffic passes through it. Run three or more instances across multiple availability zones (AZs): while the gateway is unreachable, no ride can be requested or matched.
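Envoy, originally developed at Lyft, is a common choice for the edge proxy layer, and Kong is a widely used open-source API gateway. Uber has described an in-house API gateway on its engineering blog. Gateways at this scale serve very high request rates from driver and passenger apps worldwide.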
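At peak, 1M active drivers each sending a location update every 5 seconds generate 200K requests/second to the location service. Spread across a 3-node API gateway cluster, that is roughly 67K requests/second per node, so each node needs headroom to absorb the load if another node fails.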
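The load estimate above can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch using the figures from the text; the function names are made up for illustration, and it assumes traffic is spread evenly across healthy nodes.

```python
def updates_per_second(active_drivers: int, interval_s: float) -> float:
    """Steady-state location updates per second across all drivers."""
    return active_drivers / interval_s


def per_node_rps(total_rps: float, nodes: int, failed: int = 0) -> float:
    """Requests per second each healthy node sees under even balancing."""
    healthy = nodes - failed
    if healthy <= 0:
        raise ValueError("no healthy nodes left")
    return total_rps / healthy


total_rps = updates_per_second(1_000_000, 5)
print(total_rps)                               # 200000.0
print(round(per_node_rps(total_rps, 3)))       # 66667
print(per_node_rps(total_rps, 3, failed=1))    # 100000.0
```

Losing one of three nodes raises the per-node load by half, which is a reason to size each node for the degraded case rather than the average case.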
Add Redis with geospatial commands to store real-time driver locations for fast proximity matching.
Redis stores real-time driver locations using geospatial commands and enables proximity queries to find available drivers near a passenger.
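Finding the nearest available driver requires querying thousands of moving GPS coordinates. Redis GEORADIUS (deprecated since Redis 6.2 in favor of GEOSEARCH) returns all drivers within a 5 km radius in O(N + log M) time, where N is the number of entries inside the bounding box of the search area and M is the total number of entries in the index. The estimate here is under 5 ms for 100K drivers.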
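To make the radius query concrete, here is a brute-force version in plain Python: compute the great-circle (haversine) distance to every driver and keep those within the radius. This is an O(n) illustration of what a radius search returns, not how Redis implements it; the Earth radius constant and the sample coordinates are assumptions for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def drivers_within(drivers: dict, lat: float, lon: float, radius_km: float) -> list:
    """Return [(driver_id, distance_km)] within radius_km, nearest first.

    drivers maps driver_id -> (lat, lon).
    """
    hits = []
    for driver_id, (dlat, dlon) in drivers.items():
        dist = haversine_km(lat, lon, dlat, dlon)
        if dist <= radius_km:
            hits.append((driver_id, dist))
    return sorted(hits, key=lambda hit: hit[1])


sample = {
    "a": (37.7749, -122.4194),  # same spot as the passenger
    "b": (37.7849, -122.4094),  # a bit over 1 km away
    "c": (37.3382, -121.8863),  # tens of km away, outside the radius
}
nearby = drivers_within(sample, 37.7749, -122.4194, radius_km=5)
print([driver_id for driver_id, _ in nearby])  # ['a', 'b']
```

A geospatial index avoids scanning every driver by first narrowing the candidates to a small set of cells around the query point.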
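Redis geospatial takes WGS84 longitude/latitude pairs and computes distances on a spherical Earth model, which is accurate enough for proximity matching. Positions are stored as 52-bit geohash integers (roughly 0.6 m resolution at the equator). Drivers update every 5 seconds, so a stored location can be up to 5 seconds stale.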
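The cost of a 5-second update interval can be estimated as the distance a driver may travel between updates. The sketch below is simple arithmetic; the speeds are illustrative assumptions, not figures from the source.

```python
def max_drift_m(speed_kmh: float, staleness_s: float) -> float:
    """Worst-case distance (metres) a driver moves while a location is stale."""
    return speed_kmh * 1000 / 3600 * staleness_s


print(round(max_drift_m(50, 5), 1))   # 69.4  (city driving)
print(round(max_drift_m(100, 5), 1))  # 138.9 (highway)
```

For a 5 km search radius this drift is small, which is why a few seconds of staleness is usually tolerable for candidate selection.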
Uber developed the H3 hexagonal grid system and uses it for dispatch and pricing analysis. Redis geospatial indexes are a common choice for real-time driver location and courier proximity matching in ride-hailing and delivery apps.
1M active drivers × 20 bytes per location ≈ 20MB of raw location data in Redis, before sorted-set overhead. GEOADD (location update): 200K/second from 1M drivers updating every 5s. GEORADIUS queries: ~10K/second from passenger app requests.
Add Postgres to store trip records, user profiles, payment history, and driver/passenger state.
Postgres stores the durable state of all rides, user accounts, driver profiles, and payment records — the business data that must survive beyond in-memory caches.
Redis holds live driver locations (ephemeral, high-frequency). Postgres holds trip history (durable, lower-frequency). A completed trip must be durably stored for billing, disputes, and compliance.
Postgres write throughput limits surge capacity. At 10K trip starts/second (massive surge), Postgres must handle 10K INSERT/second — feasible but requires connection pooling (PgBouncer) and SSD-backed storage.
Uber uses MySQL for trip data. Lyft uses PostgreSQL. Both shard by driver_id or city for horizontal scaling. Trip records are immutable once completed — append-only for simplicity.
10M trips/day × 1KB per trip record = 10GB/day. Partition by date; after 90 days, archive to Redshift for analytics. Active trip state (< 1M concurrent) fits in 1GB.
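The storage figures above follow from straightforward arithmetic, sketched here with the numbers from the text (decimal units, 1 KB = 1,000 bytes). The function names are illustrative. The last line is an added observation: averaged over a day, 10M trips is only about 116 inserts/second, so the 10K/second surge figure is a peak far above the mean.

```python
def daily_volume_gb(trips_per_day: int, bytes_per_trip: int) -> float:
    """Raw trip data written per day, in GB (decimal)."""
    return trips_per_day * bytes_per_trip / 1e9


def hot_storage_gb(daily_gb: float, retention_days: int) -> float:
    """Data kept in Postgres before older partitions are archived."""
    return daily_gb * retention_days


def average_inserts_per_second(trips_per_day: int) -> float:
    """Mean insert rate if trips were spread evenly over 24 hours."""
    return trips_per_day / 86_400


daily = daily_volume_gb(10_000_000, 1_000)
print(daily)                                          # 10.0
print(hot_storage_gb(daily, 90))                      # 900.0
print(round(average_inserts_per_second(10_000_000)))  # 116
```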
Add matching engine workers that run the driver-passenger matching algorithm in the background.
Matching workers run optimization algorithms that pair ride requests with the optimal available driver based on proximity, ETA, driver rating, and surge pricing.
Matching is computationally intensive — comparing N passengers against M drivers with multi-factor optimization. Running this synchronously in the request path adds 500ms-2s latency. Workers run it asynchronously every few seconds.
Async matching introduces delay (1-5 seconds from request to match). Synchronous matching is faster but blocks the API under load. Uber's dispatch runs in < 2 seconds including matching.
Uber's dispatch system uses a proprietary matching algorithm running on dedicated worker fleets. Lyft uses Python workers. Both balance match quality vs. match latency.
At 100K active ride requests and 1M available drivers, the matching algorithm runs graph optimization. Uber's matching worker handles one city per worker — 100+ cities = 100+ workers.
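One simple way to picture a batch matching pass is a greedy assignment: score every rider/driver pair, then repeatedly take the cheapest pair whose rider and driver are both still free. This is a minimal sketch under that assumption, not the algorithm any particular company runs; a production matcher would use a richer cost (ETA, rating, pricing) and likely a global optimizer. The planar distance function is a stand-in cost for the example.

```python
import math
from typing import Callable, Dict, Tuple

Point = Tuple[float, float]


def planar_distance(a: Point, b: Point) -> float:
    """Stand-in cost: straight-line distance on a flat plane."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def greedy_match(riders: Dict[str, Point], drivers: Dict[str, Point],
                 cost: Callable[[Point, Point], float]) -> Dict[str, str]:
    """Assign each rider at most one driver, cheapest pairs first."""
    # Score all rider/driver pairs: O(N * M) work per batch.
    pairs = sorted(
        (cost(rider_pos, driver_pos), rider_id, driver_id)
        for rider_id, rider_pos in riders.items()
        for driver_id, driver_pos in drivers.items()
    )
    matched: Dict[str, str] = {}
    used_drivers = set()
    for _, rider_id, driver_id in pairs:
        if rider_id in matched or driver_id in used_drivers:
            continue  # rider or driver already taken by a cheaper pair
        matched[rider_id] = driver_id
        used_drivers.add(driver_id)
    return matched


riders = {"r1": (0.0, 0.0), "r2": (10.0, 0.0)}
drivers = {"d1": (1.0, 0.0), "d2": (9.0, 0.0), "d3": (50.0, 50.0)}
print(greedy_match(riders, drivers, planar_distance))  # {'r1': 'd1', 'r2': 'd2'}
```

Greedy assignment is fast but can miss the globally cheapest set of pairings, which is one face of the match quality versus match latency balance described above.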
§2Step 3 — Deep Dive
| Approach | Location update | Nearby search | Surge support | Best for | Cost | Ops burden |
|---|---|---|---|---|---|---|
| Redis GEO (Geohash) | O(log n) per update | O(N + log M) GEORADIUS | Aggregate by area | Driver matching ✓ | Medium | Low |
| H3 Hexagons (Uber) | O(1) cell lookup | O(1) neighbor cells | Native (cell aggregation) | Surge pricing, heatmaps | Low | Medium |
| PostGIS | O(log n) | O(log n) ST_DWithin | Yes, with aggregations | Complex geo queries | Medium | Medium |
| S2 Cells (Google) | O(1) cell lookup | O(1) parent cells | Yes | Google Maps, routing | Low | Medium |
| QuadTree (custom) | O(log n) | O(log n) | Yes | Game engines, custom | Low | Medium |
Geospatial matching for ride-sharing — H3 hexagons and Redis GEO are the standard.
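```python
import json
from typing import Optional

import redis

r = redis.Redis(decode_responses=True)

DRIVER_TTL = 30  # seconds a heartbeat key stays valid without a fresh update
SEARCH_KM = 5    # search radius for nearby drivers


def update_driver_location(driver_id: str, city: str,
                           lat: float, lon: float, available: bool = True):
    """Index an available driver and refresh the driver's heartbeat key."""
    if available:
        # Redis GEO expects longitude first, then latitude.
        r.geoadd(f"drivers:{city}:available", (lon, lat, driver_id))
    else:
        r.zrem(f"drivers:{city}:available", driver_id)
    # The heartbeat expires after DRIVER_TTL seconds if updates stop.
    r.setex(f"driver:{driver_id}", DRIVER_TTL, json.dumps({'lat': lat, 'lon': lon}))


def find_nearest_driver(city: str, lat: float, lon: float) -> Optional[str]:
    """Return the closest driver with a live heartbeat, or None."""
    key = f"drivers:{city}:available"
    # Loop instead of recursing so many stale entries cannot exhaust the stack.
    while True:
        results = r.georadius(key, lon, lat, SEARCH_KM, unit='km',
                              withdist=True, count=1, sort='ASC')
        if not results:
            return None
        driver_id, _ = results[0]
        if r.exists(f"driver:{driver_id}"):
            return driver_id
        # Heartbeat expired: drop the stale entry and search again.
        r.zrem(key, driver_id)
```

| Component | Why Add It | Tradeoff |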
|---|---|---|
| API Gateway | Driver and passenger apps make different types of requests: location updates (high frequency, small payload), ride requests (low frequency, needs matching), and ride status (polling). | The gateway becomes critical infrastructure: all app traffic passes through it, so it must run as multiple instances across AZs. |
| Redis for Driver Location | Finding the nearest available driver requires querying thousands of moving GPS coordinates. | Locations are only as fresh as the last update: with 5-second updates, a stored position can be up to 5 seconds stale. |
| Postgres for Trips and Users | Redis holds only live, ephemeral driver locations; completed trips must be stored durably for billing, disputes, and compliance. | Postgres write throughput limits surge capacity. |
| Worker Services for Matching | Matching is computationally intensive: comparing N passengers against M drivers with multi-factor optimization. | Async matching introduces delay (1-5 seconds from request to match). |
Design decision tradeoffs
lb-1 crashes. All driver and passenger apps lose the gateway. How do you implement DNS-based failover, multi-AZ load balancers, and client retry logic to reconnect within 10 seconds?
During 6 PM rush hour, 100K drivers and riders concentrate in a 1 km² downtown area, overwhelming the geo index for that region. GEOADD and GEORADIUS ops queue up. How do you shard the geo index by geographic cell and route hot cells to dedicated Redis instances?
API servers lose network access to cache-1 (Redis geo). Drivers send location updates but they can't be indexed. Riders can't find nearby drivers. How do you implement local fallback (use stale locations), reconnection logic, and graceful degradation to 'no drivers nearby' instead of errors?
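For the last scenario, one possible shape for graceful degradation is a thin wrapper around the geo lookup: on a connection failure it serves the last known result for that area if it is recent enough, and otherwise returns an empty list so the app can show "no drivers nearby". This is a sketch under assumptions: the class name, the `search` callable, the per-cell cache key, and the 15-second staleness limit are all hypothetical. The exception types are a parameter because a Redis client raises its own connection error class.

```python
import time
from typing import Callable, Dict, List, Tuple


class DegradingDriverLookup:
    """Wrap a nearby-driver search with a stale-cache fallback."""

    def __init__(self, search: Callable[[str], List[str]],
                 max_stale_s: float = 15.0,
                 errors: tuple = (ConnectionError, TimeoutError),
                 clock: Callable[[], float] = time.monotonic):
        self._search = search          # e.g. a function that queries the geo index
        self._max_stale_s = max_stale_s
        self._errors = errors          # exception types treated as "index unreachable"
        self._clock = clock
        self._cache: Dict[str, Tuple[float, List[str]]] = {}

    def nearby(self, cell: str) -> List[str]:
        try:
            result = self._search(cell)
        except self._errors:
            cached = self._cache.get(cell)
            if cached and self._clock() - cached[0] <= self._max_stale_s:
                return cached[1]  # serve the recent stale result
            return []             # degrade to "no drivers nearby"
        # Remember the fresh result for use during a later outage.
        self._cache[cell] = (self._clock(), result)
        return result
```

Returning an empty list rather than raising keeps the rider-facing request path free of errors while the index is unreachable, at the cost of temporarily hiding drivers who are in fact available.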
§3Step 4 — Wrap Up
| Decision | Choice | Why |
|---|---|---|
| API Gateway | The API Gateway routes traffic from driver and passenger mobile apps to the appropriate backend services, handling authentication and protocol routing. | Driver and passenger apps make different types of requests: location updates (high frequency, small payload), ride requests (low frequency, needs matching), and ride status (polling). |
| Redis for Driver Location | Redis stores real-time driver locations using geospatial commands and enables proximity queries to find available drivers near a passenger. | Finding the nearest available driver requires querying thousands of moving GPS coordinates. |
| Postgres for Trips and Users | Postgres stores the durable state of all rides, user accounts, driver profiles, and payment records: the business data that must survive beyond in-memory caches. | Redis holds only live, ephemeral driver locations; completed trips must be stored durably for billing, disputes, and compliance. |
| Worker Services for Matching | Matching workers run optimization algorithms that pair ride requests with the optimal available driver based on proximity, ETA, driver rating, and surge pricing. | Matching is computationally intensive — comparing N passengers against M drivers with multi-factor optimization. |
Key design decisions