Social Media Feed
§1 Step 2 — High-Level Design
Build a news feed like Twitter/X. Fan-out on write vs read. Handle celebrities with 100M followers.
Store each user's feed as a Redis sorted set (score = timestamp). Feed reads are a single ZREVRANGE call — no database join required.
Each user has a Redis sorted set of post IDs (scored by timestamp). Feed read = ZREVRANGE of their timeline key. Fan-out write = ZADD to each follower's timeline.
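Building a feed from the database requires expensive JOINs across posts, follows, and ranking tables. Pre-computed Redis timelines replace those JOINs with a single sorted-set read, which Redis typically serves in under 1 ms (excluding the network round trip and post hydration).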
Fan-out-on-write has write amplification: 1 post by someone with 1M followers = 1M Redis writes. This is fine for normal users but catastrophic for celebrities.
Twitter's original architecture kept pre-computed home timelines in Redis. Instagram uses Cassandra for timeline storage at 60M reads/day.
One timeline entry is roughly 40 bytes (post ID + timestamp + metadata). 100 posts per user × 100M users × 40 bytes = 400 GB of Redis storage. Sharding by user ID across 8-10 Redis nodes puts about 40-50 GB on each node.
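The sizing above can be checked with a few lines of arithmetic. The sketch below is only a back-of-envelope model: the constants come from the text, while the hash-modulo sharding function and its names (`fnv1a`, `shardFor`) are illustrative assumptions, not a description of how any particular deployment routes keys.

```typescript
// Sizing inputs taken from the text; treat them as rough assumptions.
const BYTES_PER_ENTRY = 40; // post ID + timestamp + metadata
const ENTRIES_PER_TIMELINE = 100;
const USERS = 100_000_000;

/** Total bytes needed to hold every user's timeline. */
function timelineStorageBytes(users: number, entries: number, bytesPerEntry: number): number {
  return users * entries * bytesPerEntry;
}

/** 32-bit FNV-1a hash, used here only to spread user IDs evenly (hypothetical choice). */
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

/** Maps a user ID to a shard index in [0, shardCount). */
function shardFor(userId: string, shardCount: number): number {
  return fnv1a(userId) % shardCount;
}

const totalBytes = timelineStorageBytes(USERS, ENTRIES_PER_TIMELINE, BYTES_PER_ENTRY);
console.log(`total: ${totalBytes / 1e9} GB`); // total: 400 GB
console.log(`per node with 8 shards: ${totalBytes / 8 / 1e9} GB`); // per node with 8 shards: 50 GB
console.log(`shard for user:42 -> ${shardFor("user:42", 8)}`);
```

Two caveats apply. Plain modulo sharding remaps most keys when the node count changes, which is why Redis Cluster assigns keys to fixed hash slots instead. The 40-byte figure is also a payload estimate; the in-memory footprint of a sorted-set entry in Redis is usually larger, so the real total is likely higher than 400 GB.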
Post creation publishes to Kafka. Fan-out worker service consumes events and writes to follower timelines asynchronously — decoupling post latency from follower count.
Write API publishes PostCreated to Kafka. Fan-out workers consume the event and write to up to N follower timelines in Redis. Workers scale independently of the write API.
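As a rough illustration of the event that crosses this boundary, the sketch below builds and parses a `PostCreated` message. The field names and the choice to key messages by author ID are assumptions made for the example; the source only names the event.

```typescript
// Hypothetical event shape; the field names are assumptions, not taken from the source.
interface PostCreated {
  type: "PostCreated";
  postId: string;
  authorId: string;
  createdAtMs: number;
}

// Minimal key/value message shape of the kind a Kafka producer sends.
interface OutgoingMessage {
  key: string;
  value: string;
}

/** Serializes the event. Keying by author keeps one author's posts on one partition, in order. */
function toKafkaMessage(event: PostCreated): OutgoingMessage {
  return { key: event.authorId, value: JSON.stringify(event) };
}

/** Parses and validates a message value on the fan-out worker side. */
function parsePostCreated(value: string): PostCreated {
  const parsed = JSON.parse(value) as Partial<PostCreated>;
  if (
    parsed.type !== "PostCreated" ||
    typeof parsed.postId !== "string" ||
    typeof parsed.authorId !== "string" ||
    typeof parsed.createdAtMs !== "number"
  ) {
    throw new Error("not a valid PostCreated event");
  }
  return parsed as PostCreated;
}

const message = toKafkaMessage({
  type: "PostCreated",
  postId: "p1",
  authorId: "u7",
  createdAtMs: 1_700_000_000_000,
});
console.log(message.key); // u7
console.log(parsePostCreated(message.value).postId); // p1
```

With kafkajs, a message of this shape would be passed to `producer.send({ topic, messages: [message] })`; the topic name is left to the deployment. The write API can return as soon as the send is acknowledged, since the follower writes happen later in the workers.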
Synchronous fan-out makes post latency proportional to follower count. A celeb with 10M followers would cause a 10M-write synchronous operation — unacceptable.
Async fan-out means followers see posts with a short delay (seconds). Most users accept < 10s delivery lag in exchange for consistent low-latency post creation.
Twitter switched to async fan-out via a delivery pipeline. Facebook's Iris, an ordered queue of messaging updates, applies a similar asynchronous delivery pattern.
Kafka fan-out at 1M posts/hour × an average of 500 followers = 500M timeline writes/hour. Spread across 100 fan-out worker partitions, that is 5M writes/hour per worker (about 1,400 writes/second), which is very manageable.
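Worked through step by step, the throughput estimate looks like this. The inputs are the averages quoted above; peak traffic and skewed follower counts would push the real numbers higher, so this is a sanity check and not a capacity plan.

```latex
\begin{aligned}
W_{\text{total}} &= 10^{6}\ \tfrac{\text{posts}}{\text{hour}} \times 500\ \tfrac{\text{followers}}{\text{post}}
  = 5 \times 10^{8}\ \tfrac{\text{writes}}{\text{hour}} \\[4pt]
W_{\text{worker}} &= \frac{5 \times 10^{8}}{100\ \text{partitions}}
  = 5 \times 10^{6}\ \tfrac{\text{writes}}{\text{hour}} \\[4pt]
W_{\text{worker}} &\approx \frac{5 \times 10^{6}}{3600\ \text{s}}
  \approx 1389\ \tfrac{\text{writes}}{\text{s}}
\end{aligned}
```

If each timeline write is a ZADD plus a trim, as in the fan-out worker later in this section, each worker issues roughly twice that many Redis commands per second.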
Store post images and videos in object storage. Posts in Postgres reference media by URL — actual bytes served via CDN without touching your API servers.
Photos and videos are uploaded directly to S3 using pre-signed URLs. Post metadata (caption, user, timestamp, media URLs) is stored in Postgres. CDN serves media at the edge.
At 100M DAU posting 1 photo each, that's 100M images/day. Routing uploads through API servers saturates network bandwidth. With pre-signed URLs, the API only issues the URL; the upload bytes go straight to object storage.
Pre-signed URLs require an expiry time. For private content, short TTLs (15 minutes) add re-authentication overhead. Public content can use permanent CDN URLs.
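The expiry tradeoff is easier to see with a simplified signed-URL scheme. The sketch below is not S3's signing algorithm (SigV4); it is a minimal HMAC-based stand-in showing how an expiry timestamp is bound into the signature. The secret, host name, and query parameter names are all hypothetical, and the base URL is assumed to have no path prefix.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

/** Computes a hex HMAC-SHA256 over the object key and expiry time. */
function signature(objectKey: string, expiresAtSec: number, secret: string): string {
  return createHmac("sha256", secret).update(`${objectKey}\n${expiresAtSec}`).digest("hex");
}

/** Issues a URL that is valid until expiresAtSec (seconds since the Unix epoch). */
function signUploadUrl(baseUrl: string, objectKey: string, expiresAtSec: number, secret: string): string {
  const sig = signature(objectKey, expiresAtSec, secret);
  return `${baseUrl}/${encodeURIComponent(objectKey)}?expires=${expiresAtSec}&sig=${sig}`;
}

/** Accepts the URL only if it has not expired and the signature matches. */
function verifyUploadUrl(url: string, nowSec: number, secret: string): boolean {
  const parsed = new URL(url);
  const objectKey = decodeURIComponent(parsed.pathname.slice(1));
  const expires = Number(parsed.searchParams.get("expires"));
  const sig = parsed.searchParams.get("sig") ?? "";
  if (!Number.isFinite(expires) || nowSec > expires) return false;

  const given = Buffer.from(sig, "hex");
  const expected = Buffer.from(signature(objectKey, expires, secret), "hex");
  // Constant-time comparison avoids leaking how many leading bytes matched.
  return given.length === expected.length && timingSafeEqual(given, expected);
}

const secret = "demo-secret"; // hypothetical; a real secret never appears in source code
const ttlSec = 15 * 60; // the 15-minute TTL mentioned above
const issuedAt = 1_700_000_000;
const url = signUploadUrl("https://uploads.example.com", "user42/photo.jpg", issuedAt + ttlSec, secret);

console.log(verifyUploadUrl(url, issuedAt + 60, secret)); // true
console.log(verifyUploadUrl(url, issuedAt + ttlSec + 1, secret)); // false
```

Because the expiry is part of the signed payload, a client cannot extend a URL's lifetime by editing the query string; it has to ask the API for a new one. With AWS itself, the equivalent step would normally go through the SDK's presigner (for example `getSignedUrl` from `@aws-sdk/s3-request-presigner`) instead of hand-rolled signing.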
Instagram stores all media in S3, serving through a custom CDN with 99%+ cache hit rates. TikTok uses distributed object storage across multiple clouds for 500M daily video uploads.
S3 scales to exabytes. At 100M images/day × 2MB average = 200TB/day of new storage. CDN cache hit rates of 95%+ mean origin storage serves < 5% of reads.
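The daily figure compounds quickly, which is worth making explicit. The calculation below uses decimal units and assumes the upload rate and average size stay constant for a year, with no deletions or compression.

```latex
\begin{aligned}
S_{\text{day}} &= 10^{8}\ \tfrac{\text{images}}{\text{day}} \times 2\ \text{MB}
  = 2 \times 10^{8}\ \text{MB/day} = 200\ \text{TB/day} \\[4pt]
S_{\text{year}} &= 200\ \text{TB/day} \times 365\ \text{days}
  = 73{,}000\ \text{TB} \approx 73\ \text{PB/year}
\end{aligned}
```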
§2 Step 3 — Deep Dive
Each user has a Redis sorted set of post IDs (scored by timestamp). Feed read = ZREVRANGE of their timeline key. Fan-out write = ZADD to each follower's timeline.
| Strategy | Write cost | Read cost | Celebrity problem | Best for | Cost | Ops burden |
|---|---|---|---|---|---|---|
| Fan-out on write | O(followers) | O(1) Redis read | Write amplification | < 10K followers/user ✓ | Medium | Medium |
| Fan-out on read | O(1) | O(following) joins | None | Celebrity-heavy platforms | Low | Low |
| Hybrid (write normal, read celeb) | O(normal followers) | O(celeb following) | Solved | Twitter/Instagram scale ✓ | Medium | High |
| Ranked feed service | O(followers) | O(1) with ML rank | Write amplification | Personalized ranking | High | High |
| Pull-based aggregation | O(1) | O(following) | None | Low-frequency, small scale | Low | Medium |
Feed generation strategies — hybrid wins for celebrity-heavy platforms.
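The hybrid row depends on merging two newest-first lists at read time: the pre-computed timeline and the recent posts of followed celebrities. A minimal sketch of that merge follows, assuming each entry carries its timestamp; the `TimelineEntry` type and the function name are illustrative, not part of any library.

```typescript
// Hypothetical entry shape: a post ID plus the timestamp used as the sorted-set score.
interface TimelineEntry {
  postId: string;
  timestampMs: number;
}

/**
 * Merges two lists that are each already sorted newest-first into one
 * newest-first list of at most `limit` entries, skipping duplicate post IDs.
 */
function mergeNewestFirst(a: TimelineEntry[], b: TimelineEntry[], limit: number): TimelineEntry[] {
  const merged: TimelineEntry[] = [];
  const seen = new Set<string>();
  let i = 0;
  let j = 0;
  while (merged.length < limit && (i < a.length || j < b.length)) {
    // Take from `a` when `b` is exhausted or when a's head is at least as new as b's head.
    const takeFromA = j >= b.length || (i < a.length && a[i].timestampMs >= b[j].timestampMs);
    const next = takeFromA ? a[i++] : b[j++];
    if (!seen.has(next.postId)) {
      seen.add(next.postId);
      merged.push(next);
    }
  }
  return merged;
}

const precomputed = [
  { postId: "p3", timestampMs: 300 },
  { postId: "p1", timestampMs: 100 },
];
const celebrity = [
  { postId: "c2", timestampMs: 250 },
  { postId: "c1", timestampMs: 50 },
];
console.log(mergeNewestFirst(precomputed, celebrity, 3).map((e) => e.postId)); // [ 'p3', 'c2', 'p1' ]
```

The merge stops as soon as `limit` entries are collected, so its cost depends on the page size and not on how long either list is. In node-redis, the timestamps needed here can be read alongside the IDs with `zRangeWithScores`.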
import { createClient } from 'redis'
import { Kafka } from 'kafkajs'

const redis = createClient()
await redis.connect() // node-redis v4+ needs an explicit connect before any command
const kafka = new Kafka({ brokers: ['kafka:9092'] }) // consumer wiring for fanOutPost is omitted here

const CELEBRITY_THRESHOLD = 1_000_000 // 1M followers = celebrity
const TIMELINE_MAX_LENGTH = 1_000 // entries kept per timeline
const FANOUT_BATCH_SIZE = 1_000 // followers per MULTI/EXEC, so no single transaction blocks Redis for long

export async function getFeed(userId: string, limit = 20): Promise<string[]> {
  const timelineKey = `timeline:${userId}`

  // 1. Read the pre-computed timeline from the Redis sorted set (newest first)
  const postIds = await redis.zRange(timelineKey, 0, limit - 1, { REV: true })

  // 2. For users who follow celebrities, merge the celebrities' recent posts at read time
  const following = await getCelebrityFollows(userId)
  if (following.length > 0) {
    const celebPosts = await fetchRecentPosts(following, limit)
    return mergeSortedByTime([...postIds, ...celebPosts]).slice(0, limit)
  }
  return postIds
}

// Fan-out worker: consumes PostCreated events from Kafka
export async function fanOutPost(authorId: string, postId: string, timestamp: number): Promise<void> {
  const followerCount = await getFollowerCount(authorId)

  // Skip fan-out for celebrities; their posts are fetched at read time
  if (followerCount >= CELEBRITY_THRESHOLD) return

  const followers = await getFollowers(authorId)

  // Write in batches instead of one transaction covering every follower
  for (let i = 0; i < followers.length; i += FANOUT_BATCH_SIZE) {
    const pipeline = redis.multi()
    for (const followerId of followers.slice(i, i + FANOUT_BATCH_SIZE)) {
      const key = `timeline:${followerId}`
      pipeline.zAdd(key, { score: timestamp, value: postId })
      // Keep only the newest TIMELINE_MAX_LENGTH posts per timeline
      pipeline.zRemRangeByRank(key, 0, -(TIMELINE_MAX_LENGTH + 1))
    }
    await pipeline.exec()
  }
}

// Helpers assumed to exist elsewhere in the codebase
declare function getCelebrityFollows(userId: string): Promise<string[]>
declare function fetchRecentPosts(userIds: string[], limit: number): Promise<string[]>
declare function mergeSortedByTime(postIds: string[]): string[]
declare function getFollowerCount(userId: string): Promise<number>
declare function getFollowers(userId: string): Promise<string[]>

| Component | Why Add It | Tradeoff |
|---|---|---|
| Redis for Pre-Computed Timelines | Building a feed from the database requires expensive JOINs across posts, follows, and ranking tables. | Fan-out-on-write has write amplification: 1 post by someone with 1M followers = 1M Redis writes. |
| Kafka for Async Fan-Out Workers | Synchronous fan-out makes post latency proportional to follower count. | Async fan-out means followers see posts with a short delay (seconds). |
| Object Storage for Media | At 100M DAU posting 1 photo each, that's 100M images/day. | Pre-signed URLs require an expiry time. |
Design decision tradeoffs
Postgres goes down. Users with pre-computed Redis timelines can still read feeds. What percentage of users are affected?
api-1 crashes. Users connected to it see feed loading errors. How do you implement health checks, session affinity, and graceful shutdown so existing connections drain and new requests route to api-2?
A celebrity with 50M followers posts a photo. The fan-out service must write to 50M follower feeds. This saturates the write pipeline and delays other users' feeds by minutes. How do you implement lazy fan-out, hybrid push/pull, and priority queuing for celebrity accounts?
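One possible starting point for the celebrity question is to route each post into a delivery lane by follower count and to split large fan-outs into bounded chunks. The sketch below is only one way to frame it; the lane names, thresholds, and chunk size are all hypothetical and would need tuning against real traffic.

```typescript
// Hypothetical delivery lanes:
//  - "push-high": normal accounts, fanned out immediately
//  - "push-low":  large accounts, fanned out from a separate lower-priority queue
//  - "pull":      celebrity accounts, never fanned out; merged into feeds at read time
type Lane = "push-high" | "push-low" | "pull";

interface LaneThresholds {
  bulkFollowers: number; // at or above this, use the low-priority lane
  celebrityFollowers: number; // at or above this, skip fan-out entirely
}

/** Picks a delivery lane from the author's follower count. */
function chooseLane(followerCount: number, thresholds: LaneThresholds): Lane {
  if (followerCount >= thresholds.celebrityFollowers) return "pull";
  if (followerCount >= thresholds.bulkFollowers) return "push-low";
  return "push-high";
}

/** Splits a follower list into chunks so each unit of fan-out work stays small. */
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("chunk size must be positive");
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

const thresholds: LaneThresholds = { bulkFollowers: 10_000, celebrityFollowers: 1_000_000 };
console.log(chooseLane(500, thresholds)); // push-high
console.log(chooseLane(50_000, thresholds)); // push-low
console.log(chooseLane(50_000_000, thresholds)); // pull
console.log(chunk(["f1", "f2", "f3", "f4", "f5"], 2)); // [ [ 'f1', 'f2' ], [ 'f3', 'f4' ], [ 'f5' ] ]
```

Keeping the low-priority lane on its own queue means a large fan-out can only delay other large fan-outs, not ordinary users' posts. Chunking lets many workers share one big job and makes retries cheap, since a failed chunk can be replayed without redoing the rest.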
§3 Step 4 — Wrap Up
| Decision | Choice | Why |
|---|---|---|
| Redis for Pre-Computed Timelines | Each user has a Redis sorted set of post IDs (scored by timestamp). | Building a feed from the database requires expensive JOINs across posts, follows, and ranking tables. |
| Kafka for Async Fan-Out Workers | Write API publishes PostCreated to Kafka. | Synchronous fan-out makes post latency proportional to follower count. |
| Object Storage for Media | Photos and videos are uploaded directly to S3 using pre-signed URLs. | At 100M DAU posting 1 photo each, that's 100M images/day. |
Key design decisions