Every "Design X" question is a costume. Underneath, the same handful of problems repeat: fan-out, geospatial lookup, collaborative state, strong-consistency contention, search, streaming aggregation. Learn to see the pattern under the prompt and a question you've never seen becomes one you've already solved. This atlas gives you the patterns, the questions each one unlocks, the metrics to quote, and the exact sentences that make you sound like you've run these in production.
Before the patterns, the rhythm. Whatever the prompt, you walk these phases in this order. The pattern (next sections) just tells you what to say inside each phase. The phases never change — which is exactly why they're worth burning into muscle memory.
1. Functional + non-functional. Pin the scale number. State what's out of scope. (~5 min)
2. Back-of-envelope: QPS, storage, bandwidth. Reads:writes ratio drives everything. (~3 min)
3. Define the interface and core entities. Pick the storage shape. (~4 min)
4. Boxes, arrows, data flow. Satisfy each functional requirement once. (~12 min)
5. Attack the bottleneck the pattern predicts. Trade-offs, two options, commit. (~15 min)
6. Bottlenecks, failure modes, observability, rollout. Volunteer it. (~6 min)

This is the picture to burn into memory. When a prompt lands, you're not searching 64 systems. You're walking one tree: from the root, pick the branch whose tell matches, and the leaf you were asked about is sitting on a branch you've already studied. The highlighted leaf on each branch is the worked chapter in this book.
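The back-of-envelope phase (QPS, storage, bandwidth) is plain arithmetic, and it helps to have the shape of it memorized. A minimal sketch follows. The `Workload` fields, the `estimate` helper, and the peak multiplier of 2 are illustrative assumptions, not figures from this atlas.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Workload:
    """Hypothetical inputs you would pin down in the requirements phase."""
    daily_active_users: int
    writes_per_user_per_day: int
    reads_per_write: int          # the read:write ratio
    bytes_per_write: int
    peak_factor: float = 2.0      # assumed peak-to-average multiplier


def estimate(w: Workload) -> dict:
    """Return average/peak QPS, yearly storage, and read bandwidth."""
    writes_per_day = w.daily_active_users * w.writes_per_user_per_day
    avg_write_qps = writes_per_day / SECONDS_PER_DAY
    avg_read_qps = avg_write_qps * w.reads_per_write
    return {
        "avg_write_qps": avg_write_qps,
        "avg_read_qps": avg_read_qps,
        "peak_read_qps": avg_read_qps * w.peak_factor,
        "storage_bytes_per_year": writes_per_day * w.bytes_per_write * 365,
        "read_bandwidth_bytes_per_s": avg_read_qps * w.bytes_per_write,
    }


if __name__ == "__main__":
    # Made-up example: 10M DAU, 2 writes each, 100 reads per write, 1 KB records.
    numbers = estimate(Workload(10_000_000, 2, 100, 1_000))
    for name, value in numbers.items():
        print(f"{name}: {value:,.0f}")
```

With these made-up inputs, writes average roughly 231 per second and storage grows by about 7.3 TB per year. In the interview, the read:write ratio is the input worth stating out loud, because it decides whether you spend the deep dive on caching or on write throughput.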
For each cluster: the tell (how to recognize it), the questions it unlocks, the clarifying questions to ask, the core moves, the metrics to quote, a line that lands, and the trap to avoid. Color-coded so you can find your way back fast.
- Trap: a `WHERE distance < r` filter over all rows. It signals you've never indexed spatial data. Name geohash or S2 in the first sentence.
- Trap: a `LIKE '%term%'` scan. It can't use an index and won't scale; reach for the inverted index or trie immediately.
- Trap: a single `UPDATE counter SET n = n+1` row. That hot row is the whole problem; sharded counters or a stream aggregate is the expected answer.

The 64-question canon, mapped to its dominant pattern and difficulty tier. In the interview you do the same map in your head: hear the prompt, name the pattern, run the method. This table is just that reflex, written down.
| Question | Tier | Dominant pattern | Crux in one line |
|---|---|---|---|
| TinyURL / Pastebin | T1 | Search / KV | ID encoding, read amplification, when to cache. |
| API / distributed rate limiter | T1 | Consistency | Atomic counter across nodes; token bucket; fail-open vs closed. |
| Unique ID generator | T1 | Consistency | Snowflake; clock skew; time-ordering vs uniqueness. |
| Typeahead / autocomplete | T1 | Search | Trie, precomputed top-K, <100ms budget. |
| API gateway | T1 | Consistency | Auth, routing, throttling at the edge. |
| Twitter / timeline | T2 | Fan-out | Push vs pull vs hybrid; the celebrity problem. |
| | T2 | Fan-out + Media | Feed fan-out plus image/video storage tiers + CDN. |
| Facebook Newsfeed | T2 | Fan-out (ranked) | Feature store + ML scoring over candidates. |
| | T2 | Fan-out | Threaded comments + time-decay "hot" ranking. |
| Messenger / WhatsApp | T2 | Realtime msg | Delivery semantics, ordering, offline inbox, E2E. |
| Dropbox | T2 | Blobs / Media | Chunking, dedup, delta sync, conflict resolution. |
| Yelp / Nearby Friends | T2 | Geospatial | Geohash / quadtree / S2 indexing. |
| Uber / Lyft | T2 | Geospatial + match | Live geo-index + real-time dispatch + surge. |
| Web crawler | T2 | Search | URL frontier, dedup (Bloom), politeness budget. |
| Twitter / Google Search | T2 | Search | Inverted index, sharding, ranking, freshness. |
| Ticketmaster | T2 | Consistency | No double-booking; distributed lock / hold TTL. |
| Google Calendar | T2 | Consistency | Recurring events, time zones, invite propagation. |
| Discord / Twitch chat | T3 | Realtime msg | Hierarchical fan-out; lossy delivery at scale. |
| Google Docs / Miro | T3 | Collaboration | OT vs CRDT; sub-100ms sync; convergence. |
| ChatGPT | T3 | Streaming + serving | GPU scheduling, KV-cache reuse, token streaming. |
| Notification system | T3 | Fan-out | Multi-channel, preferences, retries, provider failover. |
| Netflix recommendations | T3 | Streaming / ML | Candidate gen → features → serving → A/B. |
| Gmail | T3 | Search | Search-over-personal-data, threading, spam. |
| Google News aggregator | T3 | Streaming | Crawl → dedup → cluster → rank, continuously. |
| LeetCode judge | T3 | Async jobs | Sandboxing, isolation, queue dispatch, caching. |
| Code deployment | T3 | Async jobs | Blue-green, canary, rollback orchestration. |
| Metrics / Datadog | T3 | Streaming | TSDB, columnar storage, hierarchical aggregation. |
| LinkedIn / People You May Know | T3 | Fan-out / graph | Graph hops at billion-scale; offline precompute. |
| Airbnb | T3 | Consistency | Two-sided marketplace; availability + booking. |
| Reminder alert system | T3 | Scheduling | Timing wheels; billions of future tasks; crash safety. |
| YouTube / Netflix (video) | T4 | Media | Transcoding, adaptive bitrate, CDN economics. |
| Distributed cache (Redis) | T4 | Consistency / KV | Sharding, eviction, stampede, hot keys. |
| Key-value store (DynamoDB) | T4 | Consistency | Quorum R/W, vector clocks, hinted handoff. |
| Amazon S3 | T4 | Blobs | Erasure coding, multi-region, read-after-write. |
| Payment system / Stripe | T4 | Consistency | Idempotency, double-entry, reconciliation. |
| Flash sale | T4 | Consistency | Fairness + inventory under extreme contention. |
| Google Ads / click aggregator | T4 | Streaming | Real-time auction + exactly-once click counting. |
| Stock exchange | T4 | Consistency | Microsecond latency, deterministic matching. |
| YouTube likes counter | T4 | Streaming / counting | Sharded counters; no hot row. |
| Distributed lock / job scheduler / cron | T4 | Scheduling | Exactly-once, leader election, fencing tokens. |
| Dynamo / Cassandra / Kafka / Chubby / GFS / HDFS / BigTable | T5 | Foundational papers | Read as papers: consensus, logs, consistent hashing, chunked storage. |
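
The "sharded counters; no hot row" crux from the likes-counter row can be sketched in a few lines. This is an in-memory illustration under stated assumptions: the `ShardedCounter` class is hypothetical, and each list slot stands in for a separate row or key in a real store, so concurrent writers would contend on different rows instead of one.

```python
import random


class ShardedCounter:
    """Spread increments over N shards; read by summing all shards.

    Assumption: each element of `shards` models an independent row/key
    in a database, so no single row absorbs every write.
    """

    def __init__(self, num_shards: int = 16) -> None:
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        self.shards = [0] * num_shards

    def increment(self, amount: int = 1) -> None:
        # Pick a shard at random so writes spread evenly across rows.
        idx = random.randrange(len(self.shards))
        self.shards[idx] += amount

    def total(self) -> int:
        # Reads pay the cost: one lookup per shard, summed.
        return sum(self.shards)


if __name__ == "__main__":
    likes = ShardedCounter(num_shards=8)
    for _ in range(1_000):
        likes.increment()
    print(likes.total())
```

The trade-off to say out loud: writes get cheaper by roughly the shard count, while reads get more expensive and are usually served from a cached or periodically aggregated total.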