Part 9 · Case Studies — Designing Real Systems

The previous eight parts built your vocabulary: caching, replication, sharding, queues, consensus, rate limiting. This part puts the vocabulary to work. Each page here is a case study — a full “design X” walkthrough — and they all follow the same skeleton. The skeleton matters more than any single answer, because a system design interview (or a real design doc) is not a trivia quiz. It is a test of whether you can take a vague prompt and drive it toward a defensible architecture under constraints.

The framework: six moves, in order

The single biggest mistake is jumping straight to boxes-and-arrows. Resist it. Walk these six steps in order, out loud, every time.

1. CLARIFY      What are we actually building? (functional + non-functional)
2. ESTIMATE     How big is it? (QPS, storage, bandwidth — back-of-envelope)
3. API          What are the endpoints? (the contract clients depend on)
4. DATA MODEL   What do we store, and how is it shaped/keyed?
5. ARCHITECTURE High-level boxes and arrows (the happy path)
6. DEEP DIVE    Find the bottleneck, scale it, name the trade-offs
   └─ thread throughout: what does each choice BUY us, and what does it COST?

1. Clarify requirements

Split requirements into two buckets. Functional requirements are what the system does — “shorten a URL,” “deliver a message,” “return autocomplete suggestions.” Non-functional requirements are the qualities it must have while doing it — latency, availability, consistency, durability, scale. Non-functional requirements are where the design actually lives: “a chat app” tells you almost nothing, but “100M users, sub-200ms delivery, messages must never silently disappear” tells you almost everything.

2. Back-of-envelope estimation

Turn the scale into numbers you can design against. You need three:

QPS: daily active users × actions per user per day ÷ 86,400 seconds per day, then multiply by a peak factor (typically 2–3×). Distinguish read QPS from write QPS.
Storage: bytes per record × records per day × retention in days. Project to years.
Bandwidth: QPS × payload size in bytes, giving bytes per second.

You are not chasing precision; you are chasing the order of magnitude that tells you whether one box suffices or you need a sharded fleet. See Back-of-Envelope Estimation and the latency numbers every engineer should know.
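The three formulas above can be sketched in a few lines of Python. The workload figures (100M daily active users, 2 writes per user per day, 500-byte records, 5-year retention, a 3× peak factor) are illustrative assumptions, not numbers from any real system.

```python
SECONDS_PER_DAY = 86_400


def peak_qps(daily_active_users: int, actions_per_user_per_day: float,
             peak_factor: float = 2.0) -> float:
    """Average requests per second, scaled up by a peak factor."""
    average = daily_active_users * actions_per_user_per_day / SECONDS_PER_DAY
    return average * peak_factor


def storage_bytes(bytes_per_record: int, records_per_day: int,
                  retention_days: int) -> int:
    """Total bytes held once the retention window is full."""
    return bytes_per_record * records_per_day * retention_days


def bandwidth_bytes_per_sec(qps: float, payload_bytes: int) -> float:
    """Bytes per second moved at the given request rate."""
    return qps * payload_bytes


# Hypothetical workload (every number here is an assumption).
dau = 100_000_000
writes_per_user = 2
record_size = 500  # bytes

write_qps = peak_qps(dau, writes_per_user, peak_factor=3)
total_storage = storage_bytes(record_size, dau * writes_per_user, 5 * 365)
write_bandwidth = bandwidth_bytes_per_sec(write_qps, record_size)

print(f"peak write QPS  ~ {write_qps:,.0f}")
print(f"5-year storage  ~ {total_storage / 1e12:,.1f} TB")
print(f"write bandwidth ~ {write_bandwidth / 1e6:,.1f} MB/s")
```

Under these assumptions the result is roughly 7,000 peak writes per second and on the order of 180 TB over five years. That is enough to say the write path is modest while the storage will not fit on a single machine.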

3. API sketch

A handful of endpoints — method, path, key params, return shape. This forces you to name the core operations and exposes hidden requirements (pagination, auth, idempotency keys). Keep it small; three to five endpoints is plenty.
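As one way to write such a sketch down, the snippet below lists four endpoints for a hypothetical URL shortener as plain data. The paths, parameter names, and return shapes are all invented for illustration; the point is only that pagination, auth, and idempotency keys surface once each operation is named.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    method: str
    path: str
    params: tuple[str, ...]
    returns: str
    notes: str = ""


# Hypothetical contract for a URL shortener (names are assumptions).
API = [
    Endpoint("POST", "/v1/urls", ("long_url", "idempotency_key"),
             "{short_code}", "auth required; key makes retries safe"),
    Endpoint("GET", "/v1/urls/{short_code}", (),
             "302 redirect to long_url"),
    Endpoint("GET", "/v1/users/{user_id}/urls", ("cursor", "limit"),
             "{items, next_cursor}", "auth required; paginated"),
    Endpoint("DELETE", "/v1/urls/{short_code}", (),
             "204 No Content", "auth required"),
]

for e in API:
    print(f"{e.method:<6} {e.path:<26} params={list(e.params)} -> {e.returns}")
```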

4. Data model

What entities exist, what fields they carry, and — crucially — what you key and index on. The access pattern dictates the model, not the other way around. Decide SQL vs NoSQL here, and anticipate your partition key before you need it.
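A minimal sketch of this step, again for a hypothetical URL shortener: one record type, its key, and a stable hash that maps the key to a partition. The field names, the choice of `short_code` as partition key, and the partition count of 16 are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass

NUM_PARTITIONS = 16  # assumed fleet size, for illustration only


@dataclass(frozen=True)
class ShortUrl:
    short_code: str  # primary key and partition key: the redirect lookup
    long_url: str
    owner_id: str    # candidate secondary index: "list my links"
    created_at: int  # unix seconds


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


row = ShortUrl("abc123", "https://example.com/a/very/long/path", "user-42", 1_700_000_000)
print(partition_for(row.short_code))
```

Keying on `short_code` buys a single-partition read for the dominant access pattern (the redirect). It costs something on the other pattern: listing one owner's links now needs a secondary index or a query across all partitions.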

5. High-level design

Now draw the boxes: clients → load balancer → stateless app tier → caches → databases → async workers. Show the happy path of one request first. Lean on the building blocks you already know: load balancers, caching, CDNs, replication.
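The happy path of one read can be traced in code as well as in a diagram. The sketch below assumes a cache-aside read and uses in-memory stand-ins for the cache and database; the class and function names are hypothetical, and a real deployment would put network clients behind the same interfaces.

```python
class Database:
    """In-memory stand-in for the database tier; counts reads for illustration."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)


class Cache:
    """In-memory stand-in for the cache tier."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


def handle_read(key, cache, db):
    """Stateless handler: try the cache, fall back to the database, then fill the cache."""
    value = cache.get(key)
    if value is not None:
        return value
    value = db.get(key)
    if value is not None:
        cache.set(key, value)
    return value


db = Database({"abc123": "https://example.com"})
cache = Cache()
first = handle_read("abc123", cache, db)   # miss: goes to the database
second = handle_read("abc123", cache, db)  # hit: served from the cache
print(first, second, db.reads)
```

Because the handler keeps no state of its own, any app server behind the load balancer can serve any request, which is what lets that tier scale horizontally.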

6. Deep dive and trade-offs

Pick the part that breaks first under your estimated load and fix it: hot keys, the celebrity fan-out, connection limits, write amplification. Then state the trade-offs explicitly. This is the move that separates a senior answer from a junior one — every choice you made bought something (latency, simplicity, scale) and cost something (consistency, money, operational burden). Saying so out loud proves you understand the design rather than reciting it.
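Finding what breaks first usually starts with measurement. As a small illustration for the hot-key case, the sketch below counts keys in a sample of requests and reports any key that takes more than a chosen share of traffic. The 10% threshold and the sample data are arbitrary assumptions.

```python
from collections import Counter


def hot_keys(sampled_keys, share_threshold=0.1):
    """Return (key, traffic share) pairs at or above the threshold, hottest first."""
    counts = Counter(sampled_keys)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (key, count / total)
        for key, count in counts.most_common()
        if count / total >= share_threshold
    ]


# Hypothetical sample: one celebrity account dominates the reads.
sample = ["celebrity"] * 60 + ["user-1"] * 5 + ["user-2"] * 5 + [f"u{i}" for i in range(30)]
for key, share in hot_keys(sample):
    print(f"{key}: {share:.0%} of sampled requests")
```

Whatever mitigation follows (a dedicated cache entry, key splitting, a different fan-out strategy for that account) is a trade-off to state out loud: it buys headroom on the hot key and costs extra code paths to operate.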

The roadmap: eight case studies

This part contains eight designs: four covered in depth here, plus four companions in this same directory. Each reuses the framework above; together they cover the major archetypes you’ll meet.

Case study	Archetype it teaches
Design a URL Shortener	Read-heavy KV store, unique ID generation, caching
Design a News Feed	Fan-out, the celebrity problem, ranking
Design a Chat System	Stateful connections, ordering, real-time delivery
Design a Rate Limiter	Distributed counters, algorithm trade-offs
Design a Notification System	Multi-channel fan-out, queues, retries
Design a Typeahead Autocomplete	Tries, prefix search, latency budgets
Design a Web Crawler	Frontier queues, politeness, dedup at scale
Design a Payment System	Idempotency, exactly-once, consistency & audit

Read the four deep dives first. They establish patterns — caching the hot path, fanning out work, holding stateful connections, counting under contention — that the companion four recombine.

The thread

What does this buy us, and what does it cost? Carry that question through every page. A framework is only useful if it makes the trade-offs visible: estimation buys you the right to choose, the API buys you a contract, the data model buys you predictable access, and the deep dive is where you pay — in consistency, in money, in complexity — for the scale you asked for. Master the six moves and any “design X” prompt becomes the same problem wearing a different hat.

Check your understanding

Name the six moves of the framework in order. Why is jumping straight to the architecture diagram a mistake?
What is the difference between functional and non-functional requirements, and why do the non-functional ones do most of the design work?
Why should you estimate read QPS and write QPS separately, and what does the ratio between them tell you about the design?
How do you turn “100M daily active users” into a peak write-QPS figure?
Give an example of a design choice and state explicitly what it buys and what it costs.