Read Replicas & CQRS

Most systems read far more than they write — ratios of 10:1, 100:1, even 1000:1 are common. A user posts a photo once and it’s viewed a million times; a product is edited rarely and browsed constantly. This asymmetry is a gift, because reads are easy to scale by copying. A write must go to one authoritative place, but a read can be served from any copy that’s good enough. Read replicas exploit exactly this, and CQRS pushes the idea to its logical conclusion.

Read replicas: copies that serve reads

A read replica is a continuously updated copy of your database that handles read queries while the primary handles all writes. The primary streams its changes to the replicas (see Replication); the application sends writes to the primary and spreads reads across the replicas.

   writes ─────────────► [ PRIMARY ] ──┬─► [ replica A ] ─┐
                                        ├─► [ replica B ] ─┼─► reads
                                        └─► [ replica C ] ─┘
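
The routing in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not a production router: the class name `ReplicaRouter`, its methods, and the server names are all hypothetical, and plain strings stand in for real database connections. It sends every write to the primary, rotates reads across replicas round-robin, and skips replicas marked as down.

```python
class ReplicaRouter:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self.healthy = set(self.replicas)
        self._cursor = 0  # round-robin position

    def for_write(self):
        # There is exactly one writer, no matter how many replicas exist.
        return self.primary

    def mark_down(self, replica):
        # Read availability: a failed replica is simply skipped.
        self.healthy.discard(replica)

    def for_read(self):
        # Try each replica at most once, starting at the cursor.
        for _ in range(len(self.replicas)):
            candidate = self.replicas[self._cursor % len(self.replicas)]
            self._cursor += 1
            if candidate in self.healthy:
                return candidate
        # No healthy replica (or none configured): fall back to the primary.
        return self.primary


router = ReplicaRouter("primary", ["replica-a", "replica-b", "replica-c"])
print(router.for_write())                      # the primary
print([router.for_read() for _ in range(3)])   # one read per replica
```

Adding a replica to the list adds read capacity, while `for_write` still has only one possible answer.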

What does this buy us, and what does it cost? It buys read throughput that scales with the number of replicas, read availability (lose a replica, route around it), and a natural place to run heavy analytics without slowing the primary. It costs us two things that turn out to matter a lot:

The single writer is untouched. Replicas scale reads, not writes. If your write load is the wall, replicas do nothing — you’ll need partitioning or sharding (see Database Scaling Patterns).
Replication lag is the consistency cost, and it deserves its own section.

Replication lag and read-your-writes

Replication is asynchronous by default: the primary acknowledges a write and then the change flows to the replicas, typically arriving milliseconds to seconds later, and later still when a replica is overloaded or the network is slow. In that window, a replica is stale: it doesn’t yet know about the latest write. Usually that’s fine. Sometimes it produces a baffling, trust-destroying bug:

   t0  user updates their profile name ───► PRIMARY  (ack: "saved!")
   t1  page reloads, reads from ──────────► REPLICA  (lag: still old name)
   →  user sees their OLD name and thinks the save failed

This is the read-your-writes problem: a user expects to see the effects of their own writes immediately, and a lagging replica breaks that expectation. Other users can tolerate slight staleness. The asymmetry is the key insight: for the same data, only the writer’s own reads need to be fresh; everyone else’s can briefly lag.
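
One common way to act on that asymmetry is to remember, per user, the log position of their latest write and to serve that user from a replica only once the replica has applied at least that position. The sketch below is an in-memory illustration under stated assumptions: `FakePrimary`, `FakeReplica`, and `SessionRouter` are hypothetical names, and the integer `position` stands in for whatever log position a real database exposes (PostgreSQL's LSN, for example). Comparing positions avoids guessing how long lag will last.

```python
class FakePrimary:
    """In-memory stand-in for a primary; each write advances a log position."""

    def __init__(self):
        self.position = 0
        self.data = {}

    def write(self, key, value):
        self.position += 1
        self.data[key] = value
        return self.position


class FakeReplica:
    """In-memory stand-in for a replica that lags until catch_up() is called."""

    def __init__(self):
        self.applied_position = 0
        self.data = {}

    def catch_up(self, primary):
        self.data = dict(primary.data)
        self.applied_position = primary.position


class SessionRouter:
    """Hypothetical router that gives each user read-your-writes."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica
        self.last_write = {}  # user_id -> position of that user's latest write

    def write(self, user_id, key, value):
        self.last_write[user_id] = self.primary.write(key, value)

    def read(self, user_id, key):
        needed = self.last_write.get(user_id, 0)
        if self.replica.applied_position >= needed:
            return self.replica.data.get(key), "replica"
        # The replica has not yet seen this user's own write: use the primary.
        return self.primary.data.get(key), "primary"


primary, replica = FakePrimary(), FakeReplica()
router = SessionRouter(primary, replica)
router.write("alice", "alice:name", "Old")
replica.catch_up(primary)
router.write("alice", "alice:name", "New")   # replica is now one write behind
print(router.read("alice", "alice:name"))    # alice is routed to the primary
print(router.read("bob", "alice:name"))      # bob may read the stale replica
```

Only the writer pays the cost of a primary read, and only until the replica catches up; every other reader keeps using the replicas.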

CQRS: separate the read model from the write model

Read replicas give every reader the same shape of data as the writer — same tables, same schema, just a copy. CQRS (Command Query Responsibility Segregation) goes further: it says the optimal shape for writing is often not the optimal shape for reading, so use two different models.

The write side (commands) is normalized, validated, transactional — optimized for correctness.
The read side (queries) is denormalized, pre-joined, pre-aggregated — optimized for fast reads. It’s often a different store entirely (a search index, a document store, a materialized view), kept in sync from the write side via events or replication.

   commands ─► [ WRITE MODEL ]  (normalized, source of truth)
                     │  (events / change stream)
                     ▼
               [ READ MODEL(S) ] ─► queries  (denormalized, fast, per-use-case)

What does this buy us, and what does it cost? It buys read performance you simply can’t get from a single shared schema: each read model is shaped for exactly one query pattern, so no expensive joins or aggregations at read time. You can have several read models, each tuned for a different screen. It costs you a great deal:

Eventual consistency between the models — the read side lags the write side, the same read-your-writes problem as replicas but more pronounced.
Operational and conceptual complexity — two models to keep in sync, a pipeline between them, more moving parts to monitor and debug.
More code — you maintain the projection logic that turns writes into read models.
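
To make the projection logic concrete, here is a minimal in-memory sketch. Everything in it is an assumption chosen for illustration: the product-and-review domain, the class names `WriteModel` and `ProductPageReadModel`, and the tuple-shaped events. The write side validates commands and records events; the read side consumes those events and maintains a pre-aggregated view for one screen.

```python
class WriteModel:
    """Normalized source of truth: validates commands, then emits events."""

    def __init__(self):
        self.products = {}  # product_id -> name
        self.reviews = []   # (product_id, rating) rows
        self.events = []    # change stream consumed by projections

    def add_product(self, product_id, name):
        if product_id in self.products:
            raise ValueError("duplicate product")
        self.products[product_id] = name
        self.events.append(("ProductAdded", product_id, name))

    def add_review(self, product_id, rating):
        if product_id not in self.products:
            raise ValueError("unknown product")
        if not 1 <= rating <= 5:
            raise ValueError("rating must be between 1 and 5")
        self.reviews.append((product_id, rating))
        self.events.append(("ReviewAdded", product_id, rating))


class ProductPageReadModel:
    """Denormalized view for one screen: name, review count, average rating."""

    def __init__(self):
        self.pages = {}
        self.offset = 0  # number of events applied so far

    def project(self, events):
        # Apply only the events this read model has not seen yet.
        for kind, product_id, payload in events[self.offset:]:
            if kind == "ProductAdded":
                self.pages[product_id] = {"name": payload, "reviews": 0, "rating_sum": 0}
            elif kind == "ReviewAdded":
                page = self.pages[product_id]
                page["reviews"] += 1
                page["rating_sum"] += payload
            self.offset += 1

    def get_page(self, product_id):
        # No join or aggregation at read time: the numbers are already there.
        page = self.pages[product_id]
        avg = page["rating_sum"] / page["reviews"] if page["reviews"] else None
        return {"name": page["name"], "reviews": page["reviews"], "avg_rating": avg}


write_side, read_side = WriteModel(), ProductPageReadModel()
write_side.add_product("p1", "Lamp")
write_side.add_review("p1", 4)
read_side.project(write_side.events)
write_side.add_review("p1", 5)        # not yet projected: the read side is stale
print(read_side.get_page("p1"))
read_side.project(write_side.events)  # the read side catches up
print(read_side.get_page("p1"))
```

The gap between the two `project` calls is the eventual consistency listed above, and the `project` method is the extra code the team has to own.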

Where on the ladder these sit

Read replicas are an early, cheap rung: they slot in once a cache isn’t enough and reads dominate. CQRS is a later, deliberate move for systems whose read and write workloads have diverged so much that one schema serving both has become the bottleneck. Both are answers to the read side of scaling; neither helps the write side, which is the subject of Database Scaling Patterns.

Check your understanding

Why do read replicas scale reads but do nothing for a write-bound system?
Explain the read-your-writes problem and why the fix targets only the user’s own data rather than making everything consistent.
Why is it dangerous to design as if replication lag were a small fixed number?
How does CQRS differ from “just adding read replicas”? What does it add that replicas don’t?
Give a concrete signal that a system actually needs CQRS rather than replicas plus a cache.