Skip to content

SQL vs NoSQL

This is the most over-argued decision in our field, and most of the argument is noise. “SQL is legacy.” “NoSQL doesn’t scale your team.” Both camps are defending a tool as if it were a tribe. Strip the tribalism away and you find a genuine engineering fork — but the fork is not “relational versus not.” It’s how do you read and write your data, and how much do you need the database to guarantee correctness for you?

What does each side buy us, and what does it cost? That’s the only question that matters.

The relational model (Postgres, MySQL, SQL Server, and friends) is built on four pillars that have survived fifty years for good reason:

  • A schema. Every row in a table has the same shape, declared up front and enforced on write. A column is an INTEGER or it’s rejected. This is a constraint, and constraints are a feature: the database refuses to store nonsense.
  • Joins. Data is split into normalized tables (no duplication), and you reassemble it at read time with relational joins. You store each fact once and combine facts on demand.
  • ACID transactions. A group of changes either all happen or none do, and concurrent users don’t trip over each other (see Transactions & ACID).
  • A declarative query language. You say what you want; the query planner figures out how. Add an index years later and old queries get faster with zero code changes.

The cost: the schema is rigid (changing it on a billion-row table is a real operation), and the classic relational model was designed to scale up (a bigger box) more naturally than out (many boxes). Joins and strong transactions are hard to keep cheap once data is spread across machines — which is exactly the wall that birthed NoSQL.

“NoSQL” is an unfortunate umbrella covering four quite different shapes:

Document {nested JSON per record} MongoDB, Couchbase
Key-Value key → opaque blob Redis, DynamoDB, Riak
Wide-Column rows with flexible columns Cassandra, HBase, Bigtable
Graph nodes + edges Neo4j, Neptune

What most of them trade for is the same thing: they relax the relational guarantees in exchange for flexibility and horizontal scale.

  • Flexible / no enforced schema. Each document can have different fields. Great when your data is genuinely heterogeneous or evolving fast; the responsibility for shape just moves into your application code instead of the database.
  • Built to scale out. Many were designed from day one to be partitioned across hundreds of nodes, often with tunable, weaker consistency so that no single coordinator becomes a bottleneck.
  • Access-pattern-shaped storage. A document store lets you keep an entire aggregate (an order plus its line items) as one record, so a common read is a single fast lookup — no join.

The cost is the mirror image of SQL’s gifts: you often denormalize (store data multiple times), you give up cross-record transactions and rich ad-hoc joins, and you must know your queries before you model your data, because changing access patterns later can mean re-storing everything.

The real axis: access patterns + consistency

Section titled “The real axis: access patterns + consistency”

Here is the reframe that cuts through the dogma. Don’t ask “which technology.” Ask two questions:

  1. What are my access patterns? Do I read whole self-contained objects by key (document/KV shines), or do I slice and recombine data along many dimensions with queries I can’t predict (relational shines)? Are my relationships the point — friend-of-a-friend, fraud rings — (graph shines)?
  2. How much consistency do I need? Must a transfer debit one account and credit another atomically, or never (strong, relational — see Consensus)? Or is “the like count is right within a second or two” perfectly fine (eventual consistency, common in NoSQL)?
SituationLean toward
Money, inventory, anything that must balanceSQL (strong ACID)
Unpredictable, ad-hoc queries and reportingSQL (joins + planner)
Massive write volume, simple key lookupsNoSQL (KV / wide-column)
Self-contained documents read as a unitNoSQL (document)
Relationship traversal is the core featureNoSQL (graph)
Strict, slowly-changing record shapeSQL (enforced schema)
Rapidly evolving or heterogeneous recordsNoSQL (flexible schema)

A database’s job is to store your data and hand it back the way you need it. SQL hands you strong guarantees and query flexibility at the price of rigidity and harder horizontal scale; NoSQL hands you flexibility and out-the-box scale at the price of those guarantees and the burden of knowing your queries upfront. The decision isn’t ideological — it’s a clear-eyed match between how your application reads and writes and how much correctness you need the database to enforce for you. For the wider menu of engines this fork sits inside, see the Databases field guide.

  1. Name the four pillars of the relational model and the cost each one carries.
  2. Why is “NoSQL” a misleading category? List the four sub-types and what each is good at.
  3. Restate the SQL-vs-NoSQL decision as a question about access patterns and consistency.
  4. Explain schema-on-write vs schema-on-read. Why is “there is always a schema” true even for a schemaless database?
  5. Why is a single relational database often the right first choice, and what signal tells you it’s time to add or switch to NoSQL?