Consistency Models
Strong vs eventual consistency, and the in-between guarantees that make apps feel correct.
A consistency model is the promise a system makes about when a write becomes visible to later reads. Once data is replicated across machines, regions, caches, search indexes, and read replicas, there is no single obvious answer. Stronger promises are easier for humans to reason about, but usually cost latency, availability, or throughput.
The problem: replicas create stale reads
Replication improves scale and availability, but updates take time to travel. During that window, different readers can see different values. The system is not necessarily broken; it is following whichever consistency model it promised.
timeline:
t0 user updates profile name from "Ava" to "Ava Chen" on primary
t1 primary commits and returns success
t2 user refreshes profile page
t3 read is served by a lagging replica that still has "Ava"
t4 replica catches up and later reads show "Ava Chen"
User experience:
"I saved the change, but the app showed the old value."
Missing guarantee:
read-your-writes
Ignoring consistency models leads to surprising product bugs: a user sees an old profile after saving, a comment reply appears before the original comment, a shopping cart loses an item after failover, or a metrics dashboard goes backward in time.
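The profile-name timeline above can be reproduced with a toy model. This is a minimal sketch, assuming a single primary that keeps an append-only log and a follower that applies the log only when asked; the `Primary`, `Replica`, and `catch_up` names are hypothetical and not taken from any real database.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    """Hypothetical primary: applies writes locally and records them in a log."""
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def write(self, key, value) -> None:
        self.data[key] = value
        self.log.append((key, value))


@dataclass
class Replica:
    """Hypothetical follower: only sees a write after catch_up() replays it."""
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries applied so far

    def catch_up(self, log: list) -> None:
        for key, value in log[self.applied:]:
            self.data[key] = value
        self.applied = len(log)


primary = Primary()
replica = Replica()

primary.write("name", "Ava")
replica.catch_up(primary.log)       # replica starts in sync

primary.write("name", "Ava Chen")   # t0-t1: commit succeeds on the primary
stale = replica.data["name"]        # t3: read served by the lagging replica
replica.catch_up(primary.log)       # t4: replication catches up
fresh = replica.data["name"]

print(stale, "->", fresh)
```

Nothing in the model is broken: the replica simply has not replayed the second log entry when the read arrives, which is exactly the window that read-your-writes is meant to close.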
The spectrum of consistency models
Consistency is not just strong vs eventual. It is a spectrum of guarantees. Stronger models make more anomalies impossible. Weaker models allow more freedom for replicas to answer locally.
| Model | Promise | Example user expectation | Typical cost |
|---|---|---|---|
| Strong / linearizable | After a write completes, every later read sees it, as if there is one copy | After buying the last ticket, nobody else can buy it | Coordination, higher write latency, less availability under partition |
| Sequential | Everyone sees operations in one shared order that preserves each client's own program order, but that order may not match real-time completion | All users agree on the order of chat operations, even if it is not wall-clock perfect | Still needs agreement on a global order, but without the strict real-time requirement |
| Causal | Cause-and-effect operations are seen in order; unrelated operations may differ | A reply never appears before the message it replies to | Track dependencies such as versions or vector clocks |
| Read-your-writes | A user sees their own completed writes on later reads | After editing my bio, I see the new bio when I refresh | Session stickiness, primary reads, or write-through cache |
| Monotonic reads | Once a user has seen a value, they do not later see an older value | A dashboard does not go from count 120 back to count 117 | Route users to replicas at least as fresh as the last one they read |
| Eventual | If writes stop, all replicas eventually converge | A like count or DNS update becomes correct after propagation | Handle stale reads and conflict resolution |
Strong or linearizable consistency is the easiest mental model: the distributed system behaves like one up-to-date machine. Eventual consistency is the weakest useful promise: replicas converge eventually, but reads in the meantime may be stale.
Concrete anomalies caused by lag
The best way to understand consistency models is to name the weird user experiences they prevent.
t0 replica A has inbox_count = 10
t1 replica B has inbox_count = 9 because it is behind
t2 user opens inbox, routed to A, sees 10
t3 user refreshes, routed to B, sees 9
The user appears to move backward in time.
Missing guarantee: monotonic reads

t0 Priya posts: "Database is down"
t1 Marco replies: "I restarted it"
t2 reply replicates to region EU before original post
t3 EU users see "I restarted it" with no parent message
Missing guarantee: causal consistency

t0 cart = []
t1 phone app adds "shoes" while offline
t2 laptop adds "socks" online
t3 replicas sync using last-write-wins
t4 final cart = ["shoes"] or ["socks"], but not both
The system converged, but the product result is wrong.
Fix: merge carts by item id, use CRDT/set semantics, or coordinate writes.
Real systems choose different answers. DNS is famously eventually consistent because propagation delay is acceptable. ZooKeeper and etcd provide strong coordination because locks and leader election need a single truth. Social feeds often tolerate stale ordering, but direct messages usually need stronger read-your-writes behavior for the sender.
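The shopping cart anomaly above is easy to see side by side with its fix. The following is a minimal sketch, assuming each cart version is a dict with an integer timestamp `ts` and a set of item ids; the `last_write_wins` and `merge_carts` helpers are hypothetical names for illustration.

```python
def last_write_wins(a: dict, b: dict) -> dict:
    """Keep the version with the larger timestamp; the other is discarded."""
    return a if a["ts"] >= b["ts"] else b


def merge_carts(a: dict, b: dict) -> dict:
    """Union by item id, so concurrent additions from both devices survive."""
    return {"ts": max(a["ts"], b["ts"]), "items": a["items"] | b["items"]}


# Assumed example data: the phone wrote first (offline), the laptop second.
phone = {"ts": 1, "items": {"shoes"}}
laptop = {"ts": 2, "items": {"socks"}}

lww = last_write_wins(phone, laptop)
merged = merge_carts(phone, laptop)

print(sorted(lww["items"]))
print(sorted(merged["items"]))
```

Both strategies converge, but only the union keeps both items. A plain union cannot express removals, so a real cart would also need something like tombstones or a set CRDT that tracks deletions.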
How systems implement stronger guarantees
Consistency models are implemented with coordination, routing, metadata, and conflict handling. The stronger the guarantee, the more the system must know before answering.
- Leader-based replication: send writes to a primary, then replicate to followers. Reads from the leader are fresher; follower reads are cheaper but may lag.
- Synchronous replication: wait for replicas to confirm before acknowledging a write. Stronger, but slower and less available when replicas are unreachable.
- Session stickiness: route a user back to a replica that has seen their previous writes or reads.
- Version metadata: attach timestamps, logical clocks, or vector clocks so replicas can detect ordering and conflicts.
- Application merges: resolve conflicts using product rules, such as merging shopping carts instead of picking one winner.
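To make the version-metadata idea concrete, here is a minimal sketch of comparing two vector clocks, assuming each clock is a dict from node id to counter and a missing node counts as 0. The `compare` and `merge` helper names are hypothetical and not tied to any particular library.

```python
def compare(a: dict, b: dict) -> str:
    """Compare two vector clocks (node id -> counter).

    Returns "before" if a happened before b, "after" if b happened
    before a, "equal" if they match, and "concurrent" otherwise.
    """
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def merge(a: dict, b: dict) -> dict:
    """Element-wise maximum: the clock of a replica that has seen both."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


post = {"priya": 1}                 # assumed clock of the original post
reply = {"priya": 1, "marco": 1}    # the reply has seen the post
other = {"sam": 1}                  # an unrelated write

print(compare(post, reply))
print(compare(reply, other))
```

A "concurrent" result is the signal that neither write saw the other, which is where an application merge or another conflict rule has to decide the outcome.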
on write(user_id, value):
    primary.commit(value)
    session[user_id].min_version = primary.version
on read(user_id):
    required = session[user_id].min_version
    replica = choose_replica_with_version_at_least(required)
    if no replica is fresh enough:
        read_from_primary()
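A runnable version of this pseudocode might look like the sketch below. It assumes a single stored value, a primary whose version is a simple counter, and session state kept in an in-memory dict; the `Store` and `Cluster` classes and the `replicate` method are hypothetical stand-ins for real replication.

```python
class Store:
    """Hypothetical node holding one value and the version it has applied."""

    def __init__(self, name: str):
        self.name = name
        self.version = 0
        self.value = None


class Cluster:
    def __init__(self, n_replicas: int = 2):
        self.primary = Store("primary")
        self.replicas = [Store(f"replica-{i}") for i in range(n_replicas)]
        self.min_version = {}  # session state: user_id -> lowest acceptable version

    def write(self, user_id: str, value) -> None:
        self.primary.version += 1
        self.primary.value = value
        # Remember the version this user must be able to see from now on.
        self.min_version[user_id] = self.primary.version

    def replicate(self, replica: Store) -> None:
        """Stand-in for asynchronous replication reaching one replica."""
        replica.version = self.primary.version
        replica.value = self.primary.value

    def read(self, user_id: str):
        required = self.min_version.get(user_id, 0)
        for replica in self.replicas:
            if replica.version >= required:
                return replica.name, replica.value
        # No replica is fresh enough, so fall back to the primary.
        return self.primary.name, self.primary.value


cluster = Cluster()
cluster.write("ava", "Ava Chen")
writer_read = cluster.read("ava")       # must see the new value
guest_read = cluster.read("guest")      # no session requirement
cluster.replicate(cluster.replicas[1])
later_read = cluster.read("ava")        # a caught-up replica now qualifies

print(writer_read, guest_read, later_read)
```

Only the writer is forced to the primary; a reader with no session requirement is still served by a replica, stale or not.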
The user pays extra latency only when replicas are behind.
Quorums: tuning consistency with R, W, and N
Some systems let you tune consistency using quorums. Store each item on N replicas. A write succeeds after W replicas acknowledge it. A read asks R replicas and chooses the newest version. If R + W > N, the read set and write set overlap on at least one replica, so a read can discover the latest acknowledged write.
N = 3 replicas: A, B, C
W = 2 replicas must accept a write
R = 2 replicas are read
Write x=7 reaches A and B.
Any read of 2 replicas must include at least one of A or B:
read A+B → sees 7
read A+C → sees 7
read B+C → sees 7
Because R + W = 4 > N = 3, the sets overlap.
| Configuration | Behavior | Trade-off |
|---|---|---|
| R=1, W=1, N=3 | Fast reads and writes, but stale reads are possible | Low latency, weak consistency |
| R=2, W=2, N=3 | Read and write quorums overlap | Stronger reads, more coordination |
| R=1, W=3, N=3 | Writes wait for all replicas; reads are fast | Slow writes, fast reads |
| R=3, W=1, N=3 | Writes are fast; reads check all replicas | Fast writes, slow reads |
Cassandra and Dynamo-style systems popularized this tunable approach. It is powerful, but it is not magic. Clocks can disagree, replicas can be down, hinted handoff and read repair are asynchronous, and conflict resolution still matters. This connects directly to the trade-offs in the CAP theorem.
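The overlap rule can be checked by brute force for small clusters. This is a minimal sketch that enumerates every possible write set and read set; the `quorums_always_overlap` and `quorum_read` helpers are hypothetical, and replies are assumed to be `(version, value)` pairs with server-assigned versions.

```python
from itertools import combinations


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """True if every read set of size r shares a replica with every write set of size w."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


def quorum_read(responses: list):
    """Return the value with the highest version among (version, value) replies."""
    return max(responses, key=lambda resp: resp[0])[1]


# The four configurations from the table above, all with N = 3.
results = {(r, w): quorums_always_overlap(3, r, w)
           for r, w in [(1, 1), (2, 2), (1, 3), (3, 1)]}
print(results)

# Write x=7 reached A and B; C still holds an older (assumed) version 0.
replies = {"A": (1, 7), "B": (1, 7), "C": (0, None)}
print(quorum_read([replies["A"], replies["C"]]))
```

The check only covers the set arithmetic. It says nothing about clock disagreement, unreachable replicas, or concurrent writes with equal versions, which is why the caveats above still apply.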
Edge cases and gotchas
- Strong reads from a replica may not be strong: if the replica lags, it can return old data unless the system checks freshness or routes to the leader.
- Last-write-wins can lose data: it converges, but it may discard a concurrent update that the product should have merged.
- Wall-clock timestamps are dangerous: clock skew can make an older write look newer. Logical clocks or server-assigned versions are safer for ordering.
- Indexes and search are replicas too: the database may be fresh while Elasticsearch, cache, or a materialized view is stale.
- Consistency can be per operation: a system might use strong consistency for username reservation and eventual consistency for follower counts.
Key takeaways
- A consistency model is the visibility promise for writes in a replicated system.
- Strong/linearizable consistency gives one-copy behavior but usually costs coordination, latency, and partition availability.
- Sequential, causal, read-your-writes, and monotonic reads are useful middle guarantees between strong and eventual.
- Replication lag causes concrete anomalies such as stale reads, time-travel reads, replies before parents, and lost updates.
- Quorums tune the trade-off: R + W > N creates overlap, but conflict resolution and failure behavior still matter.