The CAP Theorem (and PACELC)
Why a distributed system can't have it all when the network breaks — explained plainly.
The CAP theorem explains a hard choice in replicated distributed systems. When the network splits and replicas cannot talk to each other, the system must choose between Consistency and Availability. It cannot guarantee both for the same data at the same time during that partition.
The problem: replicas can lose contact
Replication is how systems scale reads, survive machine failures, and place data near users. But replicas communicate over networks, and networks sometimes drop, delay, duplicate, or reorder messages. A network partition is a failure where some nodes can still run but cannot communicate with other nodes.
Before partition:
client writes x = 2 to Node A
Node A replicates x = 2 to Node B
reads from A or B return 2
During partition:
Node A receives write x = 3
A cannot send the update to B
write x=3
client ─────────▶ Node A ✂ network partition ✂ Node B ─────────▶ client
x = 3 x = 2
Now a read on B cannot both be available and guaranteed latest.The failure mode of ignoring CAP is promising impossible behavior: "all replicas are always up, every request always succeeds, and every read always sees the latest write." In a partitioned system, that sentence contradicts itself.
The three properties
CAP uses precise meanings. In system design interviews, define the words before applying them.
| Property | Meaning in CAP | What users observe |
|---|---|---|
| Consistency (C) | Every read sees the latest successful write, as if there is one copy of the data | No stale reads; all replicas agree before answering |
| Availability (A) | Every request to a non-failing node receives a non-error response | The service keeps answering even if some nodes cannot coordinate |
| Partition tolerance (P) | The system continues operating despite lost or delayed messages between nodes | The design acknowledges that network splits happen |
Partition tolerance is not a feature toggle
In a real distributed system, you do not get to choose whether the network can fail. It can. That is why the common slogan "pick two of three" is misleading. The meaningful choice is what you do when P happens: favor C or favor A.
CP vs AP during a partition
During a partition, a replicated data system usually behaves like a CP system or an AP system for a given operation.
| Mode | Partition behavior | Good fit | Examples |
|---|---|---|---|
| CP | Preserve consistency by rejecting, blocking, or redirecting requests that cannot be safely coordinated | Locks, metadata, bank ledgers, inventory reservations, leader election | ZooKeeper, etcd, HBase, many strongly consistent SQL configurations |
| AP | Preserve availability by accepting local reads/writes and reconciling conflicts later | Feeds, likes, carts, presence, metrics, DNS-style data | Cassandra, Amazon Dynamo-style systems, Riak, many DynamoDB single-region eventually consistent reads |
partition occurs
write arrives at minority replica
replica cannot reach quorum/leader
replica returns error or timeout
Result:
availability is reduced
accepted writes remain consistentpartition occurs
write arrives at isolated replica
replica accepts write locally
later, partition heals
system reconciles versions using timestamps, vector clocks, CRDTs, or application logic
Result:
availability is preserved
readers may see stale or conflicting values temporarilyReal systems can mix choices. A shopping app might make payment authorization CP, product recommendations AP, and inventory reservation somewhere in between. CAP is not a label for the entire company; it is a way to reason about a data operation under partition.
The common misconception: do not say pick two always
The beginner version of CAP says you can pick any two of consistency, availability, and partition tolerance. That wording is memorable but wrong enough to cause bad designs.
- You cannot simply "pick CA" for a distributed system and ignore partitions. If nodes communicate over a network, partitions are part of reality.
- CAP does not say a CP system is always unavailable. It says availability may be sacrificed for affected operations during a partition.
- CAP does not say an AP system is always inconsistent. It may be perfectly consistent during normal operation and only allow divergence when coordination is impossible or too expensive.
- CAP is about a specific consistency guarantee. There are many weaker models such as causal, read-your-writes, monotonic reads, and eventual consistency. Those are covered in consistency models.
PACELC: Else, latency vs consistency
CAP focuses on partitions, but partitions are rare compared with normal operation. PACELC extends the reasoning: if there is a Partition, choose Availability or Consistency; Else, when the network is healthy, choose Latency or Consistency.
P A / E L = if Partition, choose Availability; Else choose Latency
P C / E C = if Partition, choose Consistency; Else choose Consistency
The first half is the failure choice.
The second half is the everyday performance choice.| PACELC style | Normal operation | Partition behavior | Typical shape |
|---|---|---|---|
| PA/EL | Favor low latency reads/writes | Keep serving if replicas split | Dynamo-style, Cassandra-style workloads |
| PC/EC | Coordinate for stronger consistency | Block unsafe operations | ZooKeeper/etcd-style coordination, strongly consistent databases |
| PC/EL | Low latency when healthy, consistency during partition | May use local fast paths but require quorum under failure | Some tunable quorum systems by configuration |
The PACELC lens is often more useful than CAP in day-to-day design. A globally replicated database can keep replicas consistent across continents, but every write may wait for cross-region coordination. That may be correct for a financial ledger and unacceptable for a social reaction counter.
Edge cases and gotchas
- Partial partitions: not every node is split from every other node. Some links fail while others work, creating asymmetric behavior that is hard to test.
- Timeouts are ambiguous: a timeout does not prove the other node failed. It may be slow, overloaded, or partitioned.
- Conflict resolution is product logic: last-write-wins may be fine for a display name but dangerous for a bank balance.
- Client retries can amplify conflicts: an AP write that times out may have succeeded locally. Retrying without idempotency can create duplicates.
- CAP is not capacity planning: it describes a correctness trade-off under network failure, not CPU, memory, or disk throughput.
- CAP says that during a network partition, a replicated system must choose consistency or availability for affected operations.
- Partition tolerance is not optional in real distributed systems; the practical choice is CP vs AP when coordination breaks.
- CP systems such as ZooKeeper, etcd, and HBase reject or block unsafe operations to preserve correctness.
- AP systems such as Cassandra and Dynamo-style stores keep serving and reconcile stale or conflicting data later.
- PACELC adds the normal-case trade-off: even without partitions, stronger consistency often costs latency.
Mark it complete to track your progress through the workbook.