CAP Theorem & Consistency Trade-offs

Our coffee shop chain has grown and now has locations across the world. Each location has its own database to keep things fast for local customers. But here's the problem: when Alice updates her loyalty points in Delhi, how quickly should London and New York know about it?
If Alice flies to London and tries to redeem points before the sync happens, what should the system do? Show her the old balance? Refuse to serve her until we confirm with Delhi? These aren't just technical questions. They're business decisions disguised as engineering problems.
This is where the CAP theorem comes in. It describes the fundamental trade-offs every distributed system must make.
What You Will Learn in This Lesson
- What the CAP theorem actually says (and what it doesn't)
- The three properties: Consistency, Availability, Partition Tolerance
- Why you must choose between CP and AP systems
- Consistency models: strong, eventual, and everything in between
- How to choose the right trade-off for your use case
- Real examples from companies like Amazon, Google, and Netflix
The CAP Theorem: The Triangle You Can't Complete
Let's first understand the three properties of the CAP theorem:
- Consistency (C): Every read receives the most recent write or an error. In other words, every read that begins after a write completes must return that write's value.
- Availability (A): Every non-failing node returns a response to every read and write request in a reasonable amount of time, without any guarantee that the response reflects the most recent write.
- Partition Tolerance (P): The system continues to operate despite an arbitrary number of messages being dropped by the network between nodes.
The CAP theorem says: a distributed system cannot guarantee all three properties at once. In particular, when a network partition happens, you must choose between consistency and availability.
In practice, network partitions DO happen (servers crash, networks fail, cables get cut). So you're really choosing between:
- CP: Consistency + Partition Tolerance (sacrifice availability)
- AP: Availability + Partition Tolerance (sacrifice consistency)
Example
Let's say our coffee shop has two database servers in New York and London that replicate to each other.
Alice has 100 loyalty points. She's in NYC. Bob is checking her account from London.
Normal operation: Alice spends 50 points in NYC. Server A updates to 50 points, replicates to Server B. Bob in London sees 50 points. Everyone agrees.
Now the network breaks:
Server A has 50 points. Server B still thinks it's 100 points. They can't sync.
Now Bob in London tries to check Alice's balance. What happens?
CP vs AP: The Real Choice
CP System: Choose Consistency
A CP system says: We will NOT give an answer unless we're sure it's correct.
What happens during the partition:
- Bob asks Server B for Alice's balance
- Server B realizes it can't confirm with Server A
- Server B refuses: Sorry, I can't help you right now. Try again later.
Pros: Bob never sees wrong data.
Cons: The system is unavailable during partitions.
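To make this concrete, here is a minimal Python sketch of a CP read. The `Replica` class, the `(value, version)` pairs, and the `Unavailable` error are illustrative stand-ins, not a real database API: the read succeeds only if a majority of replicas can be reached.

```python
class Unavailable(Exception):
    """Raised when the system refuses to answer rather than risk being wrong."""

class Replica:
    """Toy in-memory replica (illustrative, not a real database client).
    Values are stored as (value, version) pairs."""
    def __init__(self, data, reachable=True):
        self.data, self.reachable = data, reachable
    def get(self, key):
        if not self.reachable:
            raise ConnectionError("network partition")
        return self.data[key]

def cp_read(replicas, key):
    # Gather answers from every replica we can still reach.
    answers = []
    for replica in replicas:
        try:
            answers.append(replica.get(key))
        except ConnectionError:
            continue  # that replica is on the other side of the partition
    # Without a majority, a newer write may exist on the unreachable side,
    # so a CP system refuses to answer rather than risk staleness.
    if len(answers) <= len(replicas) // 2:
        raise Unavailable("cannot confirm the latest value, try again later")
    return max(answers, key=lambda a: a[1])[0]  # freshest version wins

# During the partition, Server B can only see itself: 1 of 2 is not a majority.
server_a = Replica({"alice": (50, 2)}, reachable=False)  # has the new balance
server_b = Replica({"alice": (100, 1)})                  # stale
cp_read([server_a, server_b], "alice")                   # raises Unavailable
```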
Real-world examples:
- Banking systems: You'd rather see "Service unavailable" than a wrong balance
- Inventory systems: Better to reject an order than oversell
AP System: Choose Availability
An AP system says: We will ALWAYS give an answer, even if it might be stale.
What happens during the partition:
- Bob asks Server B for Alice's balance
- Server B responds: 100 points (its last known value)
- This might be wrong, but Bob gets an answer
Pros: The system always responds.
Cons: Bob might see stale or incorrect data.
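The AP counterpart, reusing the toy `Replica` from the CP sketch above, answers from whatever local state it has, reachable peers or not:

```python
def ap_read(local_replica, key):
    # Answer from local state, partition or not. The client always gets
    # a response, but it may be stale.
    value, version = local_replica.data[key]
    return value

server_b = Replica({"alice": (100, 1)})  # hasn't seen Alice's NYC write yet
print(ap_read(server_b, "alice"))        # -> 100 (stale, but available)
```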
Real-world examples:
- Social media feeds: Showing a slightly stale feed is better than showing nothing
- DNS: Returns cached results even if the authoritative server is down
- Cassandra, DynamoDB: Designed for availability
Consistency Models: A Spectrum
Consistency isn't binary. There's a spectrum from strong to weak:
Strong Consistency
Every read returns the most recent write. Period.
```plaintext
Timeline:
T1: Alice writes balance = 50
T2: Write acknowledged
T3: Bob guaranteed to see 50
```
Example: The moment you update the menu board at one location, ALL locations instantly show the new menu. Instantaneous propagation is physically impossible, and even approximating it in a distributed system is expensive.
Cost: Slower writes (must wait for all replicas), unavailable during partitions.
Use when: Financial transactions, inventory counts, anything where stale data causes real harm.
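On the write path, strong consistency is commonly enforced with synchronous replication: don't acknowledge until every replica has applied the write. A sketch, reusing the toy `Replica` and `Unavailable` types from earlier:

```python
def strong_write(replicas, key, value, version):
    # All-or-nothing: every replica must apply the write before we ack.
    # One unreachable replica makes the write fail, which is exactly the
    # availability cost described above.
    if not all(replica.reachable for replica in replicas):
        raise Unavailable("a replica is unreachable; write rejected")
    for replica in replicas:
        replica.data[key] = (value, version)
    return "ack"  # only after this ack is Bob guaranteed to read 50
```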
Eventual Consistency
If you stop writing, eventually all reads will return the same value. But there's no guarantee how long "eventually" takes.
plaintextTimeline: T1: Alice writes "balance = 50" to Server A T2: Write acknowledged T3: Bob reads from Server B might see 100 (stale) T4: Bob reads from Server B might see 100 (still stale) T5: Replication catches up T6: Bob reads from Server B sees 50 (finally consistent)
Example: You update the menu at one location. Other locations will get the update soon but it might take a few minutes. Customers at other locations might order something that's no longer available.
Cost: Simpler, faster, more available. But applications must handle stale data.
Use when: Social media, analytics, caching, anything where slightly stale data is acceptable.
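Eventual consistency flips the write path: acknowledge after updating one replica and replicate in the background. A sketch of the lag window, where the `replication_log` queue is a stand-in for a real asynchronous replication channel:

```python
import queue

replication_log = queue.Queue()  # stand-in for an async replication channel

def eventual_write(primary, key, value, version):
    primary.data[key] = (value, version)        # apply locally
    replication_log.put((key, value, version))  # replicate in the background
    return "ack"                                # ack immediately

def apply_replication(replica):
    # Runs "eventually". Until it does, this replica serves stale reads.
    while not replication_log.empty():
        key, value, version = replication_log.get()
        replica.data[key] = (value, version)
```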
Read-Your-Writes Consistency
You always see your own writes, but might see stale data from others.
```plaintext
Timeline:
T1: Alice writes "balance = 50"
T2: Alice reads -> guaranteed to see 50
T3: Bob reads -> might see 50 or 100 (no guarantee)
```
Example: When you update your profile, you immediately see the change. But your friend looking at your profile might see the old version for a bit.
Use when: User profiles, preferences, documents you're editing.
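One common implementation, sketched below with the same versioned `(value, version)` convention as earlier: the client session remembers the version of its last write and only accepts replicas that have caught up to it.

```python
def session_read(session, replicas, key):
    # Accept only replicas that have caught up to this session's last write.
    min_version = session.get("last_written_version", 0)
    for replica in replicas:
        value, version = replica.data.get(key, (None, 0))
        if version >= min_version:
            return value  # at least as fresh as our own write
    raise Unavailable("no replica has caught up; retry or route to primary")

alice_session = {"last_written_version": 2}
stale = Replica({"balance": (100, 1)})
fresh = Replica({"balance": (50, 2)})
print(session_read(alice_session, [stale, fresh], "balance"))  # -> 50
```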
Causal Consistency
If event A caused event B, everyone sees A before B.
```plaintext
Timeline:
T1: Alice posts "I'm engaged!"
T2: Bob comments "Congratulations!"
T3: Carol sees the post first, then the comment (never the comment without the post)
```
Example: A reply to a message always appears after the original message. You never see answers before questions.
Use when: Social media, chat applications, collaborative editing.
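A minimal sketch of causal delivery, assuming each message carries the id of the message that caused it: hold a message back until its cause has been delivered.

```python
def deliver_in_causal_order(messages):
    """Yield messages so that no message appears before its cause.
    Each message is (id, caused_by, text); caused_by is None for roots."""
    delivered, pending = set(), list(messages)
    while pending:
        progress = False
        for msg in list(pending):
            msg_id, caused_by, text = msg
            if caused_by is None or caused_by in delivered:
                delivered.add(msg_id)
                pending.remove(msg)
                yield text
                progress = True
        if not progress:  # a cause never arrived; stop rather than reorder
            break

events = [(2, 1, "Congratulations!"),  # Bob's comment arrives first...
          (1, None, "I'm engaged!")]   # ...but Alice's post is its cause
print(list(deliver_in_causal_order(events)))
# -> ["I'm engaged!", "Congratulations!"]
```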
Making the Right Choice
The Decision Framework
| Question | If Yes | If No |
|---|---|---|
| Can users tolerate stale data? | AP / Eventual | CP / Strong |
| Is data loss acceptable? | AP | CP |
| Must all users see the same thing? | Strong consistency | Eventual is fine |
| Is availability critical (revenue loss if down)? | AP | CP acceptable |
| Are there legal/compliance requirements? | Usually CP | Depends |
By Use Case
| Use Case | Recommended | Why |
|---|---|---|
| Bank account balance | CP + Strong | Wrong balance = legal/financial issues |
| Shopping cart | AP + Eventual | Losing a cart item is annoying, not catastrophic |
| Inventory count | CP + Strong | Overselling creates real problems |
| Social media likes | AP + Eventual | 10,003 vs 10,004 likes doesn't matter |
| Flight booking | CP + Strong | Double-booking a seat is a real problem |
| User preferences | AP + Read-your-writes | User sees their changes, others can lag |
| Chat messages | AP + Causal | Messages must appear in order |
| Session storage | AP + Eventual | Session loss = re-login, annoying but okay |
Patterns for Handling Inconsistency
When you choose eventual consistency, you need patterns to handle the inconsistency window:
Pattern 1: Conflict Resolution
When two writes conflict, how do you resolve it?
Last-Write-Wins (LWW):
```plaintext
Server A: balance = 50 (timestamp: 10:00:01)
Server B: balance = 75 (timestamp: 10:00:02)
Resolution: balance = 75 (later timestamp wins)
```
Simple but can lose data.
Application-Level Merge:
```plaintext
Server A: cart = ["coffee"]
Server B: cart = ["tea"]
Resolution: cart = ["coffee", "tea"] (merge both)
```
More complex but preserves both changes.
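Both strategies fit in a few lines. A sketch, where the timestamps and cart contents mirror the examples above:

```python
def last_write_wins(a, b):
    """Resolve by timestamp. Simple, but the losing write is silently dropped."""
    (value_a, ts_a), (value_b, ts_b) = a, b
    return (value_a, ts_a) if ts_a >= ts_b else (value_b, ts_b)

def merge_carts(cart_a, cart_b):
    """Application-level merge: keep items from both sides.
    (This is essentially how a grow-only set CRDT behaves.)"""
    return sorted(set(cart_a) | set(cart_b))

print(last_write_wins((50, "10:00:01"), (75, "10:00:02")))  # -> (75, '10:00:02')
print(merge_carts(["coffee"], ["tea"]))                     # -> ['coffee', 'tea']
```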
Pattern 2: Read Repair
When reading reveals inconsistency, fix it on the spot:
```plaintext
Client reads from 3 replicas:
Replica 1: balance = 50
Replica 2: balance = 50
Replica 3: balance = 100 (stale!)
System returns 50 AND updates Replica 3 to 50.
```
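In code, read repair is a side effect of the read path. A sketch, reusing the `(value, version)` convention from earlier:

```python
def read_with_repair(replicas, key):
    # Ask every replica and find the freshest answer.
    answers = [(replica, *replica.data[key]) for replica in replicas]
    _, fresh_value, fresh_version = max(answers, key=lambda a: a[2])
    # Fix any stale replica on the spot, as a side effect of the read.
    for replica, _, version in answers:
        if version < fresh_version:
            replica.data[key] = (fresh_value, fresh_version)
    return fresh_value
```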
Pattern 3: Anti-Entropy
Background process that continuously compares replicas and fixes differences.
```plaintext
Every 5 minutes:
  Compare Replica 1 hash with Replica 2 hash
  If different, sync the differing records
```
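A sketch of that background loop. Real systems such as Cassandra compare Merkle trees rather than hashing a whole replica at once, so this is deliberately simplified:

```python
import hashlib
import json
import time

def snapshot_hash(replica):
    # Hash a canonical serialization of the replica's data.
    payload = json.dumps(replica.data, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def anti_entropy(replica_a, replica_b, interval_seconds=300):
    # Background daemon: compare hashes, sync differing records.
    while True:
        if snapshot_hash(replica_a) != snapshot_hash(replica_b):
            for key in replica_a.data.keys() | replica_b.data.keys():
                a = replica_a.data.get(key, (None, 0))
                b = replica_b.data.get(key, (None, 0))
                winner = a if a[1] >= b[1] else b  # newer version wins per key
                replica_a.data[key] = winner
                replica_b.data[key] = winner
        time.sleep(interval_seconds)
```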
Pattern 4: Quorum Reads/Writes
Instead of requiring all replicas, require a majority:
```plaintext
3 replicas total (N = 3)
Write to at least 2 (W = 2)
Read from at least 2 (R = 2)
W + R > N means reads always see the latest write: 2 + 2 > 3
```
This gives you tunable consistency. More replicas required = stronger consistency but higher latency.
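A sketch with N = 3, W = 2, R = 2, reusing the toy `Replica` and `Unavailable` types from earlier. Because W + R > N, any read quorum overlaps any write quorum in at least one replica, so the read always sees the latest committed version:

```python
def quorum_write(replicas, key, value, version, w=2):
    acks = 0
    for replica in replicas:
        if replica.reachable:
            replica.data[key] = (value, version)
            acks += 1
    if acks < w:
        raise Unavailable(f"only {acks} acks, need at least {w}")

def quorum_read(replicas, key, r=2):
    answers = [replica.data[key] for replica in replicas if replica.reachable]
    if len(answers) < r:
        raise Unavailable(f"only {len(answers)} replies, need at least {r}")
    # W + R > N guarantees at least one answer holds the latest write.
    return max(answers, key=lambda a: a[1])[0]
```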
Real-World Examples
Google Spanner: Global Strong Consistency
Spanner uses GPS clocks and atomic clocks to achieve global strong consistency with reasonable performance.
How: TrueTime API tells you the uncertainty in the current time. Transactions wait out the uncertainty before committing.
Trade-off: More expensive, higher latency than eventually consistent systems, but globally consistent.
Use case: Google's advertising system. When you set a budget, it must be consistently enforced across all data centers immediately.
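The core trick, commit-wait, can be sketched in a few lines. This is an illustration of the idea, not Spanner's actual API, and the 7 ms uncertainty bound is an assumed figure: TrueTime returns an interval [earliest, latest] rather than a single timestamp, and a transaction waits until its commit timestamp is definitely in the past everywhere.

```python
import time

CLOCK_UNCERTAINTY = 0.007  # assumed +/- 7 ms bound, for illustration

def truetime_now():
    """Return an interval [earliest, latest] bounding the true time."""
    t = time.time()
    return t - CLOCK_UNCERTAINTY, t + CLOCK_UNCERTAINTY

def commit(transaction):
    # Assign the latest possible current time as the commit timestamp...
    _, commit_ts = truetime_now()
    # ...then wait out the uncertainty: once earliest > commit_ts, no node
    # anywhere can believe the commit happened in its future.
    while truetime_now()[0] <= commit_ts:
        time.sleep(CLOCK_UNCERTAINTY / 2)
    transaction["committed_at"] = commit_ts
    return transaction

commit({"op": "set budget = 100"})  # blocks for roughly 2x the uncertainty
```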
Cassandra: Eventual Consistency by Default
Cassandra is designed for availability. You configure consistency per query:
```sql
-- In modern CQL, the consistency level is set per session/statement
-- (the old USING CONSISTENCY clause was removed in CQL 3). In cqlsh:

CONSISTENCY ONE;     -- write to one replica (fast, less durable)
INSERT INTO users (...) VALUES (...);

CONSISTENCY QUORUM;  -- write to a majority (slower, more durable)
INSERT INTO users (...) VALUES (...);

CONSISTENCY ALL;     -- write to all replicas (slowest, most durable)
INSERT INTO users (...) VALUES (...);
```
Common Misconceptions
CAP means pick 2 of 3
Not quite. Network partitions WILL happen. You don't get to choose partition tolerance. The real choice is: during a partition, do you sacrifice consistency or availability?
When there's NO partition, you can have both consistency and availability.
Eventual consistency means inconsistent
Eventual consistency still converges to a consistent state. It just doesn't guarantee WHEN. For many systems, convergence within a few hundred milliseconds is technically "eventual" but practically indistinguishable from strong consistency.
Strong consistency is always better
Strong consistency has real costs: higher latency, lower availability, more expensive infrastructure. If your use case tolerates eventual consistency, you're paying extra for nothing.
This doesn't apply to me, I use one database
Even a single-database deployment makes these trade-offs as soon as replication enters the picture. PostgreSQL with synchronous replication behaves like a CP system; with asynchronous replication it behaves like an AP one. Read replicas introduce eventual consistency.
Key Takeaways
The CAP theorem states that during a network partition, you must choose between consistency and availability.
CP systems refuse to answer if they can't guarantee correctness. Use for financial data, inventory, bookings.
AP systems always answer, even with potentially stale data. Use for social feeds, caching, analytics.
Consistency is a spectrum:
- Strong: all reads see latest write
- Eventual: reads converge over time
- Read-your-writes: you see your own changes
- Causal: cause appears before effect
Choose based on business impact:
- What's the cost of showing stale data?
- What's the cost of being unavailable?
Most systems mix consistency levels. Bank balance = strong. Notification count = eventual.
What's Next
We've been talking about synchronous communication between systems. But what happens when you need to decouple components? When Service A shouldn't wait for Service B? That's where Message Queues & Event Streaming come in. They're the backbone of scalable, resilient architectures.