A distributed system can provide at most 2 of 3 guarantees: Consistency, Availability, and Partition tolerance. Since partitions are inevitable, the real choice is CP vs AP, made per operation rather than per system.
The CAP theorem (Brewer, 2000) states that when a network partition occurs, a distributed system must choose between consistency and availability. In practice, consistency and availability exist on a spectrum; it's not a binary toggle.
CAP only covers the partition case. PACELC extends it: during a Partition, choose A or C; Else (normal operation), choose Latency or Consistency. This better reflects real-world tradeoffs: even without partitions, strong consistency costs latency.
Payments (CP) vs Timeline (AP): The real choice isn't "CP or AP for the whole system"; it's per operation. Payments = CP with Raft consensus (a wrong balance is catastrophic). Activity feed = AP with eventual consistency (data that is stale for 2s is fine). Good systems mix both.
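One way to picture the per-operation choice is a small policy table consulted on each request. The sketch below is illustrative only: the operation names, the `Mode` enum, and the `handle` function are hypothetical, and a real system would enforce the policy inside its storage layer instead of returning strings.

```python
from enum import Enum


class Mode(Enum):
    CP = "reject the request if a quorum is unreachable"
    AP = "serve from any reachable replica, reconcile later"


# Hypothetical operation names; the CP/AP assignments follow the text above.
POLICY = {
    "payments.debit": Mode.CP,
    "feed.read": Mode.AP,
}


def handle(operation: str, quorum_reachable: bool) -> str:
    """Decide how to answer a request, given whether a quorum can be reached."""
    mode = POLICY[operation]
    if mode is Mode.CP and not quorum_reachable:
        return "error: unavailable"  # consistency preferred over availability
    if mode is Mode.AP and not quorum_reachable:
        return "ok: possibly stale"  # availability preferred over consistency
    return "ok"
```

During a partition, `handle("payments.debit", False)` refuses the request while `handle("feed.read", False)` still answers, which is the mix of CP and AP within one system that the paragraph describes.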
Single-region vs Multi-region: Single region avoids most partition issues (the network within a DC is comparatively reliable). Multi-region introduces real partitions, and cross-region consensus adds ~100-200ms latency. This is why most teams start single-region.
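The cross-region latency figure is largely a propagation-delay floor. A rough back-of-the-envelope sketch, assuming light in optical fiber travels at about 200,000 km/s (an approximation) and using hypothetical path lengths:

```python
# Assumption: light in fiber covers roughly 200,000 km/s, i.e. 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on one network round trip over a fiber path of this length."""
    return 2 * distance_km / FIBER_KM_PER_MS


# Hypothetical path lengths; real fiber routes are longer than straight-line distance.
for label, km in [("same metro", 50), ("cross-continent", 4_000), ("intercontinental", 10_000)]:
    print(f"{label:>16}: >= {min_round_trip_ms(km):.1f} ms per round trip")
```

A 10,000 km path has a floor of 100 ms per round trip before any queuing or processing. A consensus write needs at least one round trip from the leader to a majority of replicas, so it cannot beat this bound when the majority spans regions.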
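Conflict resolution strategy: In AP systems, concurrent writes can conflict. The options are last-write-wins (simple, but silently discards the losing write), merge/CRDT (complex, preserves all data), and application-level resolution (most correct, most work). Amazon's Dynamo shopping cart merged divergent versions at the application level rather than using LWW: deleted items can occasionally resurface, but added items are never lost.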
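The difference between last-write-wins and a merge is easiest to see on two conflicting versions of the same cart. This is a minimal sketch, not any particular database's implementation; the `Version` type, the timestamps, and the cart contents are made up for illustration, and `merge_union` is the simplest possible merge (a grow-only set union).

```python
from typing import NamedTuple


class Version(NamedTuple):
    timestamp: float  # wall-clock write time reported by the replica
    items: frozenset  # cart contents in this version


def last_write_wins(a: Version, b: Version) -> Version:
    """Keep the version with the newer timestamp; the other write is discarded."""
    return a if a.timestamp >= b.timestamp else b


def merge_union(a: Version, b: Version) -> Version:
    """Keep every item from both versions (a grow-only set merge)."""
    return Version(max(a.timestamp, b.timestamp), a.items | b.items)


# Two replicas accept concurrent writes to the same cart during a partition.
replica_1 = Version(100.0, frozenset({"book", "pen"}))
replica_2 = Version(101.0, frozenset({"book", "lamp"}))

print(sorted(last_write_wins(replica_1, replica_2).items))  # "pen" is lost
print(sorted(merge_union(replica_1, replica_2).items))      # all three items survive
```

The union merge never drops an added item, but it also cannot express a removal: an item deleted on one replica reappears if the other replica still holds it. Handling removals correctly is where more elaborate CRDTs or application-level logic come in.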
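Quorum tuning: In leaderless systems (Cassandra), you control the tradeoff with the write quorum W and read quorum R over N replicas. With N=3: W=3, R=1 gives slow, fully replicated writes and fast reads; W=1, R=3 gives fast writes and slow reads that consult every replica. When W + R > N, every read quorum overlaps every write quorum, so a read sees the latest acknowledged write. That overlap is necessary for strong consistency but does not by itself guarantee linearizability, since concurrent or partially failed writes can still cause anomalies. Tune per query pattern.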
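The quorum arithmetic can be checked mechanically. The sketch below assumes N=3 replicas and uses hypothetical helper names; it models only the counting argument, not any real database's behavior.

```python
def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True when every read quorum shares at least one replica with every write quorum."""
    return w + r > n


def tolerated_failures(n: int, w: int, r: int) -> dict:
    """How many replicas can be down before writes or reads start failing."""
    return {"write": n - w, "read": n - r}


N = 3  # assumed replication factor
for w, r in [(3, 1), (1, 3), (2, 2), (1, 1)]:
    print(f"W={w} R={r} overlap={quorums_overlap(N, w, r)} tolerates={tolerated_failures(N, w, r)}")
```

The availability cost shows up in `tolerated_failures`: W=3 cannot accept writes with even one replica down, while W=2, R=2 keeps the overlap and still tolerates one failure on both paths. W=1, R=1 is the fastest setting but loses the overlap, so reads may return stale data.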
CAP comes up in every distributed system design. The interviewer wants to see you map specific operations to consistency levels.
Interview signal: Don't just say "I'd use CP." Map specific operations: "payment ledger is CP with Raft consensus, user's activity feed is AP with eventual consistency, shopping cart is AP with merge-based conflict resolution."
CONSISTENCY QUORUM for strong reads, CONSISTENCY ONE for fast reads. Same cluster, different guarantees.

| Metric | Value |
|---|---|
| Strong consistency overhead per write (Raft) | +5-10ms (within region) |
| Cross-region consensus latency | ~100-200ms (speed of light) |
| Eventual consistency convergence (p99) | <3 seconds typical |
| Network partition frequency | ~2-3 per year per region |
| Stripe API requests/day | ~1B+ (strong consistency for payments) |
| DynamoDB eventually consistent read | ~1ms (half the cost of strong) |
| Spanner global write latency | ~100ms (TrueTime consensus) |