โš ๏ธ This guide is AI-generated and may contain inaccuracies. Always verify against authoritative sources and real-world documentation.

Architecture Diagram: Cache-Aside Pattern

[Architecture diagram: client → app server (business logic + cache client) → Redis cache (in-memory, μs latency) and database (PostgreSQL/MySQL). Cache-aside flow: ① app checks the cache; ② on a miss, query the DB; ③ the DB returns the result; ④ the app populates the cache.]

How It Works

The application sits between the cache and the database. On every read, it first checks the cache. On a hit, data is returned in microseconds. On a miss, the app queries the database, stores the result in the cache, and returns it to the client.
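The read path above can be sketched in a few lines of Python. This is a minimal illustration, not a production client: a plain dict stands in for Redis, and `db_query` is a hypothetical stand-in for the real database call.

```python
# Minimal cache-aside read path. A plain dict stands in for Redis;
# db_query is a stand-in for an expensive database lookup.
cache = {}

def db_query(user_id):
    # Pretend this is a slow SQL lookup (~ms).
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    value = cache.get(key)      # 1. check cache
    if value is not None:
        return value            # HIT: return immediately
    value = db_query(user_id)   # 2. MISS: query the database
    cache[key] = value          # 3-4. populate cache, then return
    return value
```

The second call for the same key never touches `db_query`; that is the entire point of the pattern.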

Caching Strategies

Cache-Aside (Lazy Loading)

App manages the cache explicitly. Read: check cache → miss → read DB → write cache. Write: update DB → invalidate cache. Most common strategy. Risk: stale data if invalidation fails.
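The write path deserves its own sketch, under the same stand-in assumptions (dicts for the cache and DB, hypothetical `update_user`). Note that the cached entry is deleted, not updated; the next read repopulates it from the source of truth.

```python
# Cache-aside write path: update the source of truth first,
# then invalidate (not update) the cached copy.
cache = {"user:7": {"id": 7, "name": "old"}}
db = {7: {"id": 7, "name": "old"}}

def update_user(user_id, fields):
    db[user_id].update(fields)           # 1. write to the database
    cache.pop(f"user:{user_id}", None)   # 2. invalidate; next read repopulates
```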

Write-Through

Every write goes to cache AND database synchronously. Guarantees cache consistency. Downside: write latency increases (two writes per operation). Good for read-heavy workloads.
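The two synchronous writes can be made concrete with a toy sketch; `cache` and `db` are again dict stand-ins for the real stores.

```python
# Write-through: every write hits cache AND database synchronously,
# so the cache never holds a value the database doesn't have.
cache, db = {}, {}

def write_through(key, value):
    db[key] = value      # write 1: the durable store
    cache[key] = value   # write 2: the cache stays consistent

def read(key):
    return cache.get(key, db.get(key))
```

Each `write_through` call pays for both writes, which is exactly the latency cost the text describes.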

Write-Behind (Write-Back)

Write to cache immediately, async flush to DB. Ultra-low write latency. Risk: data loss if cache node dies before flush. Used in hardware (CPU caches) and disk controllers.
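The deferred flush, and the data-loss window it creates, can be shown with a queue of dirty keys. This is a single-threaded illustration (a real write-back cache would flush asynchronously on a timer or on pressure).

```python
# Write-behind: acknowledge after the cache write, flush to the DB later.
# Any keys still in `dirty` when the process dies are lost.
from collections import deque

cache, db = {}, {}
dirty = deque()          # keys written but not yet persisted

def write_behind(key, value):
    cache[key] = value
    dirty.append(key)    # returns immediately; the DB write is deferred

def flush():
    while dirty:
        key = dirty.popleft()
        db[key] = cache[key]   # the deferred, batchable DB write
```

Between `write_behind` and `flush`, the database is stale; that gap is the durability risk.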

Read-Through

Cache itself handles DB reads on a miss (cache sits in front of DB). Simplifies app code โ€” it only talks to cache. Requires cache library support (e.g., NCache, Hazelcast).

Eviction Policies

  • LRU (Least Recently Used): Evict the item accessed longest ago. Redis offers approximated LRU via allkeys-lru (its out-of-the-box policy, noeviction, rejects writes instead of evicting). Good general-purpose choice.
  • LFU (Least Frequently Used): Evict the item accessed fewest times. Better for skewed access patterns; available in Redis as allkeys-lfu.
  • TTL (Time to Live): Expire entries after a fixed duration. Strictly an expiration policy rather than eviction, but simple, and it bounds staleness.
  • Random: Surprisingly close to LRU performance with zero tracking overhead.
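LRU is simple enough to implement from scratch, which interviewers sometimes ask for. A compact sketch using Python's OrderedDict (a common approach, not the only one):

```python
# LRU eviction in a few lines: OrderedDict remembers insertion order,
# and move_to_end marks an entry as most recently used.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # touched: now most recent
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used
```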

Cache Stampede Prevention

When a popular cache key expires, hundreds of requests simultaneously hit the DB. Solutions:

  • Locking: Only one request fetches from DB; others wait for the cache to be repopulated.
  • Early expiration (jitter): Randomly refresh before the TTL expires. Each key gets TTL ± a random offset.
  • Stale-while-revalidate: Serve stale data while one background thread refreshes.

Key Design Decisions

🧩 Local cache vs distributed cache: Local (in-process) cache is fastest (~ns) but not shared across instances, which leads to inconsistency. Distributed cache (Redis, Memcached) adds a ~1 ms network hop but is shared by all instances. Most systems use both: L1 local + L2 distributed.
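The L1 + L2 combination is just a layered lookup with backfill on the way up. A dict-based sketch (`l1` is the in-process cache, `l2` stands in for Redis, `db` for the database):

```python
# Two-tier lookup: check the in-process L1 first (~ns), then the shared
# L2 (~1 ms network hop), then the database; backfill each layer.
l1, l2, db = {}, {}, {"k": "v"}

def get(key):
    if key in l1:
        return l1[key]        # fastest path, per-instance
    if key in l2:
        l1[key] = l2[key]     # backfill L1 from the shared cache
        return l1[key]
    value = db[key]           # last resort: the database
    l2[key] = value           # populate shared L2 for other instances
    l1[key] = value           # and the local L1
    return value
```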

โฑ๏ธ

TTL tuning: Too short = high miss rate, defeating the purpose. Too long = stale data. Match TTL to how often the underlying data changes. User profiles: minutes. Stock prices: seconds. Static assets: hours/days.

🗑️ Invalidation vs TTL: Active invalidation (delete on write) gives freshness but adds complexity. TTL is simpler but allows staleness windows. As Phil Karlton put it: "There are only two hard things in Computer Science: cache invalidation and naming things."

💾 Redis vs Memcached: Redis offers data structures (lists, sets, sorted sets), persistence, pub/sub, and Lua scripting. Memcached is simpler, multi-threaded, and slightly faster for plain key-value workloads. Redis wins for 90% of use cases.

When to Use

Caching is relevant in almost every system design interview. Mention it whenever you see:

  • Read-heavy workloads: "Design a news feed" → cache pre-computed feeds
  • Expensive computations: "Design a search engine" → cache query results
  • Hot data: "Design a URL shortener" → cache popular short URLs
  • Rate limiting โ€” Redis as a counter store with TTL
  • Session storage โ€” User sessions in Redis instead of server memory
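The rate-limiting bullet maps to the classic Redis INCR-plus-EXPIRE pattern. A fixed-window sketch simulated with a dict (in real Redis, the window key would simply expire via EXPIRE instead of accumulating):

```python
# Fixed-window rate limiter in the style of Redis INCR + EXPIRE,
# simulated with a dict keyed by (client, window number).
import time

counters = {}

def allow(client_id, limit=5, window_s=60, now=None):
    now = time.time() if now is None else now
    window = int(now // window_s)             # which fixed window we're in
    key = (client_id, window)                 # in Redis: INCR key; EXPIRE key window_s
    counters[key] = counters.get(key, 0) + 1
    return counters[key] <= limit
```

The `now` parameter exists only to make the sketch deterministic for testing; production code would use the clock directly.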

Interview signal: The interviewer wants to hear you discuss what to cache, where to cache it, when to invalidate, and how to handle failures (cache down ≠ system down).

Real-World Examples

  • Facebook (TAO): Custom distributed cache for the social graph. Billions of reads/sec with ~1 ms p99. Write-through to MySQL.
  • Twitter: Redis for timeline caching. Fan-out on write: precompute home timelines into per-user caches.
  • Netflix: EVCache (Memcached-based) for session data, personalization, and video metadata. 30M+ ops/sec.
  • Stack Overflow: Serves 1B+ page views/month with only 9 web servers, thanks to aggressive Redis + in-memory caching.

Back-of-Envelope Numbers

| Metric | Value |
| --- | --- |
| Redis GET latency | ~0.1–0.5 ms |
| Redis throughput (single node) | ~100K–200K ops/sec |
| Memcached throughput | ~200K–700K ops/sec |
| L1 CPU cache access | ~1 ns |
| In-process cache (HashMap) | ~50–100 ns |
| Redis (same AZ, network) | ~0.1–1 ms |
| Database query (indexed) | ~1–10 ms |
| Database query (full scan) | ~100–1000 ms |
| Typical cache hit ratio (healthy) | 95–99% |
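These numbers combine into a useful back-of-envelope formula: effective read latency is the hit-ratio-weighted average of cache and DB latency. A quick check using mid-range values from the table:

```python
# Effective read latency = hit_ratio * cache_latency + miss_ratio * db_latency.
def effective_latency_ms(hit_ratio, cache_ms, db_ms):
    return hit_ratio * cache_ms + (1 - hit_ratio) * db_ms

# 99% hits at 0.5 ms Redis reads vs 5 ms indexed DB queries:
fast = effective_latency_ms(0.99, 0.5, 5.0)   # 0.545 ms
# Dropping to a 90% hit ratio nearly doubles the average:
slow = effective_latency_ms(0.90, 0.5, 5.0)   # 0.95 ms
```

This is why the healthy-hit-ratio row matters: each point of hit ratio lost adds a full miss-latency's worth of weight to the average.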