⚠️ This guide is AI-generated and may contain inaccuracies. Always verify against authoritative sources and real-world documentation.

Architecture Diagram

[Diagram: Event (like / comment) → Notification Service (rate limiting, dedup, priority) → User Prefs (channel, quiet hours) → delivery channels: Push (APNs for iOS, FCM for Android), Email (SMTP / SES / SendGrid), SMS (Twilio / SNS), In-App (WebSocket / SSE) → User (multi-channel). Aggregation example: "X and 999 others liked".]

How It Works

A notification system pushes data to users across multiple channels without them explicitly requesting it. The core challenges are fan-out at scale, multi-channel delivery, and respecting user preferences.

Real-Time Delivery Techniques

WebSocket / SSE (In-App)

Persistent bidirectional connection (WebSocket) or one-way server-to-client stream (SSE). Instant delivery while the app is open. Each connection costs memory and a file descriptor on the server, so holding around 1M connections typically calls for dedicated connection servers.

Push Notifications (Mobile)

OS-level channel via APNs (iOS) or FCM (Android). Delivered even when the app is closed. Delivery is best-effort, not guaranteed. Use the push as a hint; the client fetches the actual data via API.

Notification Pipeline

  1. Event triggers: User action (like, comment, follow) generates an event, published to a message queue (Kafka).
  2. Notification service: Consumes events, applies deduplication, rate limiting, and priority rules.
  3. User preferences filter: Checks per-user settings: which channels enabled, quiet hours, notification categories.
  4. Channel fan-out: Routes to appropriate delivery services: push, email, SMS, in-app (WebSocket).
  5. Delivery + tracking: Each channel delivers independently. Track delivery status, opens, and failures.
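
Steps 2 to 4 of the pipeline can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the `Event` and `Prefs` types, the `route` function, and the default quiet hours of 22:00 to 07:00 are all hypothetical names and values chosen for the example, and rate limiting is left out to keep it short.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    event_id: str    # unique per user action; used for deduplication
    recipient: str
    category: str    # e.g. "like", "comment", "follow"


@dataclass
class Prefs:
    channels: set = field(default_factory=lambda: {"push", "in_app"})
    muted_categories: set = field(default_factory=set)
    quiet_hours: tuple = (22, 7)   # local hours [start, end); may wrap past midnight


def in_quiet_hours(hour: int, quiet: tuple) -> bool:
    start, end = quiet
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end   # window wraps past midnight


def route(event: Event, prefs: Prefs, hour: int, seen: set) -> list:
    """Return the channels this event should be delivered on."""
    if event.event_id in seen:                     # step 2: deduplication
        return []
    seen.add(event.event_id)
    if event.category in prefs.muted_categories:   # step 3: category filter
        return []
    channels = set(prefs.channels)
    if in_quiet_hours(hour, prefs.quiet_hours):
        # Assumption: quiet hours silence only the interruptive channels.
        channels -= {"push", "sms"}
    return sorted(channels)                        # step 4: fan-out targets


if __name__ == "__main__":
    seen = set()
    print(route(Event("e1", "alice", "like"), Prefs(), hour=12, seen=seen))
    print(route(Event("e2", "alice", "like"), Prefs(), hour=23, seen=seen))
```

In a real deployment the `seen` set would likely live in a shared store with a TTL rather than in process memory, since several consumers read from the same queue.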

Fan-Out for Celebrity Events

When a post gets 1M likes, you don't send 1M push notifications to the author. Use aggregation: "X and 999,999 others liked your photo." Batch similar notifications within a time window (30-60 seconds) before delivering.
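
The batching idea can be sketched as a small in-memory aggregator. The class name `LikeAggregator`, the `summarize` helper, and the message wording are illustrative assumptions; a real system would more likely keep the window state in a shared store and flush it from a scheduled job.

```python
from collections import defaultdict


def summarize(actors):
    """Collapse a list of actor names into one human-readable line."""
    first, others = actors[0], len(actors) - 1
    if others == 0:
        return f"{first} liked your photo"
    noun = "other" if others == 1 else "others"
    return f"{first} and {others:,} {noun} liked your photo"


class LikeAggregator:
    def __init__(self, window_seconds=30.0):
        self.window = window_seconds
        self.pending = defaultdict(list)   # (recipient, post_id) -> actor names
        self.opened_at = {}                # (recipient, post_id) -> time of first like

    def add(self, recipient, post_id, actor, now):
        key = (recipient, post_id)
        if key not in self.opened_at:
            self.opened_at[key] = now      # the first like opens the window
        self.pending[key].append(actor)

    def flush(self, now):
        """Return one (recipient, message) pair per window that has expired."""
        expired = [k for k, t in self.opened_at.items() if now - t >= self.window]
        messages = []
        for key in expired:
            actors = self.pending.pop(key)
            del self.opened_at[key]
            messages.append((key[0], summarize(actors)))
        return messages


if __name__ == "__main__":
    agg = LikeAggregator(window_seconds=30)
    agg.add("author", "post-1", "ana", now=0)
    for i in range(999):
        agg.add("author", "post-1", f"user{i}", now=1)
    print(agg.flush(now=30))
```

The sketch does not deduplicate actors, so a user who likes, unlikes, and likes again would be counted twice; handling that is left out for brevity.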

Key Design Decisions

Push vs Pull vs Hybrid: Pure push (WebSocket) is instant but requires maintaining millions of persistent connections. Pure pull (polling) wastes requests on empty responses and adds latency of up to one polling interval. Best approach: WebSocket push while the user is online (instant) plus a push notification as a wake-up signal when offline, so no polling is needed.
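
The hybrid decision amounts to a presence lookup. The sketch below assumes a hypothetical `PresenceRouter` that tracks which connection server holds each user's socket; the names and the payload shapes are made up for illustration. Offline users get a push that carries only an ID, in line with treating the push as a hint.

```python
class PresenceRouter:
    def __init__(self):
        self.connections = {}   # user_id -> connection server holding the socket

    def connect(self, user_id, server):
        self.connections[user_id] = server

    def disconnect(self, user_id):
        self.connections.pop(user_id, None)

    def plan(self, user_id, notification_id, body):
        """Decide how to deliver one notification to one user."""
        server = self.connections.get(user_id)
        if server is not None:
            # Online: send the full payload over the open socket.
            return {"via": "websocket", "server": server,
                    "payload": {"id": notification_id, "body": body}}
        # Offline: send only an ID; the client fetches the body via API on wake-up.
        return {"via": "push", "payload": {"id": notification_id}}


if __name__ == "__main__":
    router = PresenceRouter()
    router.connect("u1", "ws-7")
    print(router.plan("u1", "n1", "Ana commented on your photo"))
    router.disconnect("u1")
    print(router.plan("u1", "n2", "Bo followed you"))
```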

Aggregation window: Send immediately (good UX for 1:1 messages) vs batch and aggregate (good for high-volume events like likes). Solution: categorize by priority. Messages = instant. Likes = aggregate over 30s. Marketing = batch hourly.

Per-user rate limiting: No notification spam. Limit to N notifications per hour per user per channel. Priority queue: critical (payment confirmation) > social (likes) > marketing (recommendations).
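
One way to combine the two ideas is a sliding-window limiter keyed by (user, channel), drained in priority order so that higher tiers claim the budget first. This is a sketch under assumptions: the tier names come from the text above, while `PerUserRateLimiter`, `drain`, and the one-hour default window are hypothetical. Here critical notifications are subject to the same limit as everything else; a real system might exempt them.

```python
import heapq
from collections import defaultdict, deque

PRIORITY = {"critical": 0, "social": 1, "marketing": 2}   # lower number is served first


class PerUserRateLimiter:
    """Allow at most `limit` notifications per `window` seconds per (user, channel)."""

    def __init__(self, limit, window=3600.0):
        self.limit = limit
        self.window = window
        self.sent = defaultdict(deque)   # (user, channel) -> timestamps of recent sends

    def allow(self, user, channel, now):
        q = self.sent[(user, channel)]
        while q and now - q[0] >= self.window:   # drop timestamps outside the window
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True


def drain(pending, limiter, now):
    """Send pending (tier, user, channel, text) items in priority order.

    Returns the texts that passed the rate limit; the rest are dropped.
    """
    heap = [(PRIORITY[tier], i, user, channel, text)
            for i, (tier, user, channel, text) in enumerate(pending)]
    heapq.heapify(heap)   # the index i keeps arrival order within a tier
    sent = []
    while heap:
        _, _, user, channel, text = heapq.heappop(heap)
        if limiter.allow(user, channel, now):
            sent.append(text)
    return sent


if __name__ == "__main__":
    limiter = PerUserRateLimiter(limit=2)
    pending = [
        ("marketing", "u1", "push", "Sale ends today"),
        ("social", "u1", "push", "Ana liked your photo"),
        ("critical", "u1", "push", "Payment confirmed"),
    ]
    print(drain(pending, limiter, now=0))
```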

At-least-once vs at-most-once: Push notifications are best-effort (at-most-once from OS perspective). For critical notifications (payment, security), use multiple channels + confirmation tracking. Don't rely on a single channel for important data.
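
A simple form of "multiple channels + confirmation tracking" is to escalate through channels until one confirms. The function below is a synchronous sketch; `send` and `confirmed` are hypothetical callables standing in for real channel clients and a receipt store, and in practice the confirmation check would happen after a timeout rather than immediately.

```python
def deliver_critical(notification_id, channels, send, confirmed):
    """Try channels in order until one is confirmed; return the channels attempted.

    Every attempt carries the same notification_id so the client can discard
    duplicates if more than one channel ends up delivering.
    """
    attempted = []
    for channel in channels:
        send(channel, notification_id)
        attempted.append(channel)
        if confirmed(channel, notification_id):
            break
    return attempted


if __name__ == "__main__":
    acked_channels = {"sms"}   # pretend only SMS produces a delivery receipt
    log = []
    attempted = deliver_critical(
        "payment-42",
        ["push", "sms", "email"],
        send=lambda channel, nid: log.append((channel, nid)),
        confirmed=lambda channel, nid: channel in acked_channels,
    )
    print(attempted, log)
```

Sending over several channels turns an at-most-once channel into at-least-once delivery overall, which is why the shared ID matters: the client needs it to deduplicate.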

When to Use

Notification systems appear in almost every social/commerce/communication system design.

  • "Design a notification system": Multi-channel architecture: in-app + push + email + SMS with user preferences
  • "How does the chat app update in real-time?": WebSocket for online, push notification for offline wake-up
  • "How do you notify 1M followers when a celebrity posts?": Fan-out via Kafka + aggregation to avoid notification spam
  • "Design Instagram notifications": Priority tiers, aggregation ("X and 99 others"), quiet hours

Interview signal: Structure notifications as a multi-channel problem. Mention fan-out, aggregation, rate limiting per user, and delivery guarantees unprompted.

Real-World Examples

  • WhatsApp: Persistent TCP connection per client. Message delivered in ~50ms when online. Offline: stored in transient storage + APNs/FCM wake-up. Messages deleted from server after delivery ACK. Delivery receipts: ✓ sent, ✓✓ delivered, ✓✓ (blue) read.
  • Slack: WebSocket for real-time delivery in-app. Falls back to long polling when WebSocket fails (corporate proxies). Push notifications for mobile when app is backgrounded.
  • Instagram: Aggregation: "user and 999,999 others liked your photo" instead of 1M separate notifications. Priority tiers: DMs > comments > likes > suggestions.
  • Uber: Multi-channel for trip updates: in-app (WebSocket) + push (FCM/APNs) + SMS (fallback for unreliable networks).

Back-of-Envelope Numbers

Metric                                        Value
WebSocket connections per server              ~1M concurrent (C10M)
WhatsApp message delivery (online)            ~50 ms p50
WhatsApp message delivery (offline → push)    ~5 seconds p99
WhatsApp daily messages                       ~100B messages/day ≈ 1.16M/sec
Push notification volume (WhatsApp)           ~30B push/day
Connection servers needed (2B users)          ~2,000 servers
Aggregation window (likes)                    30-60 seconds
SMS cost (Twilio)                             ~$0.0075/message
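
The derived rows follow from simple arithmetic, reproduced here as a quick check. The inputs are the table's own estimates; the server count assumes every one of the 2B users is connected at the same time, which is a worst case, and the variable names are only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Throughput: daily message volume converted to a per-second rate.
messages_per_day = 100e9
messages_per_sec = messages_per_day / SECONDS_PER_DAY

# Capacity: servers needed if all users hold a connection simultaneously.
users = 2e9
connections_per_server = 1e6
servers = users / connections_per_server

# Cost: price of sending one million SMS at the per-message estimate.
sms_cost_per_message = 0.0075
sms_cost_per_million = sms_cost_per_message * 1_000_000

if __name__ == "__main__":
    print(f"{messages_per_sec:,.0f} messages/sec")
    print(f"{servers:,.0f} connection servers")
    print(f"${sms_cost_per_million:,.0f} per million SMS")
```

The SMS figure is one reason SMS usually serves as a fallback for critical notifications rather than a default channel.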