1. The Question
Design a backend system for a short-video mobile app (TikTok-like). The system should allow users to upload short videos (<= 1 min, with a caption), view a vertically scrolling feed (personalized + follow feed), and interact with videos (likes, follows, comments). Focus primarily on backend architecture, data model, API design, and scaling to ~1M daily active users (DAU) with high read volume.
2. Clarifying Questions
- Are we building mobile clients? (Assume client-agnostic REST/gRPC APIs.)
- Feed type: follow-only or personalized? (Support both; core design will enable personalized recommendations via a precache service.)
- Max video length? (Assume <= 1 minute compressed H.264.)
- Which interactions matter? (Likes, follows, comments; basic share/forward optional.)
3. Requirements
Functional:
- Upload video + caption
- Serve a per-user feed (personalized + follow stream)
- Like/follow/comment interactions
- Preload top N videos for low startup latency
Non-functional:
- High availability (target ~99.999%)
- Low read latency for feed playback
- Scale to ~1M DAU, burst to 10x
- Efficient storage & CDN-backed delivery for large video blobs
4. Scale Estimates
- Users: 1,000,000 DAU (assumption)
- Video size: ~5 MB per 1-minute compressed video (H.264)
- Uploads: assume 2 uploads/user/day x 5 MB => 10 MB/day/user
- Daily video ingest: 1,000,000 * 10 MB = 10,000,000 MB = 10 TB/day
- Monthly raw storage (30d): ~300 TB (before redundancy/replication/transcodes)
- Read-heavy: feed reads >> writes; concurrency spikes possible (viral video)
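A quick back-of-envelope check of these numbers (a sketch; DAU, video size, and upload rate are the assumptions stated above):

```python
# Back-of-envelope check of the ingest/storage estimates above.
DAU = 1_000_000                # daily active users (assumption)
VIDEO_MB = 5                   # ~5 MB per 1-minute H.264 video (assumption)
UPLOADS_PER_USER_PER_DAY = 2   # assumption

daily_ingest_mb = DAU * UPLOADS_PER_USER_PER_DAY * VIDEO_MB
daily_ingest_tb = daily_ingest_mb / 1_000_000    # decimal TB
monthly_raw_tb = daily_ingest_tb * 30            # before redundancy/transcodes

print(f"daily ingest: {daily_ingest_tb:.0f} TB/day")    # -> 10 TB/day
print(f"30-day raw storage: {monthly_raw_tb:.0f} TB")   # -> 300 TB
```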
5. Data Model
Use a relational DB for structured metadata + separate tables for activity. Example tables:
- users(user_id PK, username, profile_meta, created_at)
- videos(video_id PK, user_id FK, s3_url, thumbnail_url, caption, length_sec, codec_meta, created_at)
- follows(follower_id, followee_id, created_at) -- indexed by follower_id
- likes(user_id, video_id, created_at) -- append-heavy
- comments(comment_id, video_id, user_id, text, created_at)
- feed_cache(user_id, playlist[] or pointer, updated_at) -- for precached playlists
Store large binary blobs (video, thumbnails) in object storage (S3-compatible). Keep metadata in the DB. Use time-series / analytics store for metrics/logs.
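A minimal, runnable sketch of this schema (SQLite here for illustration only; production would use Postgres with real timestamp types, constraints, and partitioning):

```python
import sqlite3

# Illustrative metadata schema; types simplified for SQLite.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id    INTEGER PRIMARY KEY,
    username   TEXT UNIQUE NOT NULL,
    created_at TEXT NOT NULL
);
CREATE TABLE videos (
    video_id      INTEGER PRIMARY KEY,
    user_id       INTEGER NOT NULL REFERENCES users(user_id),
    s3_url        TEXT NOT NULL,
    thumbnail_url TEXT,
    caption       TEXT,
    length_sec    INTEGER,
    created_at    TEXT NOT NULL
);
CREATE TABLE follows (
    follower_id INTEGER NOT NULL,
    followee_id INTEGER NOT NULL,
    created_at  TEXT NOT NULL,
    PRIMARY KEY (follower_id, followee_id)
);
CREATE TABLE likes (
    user_id    INTEGER NOT NULL,
    video_id   INTEGER NOT NULL,
    created_at TEXT NOT NULL,
    PRIMARY KEY (user_id, video_id)
);
CREATE TABLE comments (
    comment_id INTEGER PRIMARY KEY,
    video_id   INTEGER NOT NULL REFERENCES videos(video_id),
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    text       TEXT NOT NULL,
    created_at TEXT NOT NULL
);
-- the follow feed asks "videos by people I follow", so index by follower_id
CREATE INDEX idx_follows_follower ON follows(follower_id);
CREATE INDEX idx_videos_user_time ON videos(user_id, created_at);
""")
```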
6. API Design
Key endpoints (HTTP/REST or gRPC):
- POST /upload/video
  - payload: caption, user_id, metadata; video bytes go via multipart or a presigned-URL upload
  - flow: client requests a presigned upload -> uploads to object store -> notifies metadata service -> video record persisted (sketched after the notes below)
- GET /feed?user_id={uid}&cursor={c}&limit={n}
  - returns an ordered list of video metadata + CDN URLs (first N preloaded)
- POST /video/{id}/like
  - body: user_id; writes to likes table + activity log
- POST /user/{id}/follow
  - body: follower_id ({id} in the path is the followee)
- GET /user/{id}/activity (likes/follows)
Notes: Use presigned PUT to offload video upload bandwidth from API servers. Keep metadata write path lightweight.
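A sketch of the presigned-upload step using boto3 (bucket name and key scheme are placeholders, not part of the original design):

```python
import uuid
import boto3

s3 = boto3.client("s3")
BUCKET = "shortvideo-raw-uploads"  # placeholder bucket name

def create_presigned_upload(user_id: int, content_type: str = "video/mp4") -> dict:
    """Step 1 of the upload flow: hand the client a short-lived PUT URL so
    raw video bytes never pass through the API fleet."""
    key = f"raw/{user_id}/{uuid.uuid4()}.mp4"
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": BUCKET, "Key": key, "ContentType": content_type},
        ExpiresIn=900,  # 15 minutes
    )
    # The client PUTs the file to `url`, then notifies the metadata service
    # with `key` so the video record is persisted and transcoding enqueued.
    return {"upload_url": url, "object_key": key}
```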
7. High-Level Architecture
Components:
- Mobile client
- API gateway / load balancer -> stateless API service fleet
- Auth service (token issuance/validation)
- Object storage (S3) for raw videos + transcoded variants
- CDN (Akamai/CloudFront) in front of object storage for low-latency video delivery
- Relational primary DB for metadata (write master) + read replicas
- Redis or Memcached for per-user cache and hot-object caching
- Precache service / feed generation workers that build personalized playlists and populate Redis
- Message queue (Kafka/SQS) for async tasks: transcode, analytics, feed updates, notifications
- Transcoding service to generate multiple bitrates and thumbnails
- Regionalized deployments and a sharding layer for horizontal DB growth (read replicas noted above)
Flow highlights:
- Upload: client -> presigned S3 PUT -> notify API -> persist metadata -> enqueue transcode -> variants pulled through the CDN on first request
- Feed fetch: client -> API -> read Redis precache -> return metadata & CDN URLs; client fetches blobs from CDN
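A sketch of the feed read path against the Redis precache (the `feed:{user_id}` key scheme and JSON-encoded entries are assumptions for illustration):

```python
import json
import redis

r = redis.Redis()  # assumes Redis holds precomputed playlists (see section 8)

def get_feed(user_id: int, cursor: int = 0, limit: int = 10) -> dict:
    """Serve feed pages from the precached playlist; the client then pulls
    the actual video bytes from the CDN URLs in the metadata."""
    entries = r.lrange(f"feed:{user_id}", cursor, cursor + limit - 1)
    if not entries:
        # Cache miss: fall back to an on-demand build (e.g. recent videos
        # from followed users, read from a replica), then repopulate Redis.
        return {"videos": [], "next_cursor": None, "cache_miss": True}
    videos = [json.loads(e) for e in entries]  # metadata incl. CDN URLs
    return {"videos": videos, "next_cursor": cursor + len(videos)}
```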
8. Detailed Design Decisions
- Metadata DB: Relational (Postgres) for structured relationships (users->videos->comments). Use read replicas to separate reads from writes.
- Blob storage: S3-compatible object store for large binary; cheap, durable, integrated with CDN.
- Caching & precache: Precompute personalized playlists into Redis per user (top N videos). Reduces on-the-fly compute and DB load.
- CDN: Critical for absorbing viral spikes and reducing origin bandwidth. Transcoded variants are immutable, so they can take long cache TTLs; mutable assets need shorter ones.
- Upload flow: Use presigned URLs so API servers don't host video uploads.
- Transcoding: Async pipeline consuming uploads from queue; store variants and update metadata when ready.
- Sharding: Shard write DB by user_id or by region to distribute write load at scale.
- Consistency: Eventual consistency acceptable for feeds; likes/follows must be persisted but can propagate to caches asynchronously.
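A sketch of the precache worker referenced above (ranking comes from the recommendation pipeline, which is out of scope here; the temp-key/RENAME swap is one way to keep readers from seeing a half-built playlist):

```python
import json
import redis

r = redis.Redis()

def rebuild_playlist(user_id: int, ranked_videos: list[dict], top_n: int = 50) -> None:
    """Write a user's top-N ranked videos to Redis as the precached feed.
    `ranked_videos` is assumed to come from the recommender, best first."""
    if not ranked_videos:
        return
    tmp_key, key = f"feed:{user_id}:tmp", f"feed:{user_id}"
    pipe = r.pipeline()
    pipe.delete(tmp_key)
    for v in ranked_videos[:top_n]:
        pipe.rpush(tmp_key, json.dumps(v))
    pipe.rename(tmp_key, key)   # atomic swap: readers see old or new, never partial
    pipe.expire(key, 6 * 3600)  # TTL bounds staleness if the worker stalls
    pipe.execute()
```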
9. Bottlenecks & Scaling
- Video delivery: origin bandwidth and latency — mitigate with CDN + multiple edge POPs.
- DB write throughput: high ingest of metadata and activity; mitigate with sharding, write scaling, and partitioning.
- Feed generation: expensive personalization at scale; mitigate by precaching, incremental updates, and approximate algorithms.
- Hot objects (viral videos): absorb cache hotspots at the CDN and edge caches; use rate limiting and request collapsing at the origin.
- Transcoding pipeline: scale horizontally; use autoscaling workers and spot instances for cost efficiency.
- Cache invalidation: ensure user actions (like, follow) update the precache or are merged at read time; use TTLs and async invalidation (see the consumer sketch below).
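A sketch of the async propagation path for user actions (topic name, event shape, and key schemes are illustrative; actions are assumed to be persisted to the DB synchronously before the event is published):

```python
import json
import redis
from kafka import KafkaConsumer  # kafka-python

r = redis.Redis()
consumer = KafkaConsumer(
    "user-actions",                       # illustrative topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b),
)

# Off the hot write path: merge persisted actions into caches asynchronously.
for msg in consumer:
    event = msg.value
    if event["type"] == "like":
        # keep the hot like counter fresh without hitting the DB on reads
        r.incr(f"video:{event['video_id']}:likes")
    elif event["type"] == "follow":
        # a new follow should surface soon: drop the follower's precached
        # playlist so the precache worker rebuilds it
        r.delete(f"feed:{event['follower_id']}")
```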
10. Follow-up Questions / Extensions
- Recommendation engine: offline batch + online scoring; features from user interactions, video embeddings, collaborative filtering.
- Personalization freshness: how to balance surfacing new uploads quickly vs. keeping the feed stable
- Moderation: automated content moderation (ML) + human review; policy for removed content and cache invalidation
- Multi-region deployment: geo-routing, data residency, replication
- Analytics & metrics: realtime dashboards, A/B testing, retention tracking
- Cost optimization: long-term cold storage, TTLs for inactive videos, transcode on-demand for rarely viewed bitrates
11. Wrap-up
The design decouples large binary delivery (object store + CDN) from metadata (relational DB), uses caching and a precache service to absorb heavy read traffic with low-latency feed delivery, and pushes upload/transcode/analytics work into asynchronous pipelines. Critical scaling levers: caching, CDNs, DB sharding, read replicas, and precomputed personalized feeds.