1. The Question
Design a backend system for a short-video mobile app (TikTok-like). The system should allow users to upload short videos (<= 1 min, with a caption), view a vertically scrolling feed (personalized + follow feed), and interact with videos (likes, follows, comments). Focus primarily on backend architecture, data model, API design, and scaling to ~1M daily active users (DAU) with high read volume.
2. Clarifying Questions
- Are we building mobile clients? (Assume client-agnostic REST/gRPC APIs.)
- Feed type: follow-only or personalized? (Support both; core design will enable personalized recommendations via a precache service.)
- Max video length? (Assume <= 1 minute compressed H.264.)
- Which interactions matter? (Likes, follows, comments; basic share/forward optional.)
3. Requirements
Functional:
- Upload video + caption
- Serve a per-user feed (personalized + follow stream)
- Like/follow/comment interactions
- Preload top N videos for low startup latency
Non-functional:
- High availability (target ~99.999%)
- Low read latency for feed playback
- Scale to ~1M DAU, burst to 10x
- Efficient storage & CDN-backed delivery for large video blobs
4. Scale Estimates
- Users: 1,000,000 DAU (assumption)
- Video size: ~5 MB per 1-minute compressed video (H.264)
- Uploads: assume 2 uploads/user/day x 5 MB => 10 MB/day/user
- Daily video ingest: 1,000,000 * 10 MB = 10,000,000 MB = 10 TB/day
- Monthly raw storage (30d): ~300 TB (before redundancy/replication/transcodes)
- Read-heavy: feed reads >> writes; concurrency spikes possible (viral video)
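A quick back-of-envelope check of these numbers (a sketch; DAU, video size, and upload rate are the assumptions stated above):

```python
# Back-of-envelope check of the ingest/storage estimates above.
DAU = 1_000_000                # daily active users (assumption)
VIDEO_MB = 5                   # ~5 MB per 1-minute H.264 video (assumption)
UPLOADS_PER_USER_PER_DAY = 2   # assumption

daily_ingest_mb = DAU * UPLOADS_PER_USER_PER_DAY * VIDEO_MB
daily_ingest_tb = daily_ingest_mb / 1_000_000    # decimal TB
monthly_raw_tb = daily_ingest_tb * 30            # before redundancy/transcodes

print(f"daily ingest: {daily_ingest_tb:.0f} TB/day")    # -> 10 TB/day
print(f"30-day raw storage: {monthly_raw_tb:.0f} TB")   # -> 300 TB
```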
5. Data Model
Use a relational DB for structured metadata + separate tables for activity. Example tables:
- users(user_id PK, username, profile_meta, created_at)
- videos(video_id PK, user_id FK, s3_url, thumbnail_url, caption, length_sec, codec_meta, created_at)
- follows(follower_id, followee_id, created_at) -- indexed by follower_id
- likes(user_id, video_id, created_at) -- append-heavy
- comments(comment_id, video_id, user_id, text, created_at)
- feed_cache(user_id, playlist[] or pointer, updated_at) -- for precached playlists
Store large binary blobs (video, thumbnails) in object storage (S3-compatible). Keep metadata in the DB. Use time-series / analytics store for metrics/logs.
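A minimal, runnable sketch of this schema (SQLite here for illustration only; production would use Postgres with real timestamp types, constraints, and partitioning):

```python
import sqlite3

# Illustrative metadata schema; types simplified for SQLite.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id    INTEGER PRIMARY KEY,
    username   TEXT UNIQUE NOT NULL,
    created_at TEXT NOT NULL
);
CREATE TABLE videos (
    video_id      INTEGER PRIMARY KEY,
    user_id       INTEGER NOT NULL REFERENCES users(user_id),
    s3_url        TEXT NOT NULL,
    thumbnail_url TEXT,
    caption       TEXT,
    length_sec    INTEGER,
    created_at    TEXT NOT NULL
);
CREATE TABLE follows (
    follower_id INTEGER NOT NULL,
    followee_id INTEGER NOT NULL,
    created_at  TEXT NOT NULL,
    PRIMARY KEY (follower_id, followee_id)
);
CREATE TABLE likes (
    user_id    INTEGER NOT NULL,
    video_id   INTEGER NOT NULL,
    created_at TEXT NOT NULL,
    PRIMARY KEY (user_id, video_id)
);
CREATE TABLE comments (
    comment_id INTEGER PRIMARY KEY,
    video_id   INTEGER NOT NULL REFERENCES videos(video_id),
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    text       TEXT NOT NULL,
    created_at TEXT NOT NULL
);
-- the follow feed asks "videos by people I follow", so index by follower_id
CREATE INDEX idx_follows_follower ON follows(follower_id);
CREATE INDEX idx_videos_user_time ON videos(user_id, created_at);
""")
```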
6. API Design
Key endpoints (HTTP/REST or gRPC):
- POST /upload/video
  - payload: caption, user_id, metadata; video bytes go via multipart or a presigned-URL upload
  - flow: client requests a presigned upload -> uploads to object store -> notifies metadata service -> video record persisted (sketched after the notes below)
- GET /feed?user_id={uid}&cursor={c}&limit={n}
  - returns an ordered list of video metadata + CDN URLs (first N preloaded)
- POST /video/{id}/like
  - body: user_id; writes to likes table + activity log
- POST /user/{id}/follow
  - body: follower_id ({id} in the path is the followee)
- GET /user/{id}/activity (likes/follows)
Notes: Use presigned PUT to offload video upload bandwidth from API servers. Keep metadata write path lightweight.
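A sketch of the presigned-upload step using boto3 (bucket name and key scheme are placeholders, not part of the original design):

```python
import uuid
import boto3

s3 = boto3.client("s3")
BUCKET = "shortvideo-raw-uploads"  # placeholder bucket name

def create_presigned_upload(user_id: int, content_type: str = "video/mp4") -> dict:
    """Step 1 of the upload flow: hand the client a short-lived PUT URL so
    raw video bytes never pass through the API fleet."""
    key = f"raw/{user_id}/{uuid.uuid4()}.mp4"
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": BUCKET, "Key": key, "ContentType": content_type},
        ExpiresIn=900,  # 15 minutes
    )
    # The client PUTs the file to `url`, then notifies the metadata service
    # with `key` so the video record is persisted and transcoding enqueued.
    return {"upload_url": url, "object_key": key}
```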
7. High-Level Architecture
Components:
- Mobile client
- API gateway / load balancer -> stateless API service fleet
- Auth service (token issuance/validation)
- Object storage (S3) for raw videos + transcoded variants
- CDN (Akamai/CloudFront) in front of object storage for low-latency video delivery
- Relational primary DB for metadata (write master) + read replicas
- Redis or Memcached for per-user cache and hot-object caching
- Precache service / feed generation workers that build personalized playlists and populate Redis
- Message queue (Kafka/SQS) for async tasks: transcode, analytics, feed updates, notifications
- Transcoding service to generate multiple bitrates and thumbnails
- Regionalized deployments and a sharding layer for horizontal DB growth (read replicas noted above)
Flow highlights:
- Upload: client -> presigned S3 PUT -> notify API -> persist metadata -> enqueue transcode -> variants pulled through the CDN on first request
- Feed fetch: client -> API -> read Redis precache -> return metadata & CDN URLs; client fetches blobs from CDN
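A sketch of the feed read path against the Redis precache (the `feed:{user_id}` key scheme and JSON-encoded entries are assumptions for illustration):

```python
import json
import redis

r = redis.Redis()  # assumes Redis holds precomputed playlists (see section 8)

def get_feed(user_id: int, cursor: int = 0, limit: int = 10) -> dict:
    """Serve feed pages from the precached playlist; the client then pulls
    the actual video bytes from the CDN URLs in the metadata."""
    entries = r.lrange(f"feed:{user_id}", cursor, cursor + limit - 1)
    if not entries:
        # Cache miss: fall back to an on-demand build (e.g. recent videos
        # from followed users, read from a replica), then repopulate Redis.
        return {"videos": [], "next_cursor": None, "cache_miss": True}
    videos = [json.loads(e) for e in entries]  # metadata incl. CDN URLs
    return {"videos": videos, "next_cursor": cursor + len(videos)}
```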
8. Detailed Design Decisions
- Metadata DB: Relational (Postgres) for structured relationships (users->videos->comments). Use read replicas to separate reads from writes.
- Blob storage: S3-compatible object store for large binary; cheap, durable, integrated with CDN.
- Caching & precache: Precompute personalized playlists into Redis per user (top N videos). Reduces on-the-fly compute and DB load.
- CDN: Critical for absorbing viral spikes and reducing origin bandwidth. Transcoded variants are immutable, so they can take long cache TTLs; mutable assets need shorter ones.
- Upload flow: Use presigned URLs so API servers don't host video uploads.
- Transcoding: Async pipeline consuming uploads from queue; store variants and update metadata when ready.
- Sharding: Shard write DB by user_id or by region to distribute write load at scale.
- Consistency: Eventual consistency acceptable for feeds; likes/follows must be persisted but can propagate to caches asynchronously.
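A sketch of the precache worker referenced above (ranking comes from the recommendation pipeline, which is out of scope here; the temp-key/RENAME swap is one way to keep readers from seeing a half-built playlist):

```python
import json
import redis

r = redis.Redis()

def rebuild_playlist(user_id: int, ranked_videos: list[dict], top_n: int = 50) -> None:
    """Write a user's top-N ranked videos to Redis as the precached feed.
    `ranked_videos` is assumed to come from the recommender, best first."""
    if not ranked_videos:
        return
    tmp_key, key = f"feed:{user_id}:tmp", f"feed:{user_id}"
    pipe = r.pipeline()
    pipe.delete(tmp_key)
    for v in ranked_videos[:top_n]:
        pipe.rpush(tmp_key, json.dumps(v))
    pipe.rename(tmp_key, key)   # atomic swap: readers see old or new, never partial
    pipe.expire(key, 6 * 3600)  # TTL bounds staleness if the worker stalls
    pipe.execute()
```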
9. Bottlenecks & Scaling
- Video delivery: origin bandwidth and latency — mitigate with CDN + multiple edge POPs.
- DB write throughput: high ingest of metadata and activity; mitigate with sharding, write scaling, and partitioning.
- Feed generation: expensive personalization at scale; mitigate by precaching, incremental updates, and approximate algorithms.
- Hot objects (viral videos): absorb cache hotspots at the CDN and edge caches; use rate limiting and request collapsing at the origin.
- Transcoding pipeline: scale horizontally; use autoscaling workers and spot instances for cost efficiency.
- Cache invalidation: ensure user actions (like, follow) update the precache or are merged at read time; use TTLs and async invalidation (see the consumer sketch below).
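A sketch of the async propagation path for user actions (topic name, event shape, and key schemes are illustrative; actions are assumed to be persisted to the DB synchronously before the event is published):

```python
import json
import redis
from kafka import KafkaConsumer  # kafka-python

r = redis.Redis()
consumer = KafkaConsumer(
    "user-actions",                       # illustrative topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b),
)

# Off the hot write path: merge persisted actions into caches asynchronously.
for msg in consumer:
    event = msg.value
    if event["type"] == "like":
        # keep the hot like counter fresh without hitting the DB on reads
        r.incr(f"video:{event['video_id']}:likes")
    elif event["type"] == "follow":
        # a new follow should surface soon: drop the follower's precached
        # playlist so the precache worker rebuilds it
        r.delete(f"feed:{event['follower_id']}")
```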
10. Follow-up Questions / Extensions
- Recommendation engine: offline batch + online scoring; features from user interactions, video embeddings, collaborative filtering.
- Personalization freshness: how to balance surfacing new uploads quickly vs. keeping the feed stable
- Moderation: automated content moderation (ML) + human review; policy for removed content and cache invalidation
- Multi-region deployment: geo-routing, data residency, replication
- Analytics & metrics: realtime dashboards, A/B testing, retention tracking
- Cost optimization: long-term cold storage, TTLs for inactive videos, transcode on-demand for rarely viewed bitrates
11. Wrap-up
The design decouples large binary delivery (object store + CDN) from metadata (relational DB), uses caching and a precache service to absorb heavy read traffic with low-latency feed delivery, and pushes upload/transcode/analytics work into asynchronous pipelines. Critical scaling levers: caching, CDNs, DB sharding, read replicas, and precomputed personalized feeds.