Interview: System Design Basics
Python system design interview basics — scalable APIs, caching, load balancing, databases, and message queues.
System design interviews test your ability to architect scalable solutions. You don’t need to know every technology — focus on trade-offs and clear reasoning.
Framework for Any Design Question
- Clarify requirements — functional and non-functional (scale, latency, availability)
- Estimate scale — users, requests/sec, data size
- High-level design — draw boxes and arrows
- Deep dive — pick 2–3 components to detail
- Identify bottlenecks — and propose solutions
- Discuss trade-offs — why this approach over alternatives
Example: Design a URL Shortener
Requirements
- Shorten long URLs to 6-character codes
- Redirect on access
- Track click counts
- 100M URLs, 1000 redirects/sec
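A quick back-of-envelope estimate turns these numbers into sizing decisions. The sketch below uses the two figures from the requirements; the average row size of 500 bytes is an assumption chosen for illustration, not a measured value.

```python
# Back-of-envelope capacity estimate for the stated requirements.
TOTAL_URLS = 100_000_000      # from the requirements
REDIRECTS_PER_SEC = 1_000     # from the requirements
AVG_ROW_BYTES = 500           # assumption: URL + code + counters + index overhead

SECONDS_PER_DAY = 24 * 60 * 60

storage_gb = TOTAL_URLS * AVG_ROW_BYTES / 1e9
redirects_per_day = REDIRECTS_PER_SEC * SECONDS_PER_DAY

print(f"storage: ~{storage_gb:.0f} GB")
print(f"redirects/day: {redirects_per_day:,}")
```

Under that assumption the data set is about 50 GB, which fits on one database node, so the design pressure comes from read throughput rather than storage.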
High-Level Design
Client → Load Balancer → API Servers → Database
                              ↓
                        Cache (Redis)
Database Schema
CREATE TABLE links (
    id BIGSERIAL PRIMARY KEY,
    -- UNIQUE already creates an index on short_code, so no separate CREATE INDEX is needed
    short_code VARCHAR(10) UNIQUE NOT NULL,
    original_url TEXT NOT NULL,
    clicks INTEGER DEFAULT 0,
    created_at TIMESTAMP DEFAULT NOW()
);
Short Code Generation
- Base62 encoding of auto-increment ID (a-z, A-Z, 0-9)
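- 6 chars = 62^6 ≈ 56.8 billion unique codes, far more than the 100M required
- Alternative: random code + collision check (codes are not guessable from the ID, at the cost of an extra lookup on insert)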
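The Base62 approach can be sketched as follows. The alphabet ordering, the function names, and the zero-padding to six characters are illustrative assumptions; any fixed ordering of the 62 symbols works as long as encoding and decoding agree.

```python
import string

# Assumed ordering: digits, then lowercase, then uppercase (62 symbols total).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Convert a non-negative integer ID to its Base62 representation."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62; leading padding symbols decode to zero."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


def short_code(row_id: int, width: int = 6) -> str:
    """Left-pad with the zero symbol so every code has the same length."""
    return encode_base62(row_id).rjust(width, ALPHABET[0])


print(short_code(125))   # 125 = 2 * 62 + 1
print(62 ** 6)           # number of distinct 6-character codes
```

Because the code is derived from the primary key, no collision check is needed, but consecutive links get consecutive codes.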
Caching Strategy
Redirect request:
1. Check Redis for short_code → URL
2. Cache hit → redirect immediately
3. Cache miss → query DB, populate cache, redirect
4. Async: increment click counter (don't block redirect)
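The four steps above can be sketched with in-memory stand-ins. Here `db` and `cache` are plain dicts playing the roles of the database and Redis, and `click_events` is a queue that a background worker would drain; all three names, and `resolve` itself, are hypothetical.

```python
import queue
from typing import Optional

db = {"abc123": "https://example.com/some/long/path"}  # stand-in for the links table
cache = {}                                             # stand-in for Redis
click_events = queue.Queue()                           # consumed by a background worker


def resolve(short_code: str) -> Optional[str]:
    """Return the target URL for a short code, or None if it does not exist."""
    url = cache.get(short_code)          # step 1: check the cache
    if url is None:                      # step 3: miss, fall back to the DB
        url = db.get(short_code)
        if url is None:
            return None                  # unknown code: caller responds with 404
        cache[short_code] = url          # populate the cache for the next request
    click_events.put(short_code)         # step 4: record the click without waiting
    return url                           # step 2: caller issues the redirect


print(resolve("abc123"))
print(resolve("missing"))
```

The redirect path never writes to the database; the click counter is updated later, in batches if desired, by whatever consumes `click_events`.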
Scaling
| Component | Scale Strategy |
|---|---|
| API servers | Horizontal — stateless, add instances |
| Database | Read replicas, shard by short_code hash |
| Cache | Redis cluster, TTL for less popular links |
| Static assets | CDN |
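Sharding by `short_code` hash needs a hash that is stable across processes and restarts. A minimal sketch, assuming four shards and SHA-256 as the hash (both choices are illustrative, and `shard_for` is a hypothetical helper):

```python
import hashlib

NUM_SHARDS = 4  # assumption for illustration


def shard_for(short_code: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a short code to a shard index in [0, num_shards)."""
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used to get a stable value.
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for code in ["abc123", "000021", "ZZZZZZ"]:
    print(code, "-> shard", shard_for(code))
```

Plain modulo sharding remaps most keys when the shard count changes, so consistent hashing is the usual follow-up topic if the interviewer asks about adding shards.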
Common Components
Load Balancer
Distributes traffic across servers. Options: AWS ALB, Nginx, HAProxy.
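Round-robin is the simplest distribution policy and is easy to explain on a whiteboard. The sketch below is a toy model of that policy, not how any of the products above are implemented; the class name and backend addresses are made up.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backends in a fixed rotating order."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def next_backend(self) -> str:
        return next(self._cycle)


balancer = RoundRobinBalancer(["api-1:8000", "api-2:8000", "api-3:8000"])
picks = [balancer.next_backend() for _ in range(4)]
print(picks)  # the fourth request wraps around to the first backend
```

Real load balancers add health checks and may weight backends or pick the one with the fewest open connections.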
Caching
- Redis/Memcached — in-memory key-value store
- Cache-aside pattern: read cache → miss → read DB → write cache
- Set a TTL to bound how long stale data can be served
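To show how a TTL interacts with cache-aside, here is a small in-process sketch. `TTLCache` and `get_or_load` are hypothetical names, and the injectable `clock` parameter exists only so expiry can be demonstrated without sleeping; with Redis the expiry would be handled by the server.

```python
import time


class TTLCache:
    """Dictionary-backed cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]   # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)


def get_or_load(cache, key, load_from_db):
    """Cache-aside read: try the cache, fall back to the loader, then fill the cache."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        if value is not None:
            cache.set(key, value)
    return value
```

A shorter TTL means fresher data and more database reads; a longer TTL means the opposite.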
Message Queue
Decouple services for async processing:
- AWS SQS, RabbitMQ, Kafka
- Use for: email sending, analytics, image processing
API → SQS → Worker Lambda → Process async
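The producer/worker split can be illustrated in a single process with the standard library. Here `queue.Queue` stands in for SQS and a thread stands in for the worker; the job format and the `run_worker` helper are assumptions for the example.

```python
import queue
import threading


def run_worker(jobs, handle):
    """Pull jobs until a None sentinel arrives, passing each one to handle()."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                return
            handle(job)
        finally:
            jobs.task_done()


jobs = queue.Queue()
processed = []  # stand-in for "email was sent"

worker = threading.Thread(target=run_worker, args=(jobs, processed.append), daemon=True)
worker.start()

# The API side only enqueues and returns immediately.
for email in ["a@example.com", "b@example.com"]:
    jobs.put({"type": "send_email", "to": email})

jobs.put(None)   # tell the worker to stop
jobs.join()      # wait until every queued item has been handled
worker.join()
print(len(processed), "jobs processed")
```

With a real broker the queue survives process restarts and workers scale independently of the API, which is the point of the decoupling.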
Database Choices
| Type | Use Case | Examples |
|---|---|---|
| Relational (SQL) | Structured data, transactions | PostgreSQL, MySQL |
| Document (NoSQL) | Flexible schema, horizontal scale | MongoDB, DynamoDB |
| Key-Value | Caching, sessions | Redis, DynamoDB |
| Search | Full-text search | Elasticsearch |
CDN
Cache static content (images, JS, CSS) at edge locations close to users.
Python-Specific Architecture
           ┌─────────────┐
           │    Nginx    │
           └──────┬──────┘
                  │
     ┌────────────┼────────────┐
     ▼            ▼            ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Gunicorn │ │ Gunicorn │ │ Gunicorn │
│ FastAPI  │ │ FastAPI  │ │ FastAPI  │
└────┬─────┘ └────┬─────┘ └────┬─────┘
     │            │            │
     └────────────┼────────────┘
                  ▼
           ┌─────────────┐
           │ PostgreSQL  │
           │  (primary)  │
           └──────┬──────┘
                  │
           ┌──────┴──────┐
           ▼             ▼
      ┌──────────┐  ┌──────────┐
      │ Replica  │  │  Redis   │
      └──────────┘  └──────────┘
See Flask URL Shortener Project for a simplified implementation.
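Each Gunicorn box in the diagram is typically configured through a Python settings file. The sketch below is one plausible `gunicorn.conf.py`; the bind address, timeout, and worker count are assumptions, and it presumes the `uvicorn` package is installed so that Gunicorn can serve an ASGI app such as FastAPI.

```python
# gunicorn.conf.py: illustrative values, tune for your own workload
import multiprocessing

# Listen on localhost only; Nginx is assumed to proxy requests to this address.
bind = "127.0.0.1:8000"

# A common starting point is (2 x cores) + 1 worker processes.
workers = multiprocessing.cpu_count() * 2 + 1

# FastAPI is an ASGI app, so Gunicorn needs an ASGI-capable worker class.
worker_class = "uvicorn.workers.UvicornWorker"

# Seconds before an unresponsive worker is killed and restarted.
timeout = 30
```

It would be started with something like `gunicorn -c gunicorn.conf.py app.main:app`, where `app.main:app` is a placeholder for the module path of the FastAPI instance.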
Key Trade-offs to Discuss
| Decision | Option A | Option B |
|---|---|---|
| Consistency vs availability | Strong consistency (e.g. a single-primary SQL database) | Eventual consistency (e.g. many distributed NoSQL stores) |
| Sync vs async | Simple, immediate | Scalable, complex |
| Monolith vs microservices | Faster to build | Independent scaling |
| SQL vs NoSQL | ACID, joins | Flexible schema, scale-out |
| Push vs pull CDN | Push: you upload content ahead of time and control when it updates | Pull: CDN fetches from origin on first request, simpler to operate |
CAP Theorem (Brief)
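A distributed data store cannot guarantee all three of the following at once; when a network partition occurs, it has to give up either consistency or availability: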
- Consistency — all nodes see same data
- Availability — every request gets a response
- Partition tolerance — system works despite network failures
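Network partitions cannot be ruled out, so the practical choice is between CP and AP. Many high-traffic web systems choose AP (availability + partition tolerance) with eventual consistency, while systems that cannot tolerate stale reads, such as payments or coordination services, choose CP.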
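The difference between the two choices shows up only while the network is partitioned. The toy model below has two replicas and a `mode` flag; it is a teaching sketch with made-up names, and its "replica A wins" reconciliation is far simpler than what real databases do.

```python
class Replica:
    def __init__(self):
        self.value = None


class TwoNodeStore:
    """Two replicas; clients write through replica A. mode is "CP" or "AP"."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse the write rather than let replicas diverge.
                raise RuntimeError("unavailable: cannot reach replica B")
            # Availability first: accept the write locally; B is now stale.
            self.a.value = value
            return
        self.a.value = self.b.value = value

    def heal(self):
        """End the partition and reconcile (simplistic: A's value wins)."""
        self.partitioned = False
        self.b.value = self.a.value


ap = TwoNodeStore("AP")
ap.write(1)
ap.partitioned = True
ap.write(2)
print(ap.a.value, ap.b.value)  # replicas disagree during the partition
ap.heal()
print(ap.a.value, ap.b.value)  # and converge afterwards
```

In CP mode the same second write raises instead, so both replicas keep the old value and the client sees an error.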