Interview: System Design Basics

Python system design interview basics — scalable APIs, caching, load balancing, databases, and message queues.

System design interviews test your ability to architect scalable solutions. You don’t need to know every technology — focus on trade-offs and clear reasoning.

Framework for Any Design Question

1. Clarify requirements: functional and non-functional (scale, latency, availability)
2. Estimate scale: users, requests/sec, data size
3. Sketch the high-level design: draw boxes and arrows
4. Deep dive: pick 2–3 components to detail
5. Identify bottlenecks and propose solutions
6. Discuss trade-offs: why this approach over alternatives

Example: Design a URL Shortener

Requirements

Shorten long URLs to 6-character codes
Redirect on access
Track click counts
100M URLs, 1000 redirects/sec
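
A quick back-of-envelope estimate for these numbers could look like the sketch below. The 500-byte average row size is an assumption made for illustration, not part of the stated requirements.

```python
# Back-of-envelope sizing for the URL shortener requirements.
TOTAL_URLS = 100_000_000     # from the requirements
REDIRECTS_PER_SEC = 1_000    # from the requirements
AVG_ROW_BYTES = 500          # assumption: URL + short code + metadata per row

SECONDS_PER_DAY = 24 * 60 * 60

redirects_per_day = REDIRECTS_PER_SEC * SECONDS_PER_DAY
storage_gb = TOTAL_URLS * AVG_ROW_BYTES / 1e9

print(f"Redirects per day: {redirects_per_day:,}")  # 86,400,000
print(f"Storage estimate:  {storage_gb:.0f} GB")    # 50 GB
```

At this size the dataset fits comfortably on a single database node, so the read rate, not storage, is the part worth discussing first.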

High-Level Design

Client → Load Balancer → API Servers → Database
                              ↓
                           Cache (Redis)

Database Schema

CREATE TABLE links (
    id           BIGSERIAL PRIMARY KEY,
    -- UNIQUE already creates an index on short_code in PostgreSQL,
    -- so a separate CREATE INDEX on this column would be redundant.
    short_code   VARCHAR(10) UNIQUE NOT NULL,
    original_url TEXT NOT NULL,
    clicks       INTEGER DEFAULT 0,
    created_at   TIMESTAMP DEFAULT NOW()
);

Short Code Generation

Base62 encoding of the auto-increment ID (alphabet: a-z, A-Z, 0-9)
6 chars = 62^6 ≈ 56.8 billion unique codes, far more than the 100M URLs required
Alternative: random code + collision check against the unique short_code column
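
As a rough illustration, Base62 encoding of a numeric ID can be sketched as follows. The alphabet order (digits, then lowercase, then uppercase) and the function names are choices made for this example; any fixed ordering of the 62 characters works as long as encoding and decoding agree.

```python
import string

# Assumed alphabet order for this sketch: 0-9, a-z, A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Convert a non-negative integer ID to a Base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    # Digits are produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Convert a Base62 string back to the integer ID."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Small IDs produce codes shorter than 6 characters; if a fixed length is required, the code could be left-padded with the first alphabet character. Sequential IDs also make codes guessable, which is one reason to consider the random alternative.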

Caching Strategy

Redirect request:
1. Check Redis for short_code → URL
2. Cache hit → redirect immediately
3. Cache miss → query DB, populate cache, redirect
4. Async: increment click counter (don't block redirect)
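
The redirect flow above can be sketched in a few lines. To keep the example self-contained, plain dictionaries stand in for Redis and the links table and a `queue.Queue` stands in for the async click pipeline; all names here are hypothetical.

```python
from collections import Counter
from queue import Queue
from typing import Optional

# Hypothetical in-memory stand-ins for Redis, the links table, and a click queue.
cache: dict[str, str] = {}
database: dict[str, str] = {"abc123": "https://example.com/long/path"}
click_events: Queue = Queue()


def resolve(short_code: str) -> Optional[str]:
    """Return the original URL for a short code, or None if it is unknown."""
    url = cache.get(short_code)            # 1. check the cache
    if url is None:
        url = database.get(short_code)     # 3. cache miss: query the DB
        if url is None:
            return None                    # unknown code: caller responds 404
        cache[short_code] = url            #    populate the cache
    click_events.put(short_code)           # 4. record the click without blocking
    return url                             # 2. redirect target


def flush_clicks(counts: Counter) -> None:
    """Drain queued click events; a background worker would do this in batches."""
    while not click_events.empty():
        counts[click_events.get()] += 1
```

Calling `resolve("abc123")` twice hits the database only once; the second call is served from the cache, and both clicks are counted later by `flush_clicks`.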

Scaling

Component	Scale Strategy
API servers	Horizontal — stateless, add instances
Database	Read replicas, shard by short_code hash
Cache	Redis cluster, TTL for less popular links
Static assets	CDN
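
Sharding by short_code hash could look roughly like this. The shard count and function name are assumptions for the example. A stable hash such as SHA-256 is used because Python's built-in `hash()` for strings is randomized per process by default.

```python
import hashlib

NUM_SHARDS = 4  # assumption: number of database shards


def shard_for(short_code: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a short code to a shard index using a stable hash."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then take the modulo.
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo sharding remaps most keys when the shard count changes, so consistent hashing is a common follow-up topic in interviews.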

Common Components

Load Balancer

Distributes traffic across servers. Options: AWS ALB, Nginx, HAProxy.
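
Round robin is one of the simplest distribution strategies. The class below is a toy sketch of the idea, not how any of the listed products is implemented; the class and method names are made up for illustration.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backend servers in a fixed rotating order."""

    def __init__(self, servers: list[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def next_server(self) -> str:
        """Return the server that should receive the next request."""
        return next(self._servers)
```

With servers `["a", "b", "c"]`, four consecutive calls to `next_server()` return `a`, `b`, `c`, `a`. Real load balancers add health checks and often weighting or least-connections routing on top of this.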

Caching

Redis/Memcached: in-memory key-value stores
Cache-aside pattern: read cache → miss → read DB → write cache
Set a TTL to bound how long stale data can be served
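
To show how cache-aside and TTL fit together, here is a minimal in-process sketch. `TTLCache` and `get_with_cache_aside` are hypothetical names for this example; the injectable clock exists only so expiry can be tested without sleeping.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: dict[str, tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._store[key] = (value, self._clock() + self._ttl)


def get_with_cache_aside(key: str, cache: TTLCache,
                         load_from_db: Callable[[str], Optional[str]]) -> Optional[str]:
    """Read cache -> on miss read DB -> write cache."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        if value is not None:
            cache.set(key, value)
    return value
```

With Redis, the same effect usually comes from setting an expiry when writing the key (for example `SET key value EX 3600`), so no expiry bookkeeping is needed in application code.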

Message Queue

Decouple services for async processing:

AWS SQS, RabbitMQ, Kafka
Use for: email sending, analytics, image processing

  API → SQS → Worker Lambda → Process async
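
The producer/worker split can be illustrated locally with the standard library. In this sketch `queue.Queue` stands in for SQS and a thread stands in for the worker Lambda; the function names and the email task are invented for the example.

```python
import queue
import threading

task_queue: queue.Queue = queue.Queue()   # stand-in for SQS
processed: list[str] = []                 # results recorded by the worker
_STOP = object()                          # sentinel that tells the worker to exit


def enqueue_email(address: str) -> None:
    """API side: enqueue the task and return immediately."""
    task_queue.put(address)


def worker() -> None:
    """Worker side: pull tasks off the queue and process them."""
    while True:
        item = task_queue.get()
        try:
            if item is _STOP:
                return
            processed.append(f"sent welcome email to {item}")
        finally:
            task_queue.task_done()


def run_demo() -> list[str]:
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    enqueue_email("a@example.com")
    enqueue_email("b@example.com")
    task_queue.put(_STOP)
    task_queue.join()        # wait until every queued item is handled
    thread.join(timeout=1)
    return processed
```

The API call returns as soon as the message is queued; if the worker is slow or briefly down, requests still succeed and the backlog is processed later.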

Database Choices

Type	Use Case	Examples
Relational (SQL)	Structured data, transactions	PostgreSQL, MySQL
Document (NoSQL)	Flexible schema, horizontal scale	MongoDB, DynamoDB
Key-Value	Caching, sessions	Redis, DynamoDB
Search	Full-text search	Elasticsearch

CDN

Cache static content (images, JS, CSS) at edge locations close to users.

Python-Specific Architecture

                    ┌─────────────┐
                    │   Nginx     │
                    └──────┬──────┘
                           │
              ┌────────────┼────────────┐
              ▼            ▼            ▼
        ┌──────────┐ ┌──────────┐ ┌──────────┐
        │ Gunicorn │ │ Gunicorn │ │ Gunicorn │
        │ FastAPI  │ │ FastAPI  │ │ FastAPI  │
        └────┬─────┘ └────┬─────┘ └────┬─────┘
             │            │            │
             └────────────┼────────────┘
                          ▼
                   ┌─────────────┐
                   │ PostgreSQL  │
                   │  (primary)  │
                   └──────┬──────┘
                          │
                   ┌──────┴──────┐
                   ▼             ▼
             ┌──────────┐  ┌──────────┐
             │ Replica  │  │  Redis   │
             └──────────┘  └──────────┘

See Flask URL Shortener Project for a simplified implementation.

Key Trade-offs to Discuss

Decision	Option A	Option B
Consistency vs availability	Strong consistency (SQL)	Eventual consistency (NoSQL)
Sync vs async	Simple, immediate	Scalable, complex
Monolith vs microservices	Faster to build	Independent scaling
SQL vs NoSQL	ACID, joins	Flexible schema, scale-out
Push vs pull CDN	Real-time updates	Simpler architecture

CAP Theorem (Brief)

In a distributed system, you can guarantee at most two of:

Consistency — all nodes see same data
Availability — every request gets a response
Partition tolerance — system works despite network failures

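In practice, network partitions cannot be ruled out, so the real choice is what to give up while a partition lasts: consistency (AP) or availability (CP). Many high-traffic web systems choose AP with eventual consistency; systems where stale or conflicting data is unacceptable, such as payments, often choose CP.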
