Python offers several ways to run code concurrently. Choosing the right model depends on whether your bottleneck is I/O or CPU.

The Global Interpreter Lock (GIL)

CPython’s GIL allows only one thread to execute Python bytecode at a time, so threads do not parallelize CPU-bound Python code. Threads still help whenever the GIL is released: during blocking I/O (file and network calls) and inside C extensions that drop it while they work, such as many NumPy operations.
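
One way to see this is to time threads that do nothing but wait. The following is a minimal sketch, not a benchmark: wait_briefly, run_threads, and the 0.2-second DELAY are illustrative names and values, and time.sleep stands in for a blocking I/O call because it releases the GIL while it blocks.

```python
import threading
import time

DELAY = 0.2  # assumed stand-in for the latency of one blocking I/O call


def wait_briefly(results, index):
    # time.sleep blocks without holding the GIL, much like a socket read
    time.sleep(DELAY)
    results[index] = index


def run_threads(count):
    results = [None] * count
    threads = [
        threading.Thread(target=wait_briefly, args=(results, i))
        for i in range(count)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


results, elapsed = run_threads(4)
print(results)
print(f"4 waits of {DELAY}s took {elapsed:.2f}s in total")
```

Because the waits overlap, the total should be close to one delay rather than four. Swapping the sleep for a pure-Python computation would be expected to remove that benefit, since the threads would then compete for the GIL.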

Threading — I/O-Bound Tasks

Use threading when tasks spend time waiting (network, disk):

import threading
import requests

def fetch(url, results, index):
    response = requests.get(url, timeout=10)
    results[index] = len(response.text)

urls = ["https://example.com"] * 5
results = [None] * len(urls)
threads = []

for i, url in enumerate(urls):
    t = threading.Thread(target=fetch, args=(url, results, i))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(results)
  

Thread-Safe Data with Locks

import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    with lock:
        counter += 1

threads = [threading.Thread(target=increment) for _ in range(1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 1000
  

Multiprocessing — CPU-Bound Tasks

Each process has its own Python interpreter and memory, bypassing the GIL:

from multiprocessing import Pool

def square(n):
    return n ** 2

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        results = pool.map(square, range(10))
    print(results)
  

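Important: Always guard multiprocessing entry points with if __name__ == "__main__":. The guard is required wherever child processes are started with the spawn method (the default on Windows and macOS), because each child re-imports the main module. It is harmless on other platforms.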

concurrent.futures provides a higher-level interface for both models:

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import time

def io_task(n):
    time.sleep(1)  # stands in for a blocking I/O call
    return n

def cpu_task(n):
    return sum(i * i for i in range(n))

# The guard keeps worker processes from re-running this block on import
if __name__ == "__main__":
    # I/O-bound → ThreadPoolExecutor
    with ThreadPoolExecutor(max_workers=4) as executor:
        io_results = list(executor.map(io_task, range(4)))

    # CPU-bound → ProcessPoolExecutor
    with ProcessPoolExecutor(max_workers=4) as executor:
        cpu_results = list(executor.map(cpu_task, [10**6] * 4))

    print(io_results)
    print(cpu_results)
  

Choosing a Concurrency Model

Model               | Best For                        | Bypasses GIL?
--------------------|---------------------------------|---------------------------------
threading           | I/O-bound (HTTP, files)         | No (but GIL released during I/O)
multiprocessing     | CPU-bound (math, parsing)       | Yes
asyncio             | Many concurrent I/O connections | N/A (single thread)
ProcessPoolExecutor | CPU-bound with clean API        | Yes
ThreadPoolExecutor  | I/O-bound with clean API        | No
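
The asyncio row has no example elsewhere in this article, so here is a minimal sketch of the single-threaded model. The fake_request coroutine and its 0.1-second delay are hypothetical; asyncio.sleep stands in for awaiting a real network response.

```python
import asyncio


async def fake_request(n):
    # asyncio.sleep yields control to the event loop while "waiting"
    await asyncio.sleep(0.1)
    return n * 2


async def main():
    # gather runs the coroutines concurrently on one thread and returns
    # results in the order the awaitables were passed in
    return await asyncio.gather(*(fake_request(n) for n in range(5)))


results = asyncio.run(main())
print(results)  # [0, 2, 4, 6, 8]
```

All five waits overlap on a single thread, which is why this model suits workloads with many open connections that are mostly idle.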

Practical Example: Download Files Concurrently

from concurrent.futures import ThreadPoolExecutor, as_completed
import requests

URLS = [
    "https://httpbin.org/delay/1",
    "https://httpbin.org/delay/2",
    "https://httpbin.org/delay/1",
]

def download(url):
    resp = requests.get(url, timeout=30)
    return url, resp.status_code

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = {executor.submit(download, url): url for url in URLS}
    for future in as_completed(futures):
        url, status = future.result()
        print(f"{url} → {status}")
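
In the loop above, future.result() re-raises any exception that occurred inside the worker, so one failed download would abort the whole loop. A possible way to handle failures per task is sketched below. It avoids the network entirely: flaky_task is a hypothetical stand-in that fails for odd inputs, and the succeeded and failed dictionaries are illustrative names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def flaky_task(n):
    # Hypothetical task: fails for odd inputs to simulate a bad download
    if n % 2:
        raise ValueError(f"task {n} failed")
    return n * 10


succeeded = {}
failed = {}

with ThreadPoolExecutor(max_workers=3) as executor:
    # Map each future back to its input so failures can be attributed
    futures = {executor.submit(flaky_task, n): n for n in range(4)}
    for future in as_completed(futures):
        n = futures[future]
        try:
            succeeded[n] = future.result()
        except ValueError as exc:
            # result() re-raises the exception raised inside the worker
            failed[n] = str(exc)

print(sorted(succeeded.items()))
print(sorted(failed.items()))
```

With requests, the equivalent except clause would likely catch requests.RequestException instead of ValueError.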
  

Summary

  • I/O-bound → asyncio, threading, or ThreadPoolExecutor
  • CPU-bound → multiprocessing or ProcessPoolExecutor
  • Never run heavy CPU work inside async coroutines without offloading it to a thread or process pool
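
The last point can be sketched with loop.run_in_executor, which hands a blocking function to a pool and returns an awaitable. The names blocking_sum and ticker are hypothetical, and the workload is deliberately small. A thread pool is used here only to keep the example self-contained; it keeps the event loop responsive but remains subject to the GIL, so truly CPU-heavy work would more likely call for a ProcessPoolExecutor.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def blocking_sum(n):
    # Hypothetical blocking function; called directly inside a coroutine,
    # it would stall the event loop until it returned
    return sum(i * i for i in range(n))


async def ticker(ticks):
    # A second coroutine that keeps running while the pool does the work
    for i in range(3):
        ticks.append(i)
        await asyncio.sleep(0)


async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        total, _ = await asyncio.gather(
            loop.run_in_executor(pool, blocking_sum, 1000),
            ticker(ticks),
        )
    return total, ticks


total, ticks = asyncio.run(main())
print(total, ticks)
```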

Understanding these trade-offs is essential for building performant Python applications at scale.