Python has two main ways to run code concurrently: threading and multiprocessing. The choice depends on whether your work is I/O-bound or CPU-bound.

The GIL Problem

Python's Global Interpreter Lock (GIL) means only one thread executes Python bytecode at a time. This makes threads useless for speeding up CPU-bound work, but fine for I/O-bound work, because blocking I/O calls release the GIL while they wait.

import requests

# This won't speed up on multiple cores: the GIL serializes the bytecode
def cpu_work():
    return sum(i * i for i in range(10_000_000))

# This WILL benefit from threading: the GIL is released while waiting on I/O
def io_work():
    response = requests.get('https://api.example.com')
    return response.json()

Threading: I/O-Bound Work

Use threads when waiting on network, disk, or other I/O:

import threading
import requests
 
def fetch_url(url):
    response = requests.get(url)
    return response.status_code
 
urls = ['https://example.com'] * 10
 
# Sequential: slow
for url in urls:
    fetch_url(url)
 
# Threaded: fast (note: return values from target functions are discarded)
threads = []
for url in urls:
    t = threading.Thread(target=fetch_url, args=(url,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

Better: use concurrent.futures, which manages the threads and collects return values for you:

from concurrent.futures import ThreadPoolExecutor
 
with ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(fetch_url, urls))
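When you want each result as soon as its task finishes, rather than in input order, a common variant is executor.submit plus as_completed. A minimal sketch, with a stub fetch_url standing in for the real network call:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Stub standing in for the real fetch_url, so the sketch runs offline
def fetch_url(url):
    return 200

urls = ['https://example.com'] * 10

with ThreadPoolExecutor(max_workers=10) as executor:
    # submit returns a Future immediately; as_completed yields each
    # future as soon as its result is ready, not in submission order
    futures = [executor.submit(fetch_url, url) for url in urls]
    statuses = [f.result() for f in as_completed(futures)]
```

future.result() also re-raises any exception the worker hit, so failures surface where you collect results.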

Multiprocessing: CPU-Bound Work

Use processes when doing actual computation:

import multiprocessing
 
def cpu_intensive(n):
    return sum(i * i for i in range(n))
 
numbers = [10_000_000] * 4
 
# Sequential: uses one core
results = [cpu_intensive(n) for n in numbers]
 
# Multiprocessing: uses all cores
with multiprocessing.Pool() as pool:
    results = pool.map(cpu_intensive, numbers)

With concurrent.futures:

from concurrent.futures import ProcessPoolExecutor
 
with ProcessPoolExecutor() as executor:
    results = list(executor.map(cpu_intensive, numbers))

Quick Decision Guide

Work Type        | Use                 | Why
-----------------|---------------------|-----------------------------------
Network requests | threading           | Waiting on I/O releases the GIL
File I/O         | threading           | Same: I/O releases the GIL
Image processing | multiprocessing     | CPU-bound, needs real parallelism
Data crunching   | multiprocessing     | CPU-bound
Web scraping     | threading + asyncio | Mostly waiting on the network
ML training      | multiprocessing     | Heavy computation
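Unsure which bucket a workload falls in? Time it both ways. A sketch with time.sleep standing in for I/O waiting:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Toy I/O-bound task: sleeping stands in for waiting on a network reply
def io_task(_):
    time.sleep(0.05)

start = time.perf_counter()
for i in range(8):
    io_task(i)
sequential = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as executor:
    list(executor.map(io_task, range(8)))
threaded = time.perf_counter() - start
# sequential takes roughly 8 sleeps; threaded takes roughly one
```

If the threaded version is no faster, the task is likely CPU-bound and a process pool is the better fit.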

Common Patterns

Thread Pool for API Calls

from concurrent.futures import ThreadPoolExecutor
import requests
 
def fetch_data(user_id):
    resp = requests.get(f'https://api.example.com/users/{user_id}')
    return resp.json()
 
user_ids = range(1, 101)
 
with ThreadPoolExecutor(max_workers=20) as executor:
    users = list(executor.map(fetch_data, user_ids))

Process Pool for Number Crunching

from concurrent.futures import ProcessPoolExecutor
from pathlib import Path
 
def process_file(filepath):
    data = Path(filepath).read_text()
    # Heavy computation
    return len(data.split())
 
files = list(Path('data').glob('*.txt'))
 
with ProcessPoolExecutor() as executor:
    word_counts = list(executor.map(process_file, files))

Mixed Workload

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import requests

def download(url):
    # I/O-bound
    return requests.get(url).content

def process(data):
    # CPU-bound; heavy_computation is a placeholder for your own work
    return heavy_computation(data)
 
# Download with threads
with ThreadPoolExecutor() as executor:
    raw_data = list(executor.map(download, urls))
 
# Process with processes
with ProcessPoolExecutor() as executor:
    results = list(executor.map(process, raw_data))

Gotchas

Multiprocessing Serialization

Processes don't share memory, so arguments and return values are pickled on the way in and out. This fails:

# BAD: lambda can't be pickled, so this raises a PicklingError
with ProcessPoolExecutor() as executor:
    results = list(executor.map(lambda x: x * 2, range(10)))

Use named functions instead:

# GOOD
def double(x):
    return x * 2
 
with ProcessPoolExecutor() as executor:
    results = list(executor.map(double, range(10)))

Thread Safety

Threads share memory. Protect shared state:

import threading
 
counter = 0
lock = threading.Lock()
 
def increment():
    global counter
    with lock:
        counter += 1
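To see the lock earn its keep, hammer the counter from several threads; with the lock, every increment survives:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # without the lock, the read-modify-write can interleave
        # between threads and updates get lost
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter is reliably 40_000 with the lock in place
```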

Process Overhead

Creating processes is expensive. Don't spawn thousands:

# BAD: too many processes
with ProcessPoolExecutor(max_workers=1000) as executor:
    pass
 
# GOOD: match CPU count
with ProcessPoolExecutor() as executor:  # defaults to CPU count
    pass

When to Use asyncio Instead

For pure I/O workloads, asyncio is often better than threading:

import asyncio
import aiohttp
 
async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()
 
async def main():
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url) for url in urls]
        return await asyncio.gather(*tasks)

results = asyncio.run(main())

Lower overhead than threads, but requires async-compatible libraries.
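When a library you need is sync-only, asyncio.to_thread (Python 3.9+) bridges the gap by running the blocking call in a worker thread so it doesn't stall the event loop. A sketch with a hypothetical blocking_fetch standing in for such a library:

```python
import asyncio
import time

# Blocking, sync-only stand-in for a library without async support
def blocking_fetch(url):
    time.sleep(0.01)
    return f'body of {url}'

async def main():
    urls = ['https://example.com/a', 'https://example.com/b']
    # asyncio.to_thread runs each blocking call in a thread-pool
    # worker; gather awaits them concurrently and preserves order
    calls = (asyncio.to_thread(blocking_fetch, u) for u in urls)
    return await asyncio.gather(*calls)

results = asyncio.run(main())
```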

Summary

  • Threading: I/O-bound work (network, disk)
  • Multiprocessing: CPU-bound work (computation)
  • asyncio: High-concurrency I/O (thousands of connections)

The GIL isn't a bug—it's a design choice. Work with it, not against it.
