Python's multiprocessing module provides true parallelism by spawning separate processes, each with its own Python interpreter and memory space. This bypasses the GIL (Global Interpreter Lock) that limits threads to one CPU core.
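
A quick way to see those separate memory spaces in action, a minimal sketch: a child process mutates its own copy of a module-level list, and the parent never sees the change.

```python
from multiprocessing import Process

data = []

def worker():
    data.append(1)  # modifies the child's copy only

if __name__ == "__main__":
    p = Process(target=worker)
    p.start()
    p.join()
    print(data)  # [] -- the parent's list is unchanged
```

This is exactly why multiprocessing needs explicit channels (Queue, Value, Array) where threads could simply share variables.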

When to Use multiprocessing

  • CPU-bound work: Number crunching, image processing, data transformation
  • Embarrassingly parallel: Same operation on many independent inputs
  • Not for I/O-bound: Use asyncio or threading instead

Basic Process Creation

from multiprocessing import Process
import os
 
def worker(name):
    print(f"Worker {name}, PID: {os.getpid()}")
 
if __name__ == "__main__":
    processes = []
    for i in range(4):
        p = Process(target=worker, args=(i,))
        processes.append(p)
        p.start()
    
    for p in processes:
        p.join()  # Wait for completion

Always guard process creation with if __name__ == "__main__":. Start methods that re-import the main module in each child (spawn, the default on Windows and macOS) would otherwise create processes recursively.
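
You can inspect (or override) the start method yourself; a minimal sketch, output varies by platform:

```python
import multiprocessing as mp

if __name__ == "__main__":
    # 'fork' on Linux, 'spawn' on Windows and macOS (3.8+)
    print(mp.get_start_method())

    # To pick one explicitly, set it once, early in the program:
    # mp.set_start_method("spawn")
```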

Process Pools

For many tasks, Pool manages worker processes automatically:

from multiprocessing import Pool
 
def square(n):
    return n * n
 
if __name__ == "__main__":
    with Pool(4) as pool:  # 4 worker processes
        results = pool.map(square, range(10))
        print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Pool Methods

from multiprocessing import Pool
 
def process_item(x):
    return x * 2
 
if __name__ == "__main__":
    with Pool() as pool:  # Uses cpu_count() by default
        # map: ordered results, blocks until done
        results = pool.map(process_item, range(5))
        
        # imap: lazy iterator, memory efficient
        for result in pool.imap(process_item, range(5)):
            print(result)
        
        # imap_unordered: results as they complete
        for result in pool.imap_unordered(process_item, range(5)):
            print(result)
        
        # apply_async: non-blocking single call, returns an AsyncResult
        async_result = pool.apply_async(process_item, (42,))
        print(async_result.get())  # 84

Sharing Data: Queue

Processes don't share memory, but Queue provides safe message passing:

from multiprocessing import Process, Queue
 
def producer(queue):
    for i in range(5):
        queue.put(f"item-{i}")
    queue.put(None)  # Sentinel
 
def consumer(queue):
    while True:
        item = queue.get()
        if item is None:
            break
        print(f"Got: {item}")
 
if __name__ == "__main__":
    queue = Queue()
    p1 = Process(target=producer, args=(queue,))
    p2 = Process(target=consumer, args=(queue,))
    
    p1.start()
    p2.start()
    p1.join()
    p2.join()
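
One subtlety when extending this pattern: with more than one consumer, the producer must enqueue one sentinel per consumer, or some consumers will block forever. A sketch (NUM_CONSUMERS is an assumption for illustration):

```python
from multiprocessing import Process, Queue

NUM_CONSUMERS = 2

def producer(queue):
    for i in range(5):
        queue.put(f"item-{i}")
    for _ in range(NUM_CONSUMERS):  # one sentinel per consumer
        queue.put(None)

def consumer(queue):
    while True:
        item = queue.get()
        if item is None:
            break
        print(f"Got: {item}")

if __name__ == "__main__":
    queue = Queue()
    consumers = [Process(target=consumer, args=(queue,))
                 for _ in range(NUM_CONSUMERS)]
    for c in consumers:
        c.start()

    prod = Process(target=producer, args=(queue,))
    prod.start()
    prod.join()

    for c in consumers:
        c.join()
```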

Sharing Data: Value and Array

For simple shared state, Value and Array allocate ctypes objects in shared memory. Note that compound operations like += are read-modify-write, not atomic, so guard them with the object's built-in lock:

from multiprocessing import Process, Value, Array
 
def increment(counter, arr):
    with counter.get_lock():  # += is not atomic without this
        counter.value += 1
    with arr.get_lock():
        for i in range(len(arr)):
            arr[i] *= 2
 
if __name__ == "__main__":
    counter = Value('i', 0)  # 'i' = int
    arr = Array('d', [1.0, 2.0, 3.0])  # 'd' = double
    
    processes = [Process(target=increment, args=(counter, arr)) 
                 for _ in range(4)]
    
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    
    print(f"Counter: {counter.value}")
    print(f"Array: {list(arr)}")

Locks for Synchronization

from multiprocessing import Process, Lock, Value
 
def safe_increment(lock, counter, n):
    for _ in range(n):
        with lock:
            counter.value += 1
 
if __name__ == "__main__":
    lock = Lock()
    counter = Value('i', 0)
    
    processes = [
        Process(target=safe_increment, args=(lock, counter, 1000))
        for _ in range(4)
    ]
    
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    
    print(f"Final: {counter.value}")  # 4000

ProcessPoolExecutor (concurrent.futures)

Higher-level interface, similar to ThreadPoolExecutor:

from concurrent.futures import ProcessPoolExecutor
import time
 
def heavy_computation(n):
    total = sum(i * i for i in range(n))
    return total
 
if __name__ == "__main__":
    numbers = [10_000_000, 20_000_000, 30_000_000, 40_000_000]
    
    start = time.time()
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(heavy_computation, numbers))
    
    print(f"Time: {time.time() - start:.2f}s")
    print(f"Results: {results}")
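
Beyond map, the executor also offers submit, which returns a Future right away; as_completed then yields results in whatever order they finish. A sketch reusing heavy_computation (smaller inputs for brevity):

```python
from concurrent.futures import ProcessPoolExecutor, as_completed

def heavy_computation(n):
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    numbers = [1_000_000, 2_000_000, 3_000_000]

    with ProcessPoolExecutor() as executor:
        # submit returns a Future immediately; as_completed yields
        # futures in completion order, not submission order
        futures = {executor.submit(heavy_computation, n): n for n in numbers}
        for fut in as_completed(futures):
            print(f"n={futures[fut]:,}: {fut.result()}")
```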

Handling Exceptions

An exception raised in a worker is pickled and re-raised in the parent when the result is collected:

from multiprocessing import Pool
 
def risky_operation(x):
    if x == 3:
        raise ValueError("I don't like 3")
    return x * 2
 
if __name__ == "__main__":
    with Pool(2) as pool:
        try:
            results = pool.map(risky_operation, range(5))
        except ValueError as e:
            print(f"Caught: {e}")
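
The catch: pool.map discards every result if any single task raises. To salvage the successes, one option is to schedule tasks individually with apply_async and call get() per task; a sketch reusing risky_operation:

```python
from multiprocessing import Pool

def risky_operation(x):
    if x == 3:
        raise ValueError("I don't like 3")
    return x * 2

if __name__ == "__main__":
    with Pool(2) as pool:
        # One AsyncResult per input; each get() re-raises only
        # that task's exception
        async_results = [(x, pool.apply_async(risky_operation, (x,)))
                         for x in range(5)]
        for x, res in async_results:
            try:
                print(f"{x} -> {res.get()}")
            except ValueError as e:
                print(f"{x} -> failed: {e}")
```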

Practical Example: Parallel File Processing

from multiprocessing import Pool
from pathlib import Path
import hashlib
 
def compute_hash(filepath):
    h = hashlib.md5()  # fine for change detection, not for security
    with open(filepath, 'rb') as f:
        for chunk in iter(lambda: f.read(8192), b''):
            h.update(chunk)
    return (filepath, h.hexdigest())
 
if __name__ == "__main__":
    files = list(Path('.').glob('**/*.py'))
    
    with Pool() as pool:
        results = pool.map(compute_hash, files)
    
    for path, hash_val in results:
        print(f"{path}: {hash_val[:8]}...")

Performance Tips

  1. Chunk size matters: pool.map(func, items, chunksize=100) reduces IPC overhead
  2. Everything must pickle: Arguments and return values cross process boundaries via pickle
  3. Avoid shared state: Message passing > shared memory for complex data
  4. Process creation is slow: Reuse pools rather than spawning per-task
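
Tip 1 in practice: a larger chunksize sends work in batches instead of one item per IPC round-trip. A sketch with a deliberately trivial task, where the overhead dominates; the right value depends on how long each task runs:

```python
from multiprocessing import Pool

def tiny_task(x):
    return x + 1

if __name__ == "__main__":
    items = range(100_000)
    with Pool() as pool:
        # Workers receive batches of 1000 items, cutting the
        # pickling/IPC traffic from 100,000 messages to ~100
        results = pool.map(tiny_task, items, chunksize=1000)
    print(results[:5])  # [1, 2, 3, 4, 5]
```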

multiprocessing vs threading

  Aspect          multiprocessing    threading
  GIL             Bypassed           Limited by GIL
  Best for        CPU-bound          I/O-bound
  Memory          Separate           Shared
  Overhead        Higher             Lower
  Communication   IPC (Queue)        Direct

Summary

multiprocessing unlocks true parallelism in Python. Use Pool for simple parallel map operations, Queue for message passing, and Value/Array with Lock for shared state. For CPU-intensive work that needs all your cores, this is the module to reach for.
