I stumbled onto weakref while debugging a memory leak in a caching system. The cache was supposed to improve performance, but instead it was hoarding objects forever, slowly eating RAM until the process crashed.

The fix? One import and about five lines of code. But understanding why it worked took me down a rabbit hole of Python memory management. Here's what I learned.

The Problem With Regular References

In Python, when you assign an object to a variable, you create a reference to it. The object stays alive as long as at least one reference exists:

class ExpensiveObject:
    def __init__(self, name):
        self.name = name
        print(f"Created {name}")
    
    def __del__(self):
        print(f"Destroyed {self.name}")
 
# Create an object
obj = ExpensiveObject("heavy-data")
# Output: Created heavy-data
 
# Object stays alive because 'obj' references it
del obj
# Output: Destroyed heavy-data

So far so good. But what happens when you cache that object?

cache = {}
 
def get_data(key):
    if key not in cache:
        cache[key] = ExpensiveObject(key)
    return cache[key]
 
obj = get_data("user-123")
# Output: Created user-123
 
del obj  # We're done with it locally...
# But nothing prints! The object is still alive in the cache.

The cache holds a reference, so the object lives forever. This is often not what you want. If the object is only useful while something else is using it, the cache is just leaking memory.

Enter weakref.ref()

A weak reference refers to an object without keeping it alive. If the only remaining references are weak, the garbage collector can reclaim the object:

import weakref
 
class ExpensiveObject:
    def __init__(self, name):
        self.name = name
        print(f"Created {name}")
    
    def __del__(self):
        print(f"Destroyed {self.name}")
 
# Create an object and a weak reference to it
obj = ExpensiveObject("important")
# Output: Created important
 
weak = weakref.ref(obj)
 
# Access the object through the weak reference
print(weak())  # <ExpensiveObject object>
print(weak().name)  # "important"
 
# The strong reference goes away
del obj
# Output: Destroyed important
 
# Now the weak reference returns None
print(weak())  # None

The first time I saw this, I was confused. Why would you want a reference that might become None? It felt fragile.

What helped me understand: think of it like a lookup table that says "if this thing still exists, here it is." You're not claiming ownership—you're just keeping a way to find it.

WeakValueDictionary: Caches That Clean Themselves

This is where weakref really shines. A WeakValueDictionary holds weak references to its values. When nothing else references a value, it gets garbage collected and automatically removed from the dict:

import weakref
 
class UserSession:
    def __init__(self, user_id):
        self.user_id = user_id
        print(f"Session created for {user_id}")
    
    def __del__(self):
        print(f"Session cleaned up for {self.user_id}")
 
# A cache that won't hold onto sessions forever
session_cache = weakref.WeakValueDictionary()
 
def get_session(user_id):
    if user_id not in session_cache:
        session_cache[user_id] = UserSession(user_id)
    return session_cache[user_id]
 
# Get a session
session = get_session("user-42")
# Output: Session created for user-42
 
print(len(session_cache))  # 1
 
# User logs out, we drop our reference
del session
# Output: Session cleaned up for user-42
 
print(len(session_cache))  # 0 - automatically removed!

This blew my mind. The cache cleans itself. No manual invalidation, no TTL management, no "have we already cached too much?" checks. If nothing needs the object, it disappears.

WeakKeyDictionary: When You Want to Annotate Objects

WeakKeyDictionary is the flip side—it holds weak references to its keys. This is useful when you want to attach metadata to objects without preventing their cleanup:

import weakref
 
# Track extra metadata for objects without modifying them
metadata = weakref.WeakKeyDictionary()
 
class Request:
    def __init__(self, path):
        self.path = path
 
def process_request(req):
    # Attach timing info to the request
    metadata[req] = {"started_at": time.time()}
    # ... do work ...
 
# When the request is garbage collected,
# its metadata entry disappears too

I found this useful when I needed to track information about objects I didn't control (from a library) without subclassing or monkey-patching.

WeakSet: Observer Pattern Done Right

The observer pattern often creates accidental strong references. If your subject holds a list of observers, those observers can never be garbage collected while the subject exists:

# The naive approach - observers live forever
class Subject:
    def __init__(self):
        self.observers = []  # Strong references!
    
    def attach(self, observer):
        self.observers.append(observer)
    
    def notify(self):
        for obs in self.observers:
            obs.update()

With WeakSet, observers can come and go freely:

import weakref
 
class Subject:
    def __init__(self):
        self.observers = weakref.WeakSet()
    
    def attach(self, observer):
        self.observers.add(observer)
    
    def notify(self):
        # Dead observers automatically removed
        for obs in self.observers:
            obs.update()
 
class Observer:
    def __init__(self, name):
        self.name = name
    
    def update(self):
        print(f"{self.name} notified!")
    
    def __del__(self):
        print(f"{self.name} garbage collected")
 
subject = Subject()
 
obs1 = Observer("Observer-1")
obs2 = Observer("Observer-2")
 
subject.attach(obs1)
subject.attach(obs2)
 
print(f"Observers: {len(subject.observers)}")  # 2
 
del obs1
# Output: Observer-1 garbage collected
 
print(f"Observers: {len(subject.observers)}")  # 1

No need to explicitly unsubscribe. When the observer is no longer needed elsewhere, it gets cleaned up and removed from the set automatically.

Breaking Circular References

This was a tricky concept for me. Consider two objects that reference each other:

class Parent:
    def __init__(self):
        self.child = None
 
class Child:
    def __init__(self, parent):
        self.parent = parent  # Strong reference back to parent
 
parent = Parent()
child = Child(parent)
parent.child = child
 
# Now parent -> child -> parent
# Even after del parent, del child, they might stick around

Python's cycle detector usually handles this, but it adds overhead and can delay cleanup. Using a weak reference for the "back" link is cleaner:

import weakref
 
class Child:
    def __init__(self, parent):
        self.parent = weakref.ref(parent)  # Weak reference back
    
    def get_parent(self):
        # Might return None if parent was collected
        return self.parent()

When Does Garbage Collection Actually Happen?

This tripped me up. I expected objects to be collected immediately when the last strong reference disappeared. That's not guaranteed:

import weakref
import gc
 
class Thing:
    def __del__(self):
        print("Collected!")
 
thing = Thing()
weak = weakref.ref(thing)
 
del thing
# "Collected!" might not print immediately in all cases
 
# Force collection if you need it now
gc.collect()

In CPython (the standard Python), reference counting usually triggers immediate cleanup. But:

  • Circular references wait for the cycle collector
  • Other Python implementations (PyPy, Jython) have different GC timing
  • __del__ methods can have their own timing quirks

What I learned: Don't rely on when cleanup happens—just trust that it will happen.

Objects That Can't Be Weakly Referenced

Not everything supports weak references:

import weakref
 
# These work
weakref.ref([1, 2, 3])        # Error! list doesn't support weakref
weakref.ref((1, 2, 3))        # Error! tuple doesn't support weakref
weakref.ref("hello")          # Error! str doesn't support weakref
 
# These do work
class MyClass: pass
weakref.ref(MyClass())        # Works
weakref.ref(lambda: None)     # Works

Built-in types like list, dict, tuple, and str don't support weak references directly. For those, you'd need to subclass or wrap them.

Callbacks: Do Something When Objects Die

You can register a callback that runs when the referenced object is garbage collected:

import weakref
 
def on_finalize(ref):
    print(f"Object was garbage collected!")
 
class Resource:
    pass
 
obj = Resource()
weak = weakref.ref(obj, on_finalize)
 
del obj
# Output: Object was garbage collected!

This is useful for cleanup logic—closing files, releasing locks, logging for debugging.

Memory Implications: Real Numbers

Here's a quick comparison of memory behavior:

import weakref
import sys
 
class Data:
    def __init__(self):
        self.payload = "x" * 10000  # 10KB of data
 
# Strong reference dict
strong_cache = {}
for i in range(1000):
    strong_cache[i] = Data()
 
print(f"Strong cache entries: {len(strong_cache)}")  # 1000
# Memory: ~10MB held indefinitely
 
# Weak reference dict
weak_cache = weakref.WeakValueDictionary()
refs = []
for i in range(1000):
    obj = Data()
    weak_cache[i] = obj
    if i < 100:  # Only keep strong refs to first 100
        refs.append(obj)
 
import gc; gc.collect()
print(f"Weak cache entries: {len(weak_cache)}")  # ~100
# Memory: ~1MB (only objects with strong refs remain)

What I Learned

  1. Weak references are for "if it exists" relationships. The cache doesn't own the data—it just knows where to find it if someone else is using it.

  2. WeakValueDictionary is a self-cleaning cache. Entries disappear when nothing else needs them. No TTL logic, no max-size eviction, no manual invalidation.

  3. WeakSet is perfect for observers. Subscribers come and go without memory leaks or explicit unsubscribe calls.

  4. Circular references are cleaner with weak back-references. Child-to-parent links often don't need to keep the parent alive.

  5. Not everything supports weakref. Built-in types like list and str don't. Custom classes do by default.

  6. Don't depend on GC timing. Objects will be collected eventually. Don't write code that requires it to happen at a specific moment.

The memory leak I was debugging? It was a cache of parsed config objects. The fix was changing dict() to WeakValueDictionary(). The configs stayed cached while in use, then disappeared when their owners went away.

Sometimes the best fix is just asking Python to handle it.


Have you used weakref in production? I'm curious what patterns others have found useful.

React to this post: