Last week I was reviewing some code I wrote a few months ago—a leaderboard for a game I was building. Every time a new score came in, I was inserting it and then calling sorted() on the whole list. For 100 scores, no problem. For 10,000 scores hitting every second, my CPU was crying.
Then a senior engineer pointed me to bisect. It felt like discovering a secret level in a video game.
The Problem with Linear Search
Here's what I was doing before:
def find_position(scores: list[int], new_score: int) -> int:
"""Find where new_score should go. O(n) operation."""
for i, score in enumerate(scores):
if new_score <= score:
return i
return len(scores)
def add_score(scores: list[int], new_score: int) -> None:
"""Insert score in sorted position. Still O(n)."""
pos = find_position(scores, new_score)
scores.insert(pos, new_score)For every insert, I was scanning through potentially thousands of scores. Linear search is O(n)—meaning if I double my data, I double my search time. Not great.
Enter bisect
The bisect module uses binary search, which is O(log n). Here's the magic: instead of scanning every element, it repeatedly cuts the search space in half.
import bisect
scores = [100, 200, 300, 400, 500]
# Where would 250 go?
pos = bisect.bisect(scores, 250)
print(pos) # 2 (between 200 and 300)For a list of 10,000 items, linear search might check 5,000 elements on average. Binary search? About 14 comparisons. That's the difference between O(n) and O(log n).
bisect_left vs bisect_right
This tripped me up at first. What happens when you search for a value that already exists?
import bisect
scores = [100, 200, 200, 200, 300]
# bisect_right (or just bisect): insert AFTER existing values
print(bisect.bisect_right(scores, 200)) # 4
# bisect_left: insert BEFORE existing values
print(bisect.bisect_left(scores, 200)) # 1The way I remember it:
bisect_left: "I want to be first among equals"bisect_right: "I want to be last among equals"
bisect() without a suffix is the same as bisect_right().
insort: Insert While Maintaining Order
Once you find the position, you need to insert. insort does both in one call:
import bisect
scores = [100, 200, 300, 400]
bisect.insort(scores, 250)
print(scores) # [100, 200, 250, 300, 400]
# insort_left puts duplicates at the start
# insort_right (or insort) puts them at the endThis is what replaced my sorted() call. Instead of re-sorting the whole list every time, I just insert in the right spot.
import bisect
class Leaderboard:
def __init__(self):
self.scores: list[tuple[int, str]] = [] # (score, name)
def add(self, name: str, score: int) -> None:
# Negative score for descending order
bisect.insort(self.scores, (-score, name))
def top_10(self) -> list[tuple[str, int]]:
return [(name, -score) for score, name in self.scores[:10]]
board = Leaderboard()
board.add("Alice", 1500)
board.add("Bob", 2000)
board.add("Charlie", 1800)
print(board.top_10())
# [('Bob', 2000), ('Charlie', 1800), ('Alice', 1500)]Custom Key Functions (Python 3.10+)
This was a game-changer. Before Python 3.10, working with custom sort orders was awkward. Now there's a key parameter:
import bisect
# List of (name, timestamp) tuples sorted by timestamp
events = [
("login", 1000),
("purchase", 2000),
("logout", 3000),
]
# Find where a timestamp of 1500 would go
pos = bisect.bisect(events, 1500, key=lambda x: x[1])
print(pos) # 1 (between login and purchase)
# Insert a new event at the right position
new_event = ("click", 2500)
bisect.insort(events, new_event, key=lambda x: x[1])
print(events)
# [('login', 1000), ('purchase', 2000), ('click', 2500), ('logout', 3000)]Before 3.10, you'd have to maintain a separate list of keys or wrap everything in comparison objects. Not fun.
The Grade Lookup Pattern
This example from the Python docs clicked everything into place for me:
import bisect
def letter_grade(score: float) -> str:
"""Convert numeric score to letter grade."""
breakpoints = [60, 70, 80, 90] # Thresholds
grades = "FDCBA" # Corresponding grades
i = bisect.bisect(breakpoints, score)
return grades[i]
# Test it out
print(letter_grade(55)) # F (below 60)
print(letter_grade(65)) # D (60-69)
print(letter_grade(75)) # C (70-79)
print(letter_grade(85)) # B (80-89)
print(letter_grade(95)) # A (90+)It's elegant because bisect returns the index where the score would be inserted, which maps directly to the grade index.
I've used this pattern for:
- Pricing tiers (amount → discount percentage)
- Risk levels (score → risk category)
- Performance buckets (latency → "fast"/"medium"/"slow")
Time-Based Data Patterns
This is where bisect really shines in real applications. When you have timestamped data and need to query ranges:
import bisect
from datetime import datetime
class TimeSeries:
def __init__(self):
self.timestamps: list[datetime] = []
self.values: list[float] = []
def add(self, timestamp: datetime, value: float) -> None:
pos = bisect.bisect_left(self.timestamps, timestamp)
self.timestamps.insert(pos, timestamp)
self.values.insert(pos, value)
def get_range(
self,
start: datetime,
end: datetime
) -> list[tuple[datetime, float]]:
"""Get all readings between start and end."""
start_idx = bisect.bisect_left(self.timestamps, start)
end_idx = bisect.bisect_right(self.timestamps, end)
return list(zip(
self.timestamps[start_idx:end_idx],
self.values[start_idx:end_idx]
))
def get_latest_before(self, timestamp: datetime) -> tuple[datetime, float] | None:
"""Get the most recent reading before a timestamp."""
idx = bisect.bisect_left(self.timestamps, timestamp)
if idx > 0:
return (self.timestamps[idx - 1], self.values[idx - 1])
return None
# Usage
ts = TimeSeries()
ts.add(datetime(2024, 1, 1, 10, 0), 100.0)
ts.add(datetime(2024, 1, 1, 11, 0), 105.0)
ts.add(datetime(2024, 1, 1, 12, 0), 103.0)
readings = ts.get_range(
datetime(2024, 1, 1, 10, 30),
datetime(2024, 1, 1, 12, 0)
)
print(readings) # Data from 11:00 and 12:00Measuring the Performance Difference
I wanted to see the actual numbers, so I benchmarked it:
import bisect
import random
import time
def benchmark(n: int) -> dict[str, float]:
"""Compare linear vs binary search for n items."""
data = sorted(random.sample(range(n * 10), n))
target = data[n // 2] # Search for middle element
# Linear search
start = time.perf_counter()
for _ in range(1000):
for i, val in enumerate(data):
if val == target:
break
linear_time = time.perf_counter() - start
# Binary search with bisect
start = time.perf_counter()
for _ in range(1000):
idx = bisect.bisect_left(data, target)
found = idx < len(data) and data[idx] == target
bisect_time = time.perf_counter() - start
return {
"n": n,
"linear_ms": linear_time,
"bisect_ms": bisect_time,
"speedup": linear_time / bisect_time
}
for n in [100, 1_000, 10_000, 100_000]:
result = benchmark(n)
print(f"n={n:,}: linear={result['linear_ms']:.3f}s, "
f"bisect={result['bisect_ms']:.3f}s, "
f"speedup={result['speedup']:.1f}x")On my machine:
| n | Linear | Bisect | Speedup |
|---|---|---|---|
| 100 | 0.003s | 0.001s | 3x |
| 1,000 | 0.025s | 0.001s | 25x |
| 10,000 | 0.240s | 0.001s | 240x |
| 100,000 | 2.400s | 0.001s | 2,400x |
The bisect time barely changes as n grows. That's the logarithmic advantage.
Quick Reference
Here's my cheat sheet:
import bisect
# Find insertion point (after duplicates)
bisect.bisect(arr, x) # Same as bisect_right
bisect.bisect_right(arr, x)
# Find insertion point (before duplicates)
bisect.bisect_left(arr, x)
# Insert while maintaining sorted order
bisect.insort(arr, x) # Same as insort_right
bisect.insort_right(arr, x) # After duplicates
bisect.insort_left(arr, x) # Before duplicates
# With custom key (Python 3.10+)
bisect.bisect(arr, x, key=lambda item: item.value)
bisect.insort(arr, x, key=lambda item: item.value)When to Reach for bisect
Use bisect when:
- You have sorted data and need fast lookups
- You're maintaining a sorted list with frequent insertions
- You need range queries on ordered data
- You're mapping values to categories (like grades)
Don't use bisect when:
- Your data isn't sorted (sort it first, or use a different structure)
- You need O(log n) inserts too (use
sortedcontainers.SortedListinstead) - You just need to check membership (use a
set)
What I Learned
The jump from O(n) to O(log n) isn't just academic. It's the difference between code that works in development and code that works in production. My leaderboard went from choking at 10K users to handling 100K without breaking a sweat.
The bisect module is one of those stdlib gems that doesn't get enough attention. It's small, focused, and solves a real problem. That's the kind of tool I want to reach for more often.