Dataclasses remove the boilerplate from Python classes. Here's the complete guide.
Basic Usage
from dataclasses import dataclass

@dataclass
class User:
    name: str
    email: str
    age: int

# Automatically generates __init__, __repr__, __eq__
user = User("Alice", "alice@example.com", 30)
print(user)  # User(name='Alice', email='alice@example.com', age=30)

No more writing __init__ methods that just assign arguments to self.
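To make the generated methods a little more concrete, here is a minimal sketch of what __eq__ and __init__ give you. It reuses the User class from above; the variable names (a, b, c, same_value, same_object) are only illustrative.

```python
from dataclasses import dataclass

@dataclass
class User:
    name: str
    email: str
    age: int

a = User("Alice", "alice@example.com", 30)
b = User("Alice", "alice@example.com", 30)

# The generated __eq__ compares field values, not object identity
same_value = (a == b)
same_object = (a is b)

# The generated __init__ also accepts keyword arguments in any order
c = User(age=41, name="Carol", email="carol@example.com")
print(same_value, same_object, c.name)
```

Two separately created instances with the same field values compare equal even though they are distinct objects, which is usually what you want from a data container.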
Default Values
@dataclass
class Config:
    host: str = "localhost"
    port: int = 8080
    debug: bool = False

config = Config()           # Uses all defaults
config = Config(port=3000)  # Override specific fields

Important: Fields with defaults must come after fields without defaults.
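As a quick illustration of the ordering rule, the sketch below defines a deliberately broken class (the name Broken is hypothetical) and catches the error. The exact wording of the message varies between Python versions, so it is printed rather than shown.

```python
from dataclasses import dataclass

try:
    @dataclass
    class Broken:
        host: str = "localhost"
        port: int  # no default, but follows a field that has one
except TypeError as exc:
    # Raised when the class is created, before any instance exists
    error_message = str(exc)
    print(error_message)
else:
    error_message = None
```

Because the check happens at class-definition time, the mistake surfaces on import rather than at the first call.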
Field Options
Use field() for more control:
from dataclasses import dataclass, field

@dataclass
class Article:
    title: str
    content: str
    tags: list = field(default_factory=list)    # Mutable default
    views: int = field(default=0, repr=False)   # Hide from repr
    id: str = field(init=False)                 # Exclude from __init__

    def __post_init__(self):
        self.id = self.title.lower().replace(" ", "-")

Key field() parameters:
- default_factory: Callable for mutable defaults (lists, dicts)
- repr: Include in __repr__ output
- init: Include in __init__ parameters
- compare: Include in equality comparisons
- hash: Include in __hash__
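Two of these parameters are easier to see in action. The sketch below shows why default_factory exists (a bare list default is rejected) and what compare=False does. BadArticle is a hypothetical class used only to trigger the error, and this Article is a trimmed-down variant of the one above.

```python
from dataclasses import dataclass, field

# A bare mutable default is rejected when the class is created
try:
    @dataclass
    class BadArticle:
        tags: list = []
except ValueError:
    rejected = True
else:
    rejected = False

@dataclass
class Article:
    title: str
    tags: list = field(default_factory=list)      # fresh list per instance
    views: int = field(default=0, compare=False)  # ignored by __eq__

first = Article("Intro")
second = Article("Intro", views=99)

# views is excluded from comparison, so these two are equal
equal_despite_views = (first == second)

# Each instance got its own list from default_factory
first.tags.append("python")
second_tags_untouched = (second.tags == [])
print(rejected, equal_despite_views, second_tags_untouched)
```

Excluding a field such as a view counter from comparison is a judgment call. It fits when the field is incidental metadata rather than part of the object's identity.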
Immutable Dataclasses
Use frozen=True for immutable instances:
@dataclass(frozen=True)
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)
p.x = 3.0  # Raises FrozenInstanceError

Because eq=True is the default, frozen dataclasses also get a generated __hash__, making them usable as dict keys or set members (as long as their field values are themselves hashable).
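Here is a minimal sketch of both properties. It also uses dataclasses.replace, a standard-library helper not mentioned above, as one way to get a modified copy of a frozen instance. The variable names are illustrative.

```python
from dataclasses import dataclass, replace, FrozenInstanceError

@dataclass(frozen=True)
class Point:
    x: float
    y: float

origin = Point(0.0, 0.0)

# Hashable, so usable as a dict key; equal points hash alike
labels = {origin: "origin"}
label = labels[Point(0.0, 0.0)]

# Assignment is blocked
try:
    origin.x = 1.0
except FrozenInstanceError:
    blocked = True
else:
    blocked = False

# replace() builds a new instance instead of mutating the old one
moved = replace(origin, x=3.0)
print(label, blocked, moved.x, origin.x)
```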
Inheritance
Dataclasses support inheritance:
@dataclass
class Animal:
    name: str
    age: int

@dataclass
class Dog(Animal):
    breed: str

dog = Dog("Max", 5, "Labrador")

Gotcha: If the parent has fields with default values, the child's fields must also have defaults (unless they are keyword-only, available via kw_only=True in Python 3.10+):
@dataclass
class Base:
    x: int = 0

@dataclass
class Child(Base):
    # y: int      # Error! Must have a default since x has one
    y: int = 0    # This works

Post-Init Processing
Use __post_init__ for validation or computed fields:
@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)

    def __post_init__(self):
        if self.width <= 0 or self.height <= 0:
            raise ValueError("Dimensions must be positive")
        self.area = self.width * self.height

Converting to Dict/Tuple
from dataclasses import dataclass, asdict, astuple

@dataclass
class Person:
    name: str
    age: int

p = Person("Bob", 25)
asdict(p)   # {'name': 'Bob', 'age': 25}
astuple(p)  # ('Bob', 25)

Slots for Memory Efficiency
Python 3.10+ supports slots=True:
@dataclass(slots=True)
class Compact:
    x: int
    y: int

This uses __slots__ instead of a per-instance __dict__, reducing memory usage when you create many instances.
Comparison with Alternatives
NamedTuple:
from typing import NamedTuple

class Point(NamedTuple):
    x: float
    y: float

# Immutable, iterable, less flexible

attrs:
import attrs

@attrs.define
class User:
    name: str
    email: str = attrs.field(validator=attrs.validators.instance_of(str))

# More features, external dependency

When to use what:
- dataclass: Standard library, most use cases
- NamedTuple: Need tuple behavior, immutability
- attrs: Need validators, converters, or advanced features
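To illustrate what "tuple behavior" means in practice, here is a small standard-library-only sketch comparing the two. The class names PointTuple and PointData are hypothetical and chosen only to keep the two variants apart.

```python
from dataclasses import dataclass
from typing import NamedTuple

class PointTuple(NamedTuple):
    x: float
    y: float

@dataclass
class PointData:
    x: float
    y: float

as_tuple = PointTuple(1.0, 2.0)
as_data = PointData(1.0, 2.0)

# A NamedTuple unpacks and compares like a plain tuple
x, y = as_tuple
tuple_equals_plain = (as_tuple == (1.0, 2.0))

# A dataclass instance is not a tuple and never equals one
data_equals_plain = (as_data == (1.0, 2.0))

# A dataclass is mutable by default; a NamedTuple is not
as_data.x = 5.0
try:
    as_tuple.x = 5.0
except AttributeError:
    tuple_is_immutable = True
else:
    tuple_is_immutable = False
print(tuple_equals_plain, data_equals_plain, tuple_is_immutable)
```

If existing code already passes plain tuples around, a NamedTuple can slot in without changes. Otherwise the stricter equality of a dataclass tends to catch more mistakes.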
Real-World Example
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class Task:
    title: str
    description: str
    priority: int = 1
    completed: bool = False
    created_at: datetime = field(default_factory=datetime.now)
    completed_at: Optional[datetime] = None
    tags: list[str] = field(default_factory=list)

    def complete(self):
        self.completed = True
        self.completed_at = datetime.now()

    def __post_init__(self):
        if not 1 <= self.priority <= 5:
            raise ValueError("Priority must be 1-5")

# Usage
task = Task("Write docs", "Document the API", priority=2, tags=["docs"])
task.complete()

Quick Reference
from dataclasses import dataclass, field, asdict, astuple

@dataclass(
    frozen=False,   # Immutable if True
    order=False,    # Generate __lt__, __le__, etc.
    slots=False,    # Use __slots__ (3.10+)
    kw_only=False,  # All fields keyword-only (3.10+)
)
class MyClass:
    required: str                                    # Required field
    with_default: int = 10                           # Default value
    mutable_default: list = field(default_factory=list)
    excluded: str = field(init=False)                # Not in __init__
    hidden: str = field(default="", repr=False)      # Not in __repr__ (needs a default here, since it follows fields with defaults)

Dataclasses strike the right balance between simplicity and power. Use them whenever you need a class that's primarily about storing data.