Dataclasses remove the boilerplate from Python classes. Here's the complete guide.

Basic Usage

from dataclasses import dataclass
 
@dataclass
class User:
    name: str
    email: str
    age: int
 
# Automatically generates __init__, __repr__, __eq__
user = User("Alice", "alice@example.com", 30)
print(user)  # User(name='Alice', email='alice@example.com', age=30)

No more writing __init__ methods that just assign arguments to self.
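To make the savings concrete, here is a rough sketch of what the decorator spares you from writing. PlainUser is a hypothetical hand-written class used only for comparison; the methods dataclass actually generates differ in detail.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    email: str
    age: int


class PlainUser:
    """Hand-written approximation of the class above (illustrative only)."""

    def __init__(self, name: str, email: str, age: int):
        self.name = name
        self.email = email
        self.age = age

    def __repr__(self):
        return f"PlainUser(name={self.name!r}, email={self.email!r}, age={self.age!r})"

    def __eq__(self, other):
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.name, self.email, self.age) == (other.name, other.email, other.age)


a = User("Alice", "alice@example.com", 30)
b = User("Alice", "alice@example.com", 30)

same_value = a == b   # True: the generated __eq__ compares field by field
same_object = a is b  # False: they are still two separate objects
```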

Default Values

@dataclass
class Config:
    host: str = "localhost"
    port: int = 8080
    debug: bool = False
 
config = Config()  # Uses all defaults
config = Config(port=3000)  # Override specific fields

Important: Fields with defaults must come after fields without defaults; otherwise the class definition itself raises a TypeError.
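A small sketch of what breaking the ordering rule looks like. BadConfig is a hypothetical class that exists only to trigger the error, which is raised when the class is defined, not when it is instantiated.

```python
from dataclasses import dataclass

try:
    @dataclass
    class BadConfig:  # hypothetical, intentionally wrong
        host: str = "localhost"
        port: int  # no default, but it follows a field that has one
except TypeError as exc:
    error_message = str(exc)  # the message names the offending field
else:
    error_message = None

print(error_message)
```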

Field Options

Use field() for more control:

from dataclasses import dataclass, field
 
@dataclass
class Article:
    title: str
    content: str
    tags: list = field(default_factory=list)  # Mutable default
    views: int = field(default=0, repr=False)  # Hide from repr
    id: str = field(init=False)  # Exclude from __init__
 
    def __post_init__(self):
        self.id = self.title.lower().replace(" ", "-")

Key field() parameters:

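  • default_factory: Zero-argument callable that builds the default for each new instance; use it for mutable defaults (lists, dicts)
  • repr: Include in __repr__ output (default True)
  • init: Include in __init__ parameters (default True)
  • compare: Include in equality and ordering comparisons (default True)
  • hash: Include in __hash__ (default None, which follows compare)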
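The parameters listed above can be combined on one class. The following is a minimal sketch; CachedPage and its fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CachedPage:  # hypothetical example class
    url: str
    body: str = field(default="", repr=False)           # left out of __repr__
    fetch_count: int = field(default=0, compare=False)  # ignored by __eq__
    headers: dict = field(default_factory=dict)         # fresh dict per instance


first = CachedPage("https://example.com", body="<html>", fetch_count=1)
second = CachedPage("https://example.com", body="<html>", fetch_count=7)

equal_despite_counts = first == second  # True: fetch_count is not compared
shown = repr(first)                     # body does not appear in this string

first.headers["etag"] = "abc"
independent = second.headers == {}      # True: the two dicts are not shared
```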

Immutable Dataclasses

Use frozen=True for immutable instances:

@dataclass(frozen=True)
class Point:
    x: float
    y: float
 
p = Point(1.0, 2.0)
p.x = 3.0  # Raises FrozenInstanceError

With the default eq=True, a frozen dataclass also gets a generated __hash__, so its instances can be used as dict keys or set members as long as every field is itself hashable.
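A short sketch of the hashing behavior, plus one way to get a modified copy of a frozen instance using dataclasses.replace from the standard library:

```python
from dataclasses import dataclass, FrozenInstanceError, replace


@dataclass(frozen=True)
class Point:
    x: float
    y: float


origin = Point(0.0, 0.0)
labels = {origin: "origin"}                  # usable as a dict key
unique = {Point(1.0, 2.0), Point(1.0, 2.0)}  # equal points collapse to one

try:
    origin.x = 3.0
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False

moved = replace(origin, x=3.0)  # builds a new Point; origin is unchanged
```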

Inheritance

Dataclasses support inheritance:

@dataclass
class Animal:
    name: str
    age: int
 
@dataclass
class Dog(Animal):
    breed: str
    
dog = Dog("Max", 5, "Labrador")

Gotcha: If a parent field has a default value, every field the child adds must also have a default, because parent fields come first in the generated __init__:

@dataclass
class Base:
    x: int = 0
 
@dataclass
class Child(Base):
    # y: int    # TypeError: a field without a default cannot follow x
    y: int = 0  # This works

Post-Init Processing

Use __post_init__ for validation or computed fields:

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)
 
    def __post_init__(self):
        if self.width <= 0 or self.height <= 0:
            raise ValueError("Dimensions must be positive")
        self.area = self.width * self.height
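A usage sketch for the Rectangle class above. It also shows a limitation worth knowing: __post_init__ runs once, at construction, so a field computed there is not updated when its inputs change later.

```python
from dataclasses import dataclass, field


@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)

    def __post_init__(self):
        if self.width <= 0 or self.height <= 0:
            raise ValueError("Dimensions must be positive")
        self.area = self.width * self.height


rect = Rectangle(3.0, 4.0)
initial_area = rect.area  # 12.0, computed in __post_init__

try:
    Rectangle(-1.0, 4.0)
except ValueError as exc:
    rejected = str(exc)   # validation ran during construction

rect.width = 10.0
stale_area = rect.area    # still 12.0: __post_init__ does not run again
```

If the value must always stay in sync, a property may be a better fit than a stored field.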

Converting to Dict/Tuple

from dataclasses import asdict, astuple
 
@dataclass
class Person:
    name: str
    age: int
 
p = Person("Bob", 25)
asdict(p)   # {'name': 'Bob', 'age': 25}
astuple(p)  # ('Bob', 25)
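Both helpers recurse into nested dataclasses, which makes asdict a convenient first step toward JSON. In this sketch, Address is a hypothetical nested class added for illustration.

```python
import json
from dataclasses import dataclass, asdict, astuple


@dataclass
class Address:  # hypothetical nested dataclass
    city: str


@dataclass
class Person:
    name: str
    age: int
    address: Address


p = Person("Bob", 25, Address("Oslo"))

as_dict = asdict(p)    # {'name': 'Bob', 'age': 25, 'address': {'city': 'Oslo'}}
as_tuple = astuple(p)  # ('Bob', 25, ('Oslo',))

payload = json.dumps(as_dict)  # works because every value is JSON-serializable
```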

Slots for Memory Efficiency

Python 3.10+ supports slots=True:

@dataclass(slots=True)
class Compact:
    x: int
    y: int

The generated class uses __slots__ instead of a per-instance __dict__, which reduces memory usage when you create many instances.

Comparison with Alternatives

NamedTuple:

from typing import NamedTuple
 
class Point(NamedTuple):
    x: float
    y: float
 
# Immutable, iterable, less flexible
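To illustrate what "tuple behavior" means in practice, here is a small sketch using the same Point definition:

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float


p = Point(1.0, 2.0)

x, y = p                            # unpacks like any tuple
is_tuple = isinstance(p, tuple)     # True
equals_plain = p == (1.0, 2.0)      # True: compares equal to a plain tuple

try:
    p.x = 3.0
except AttributeError:
    immutable = True
else:
    immutable = False

shifted = p._replace(x=5.0)         # returns a new Point; p is unchanged
```

Note the equality with a plain tuple: a regular dataclass instance only compares equal to instances of the same class.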

attrs:

import attrs
 
@attrs.define
class User:
    name: str
    email: str = attrs.field(validator=attrs.validators.instance_of(str))
 
# More features, external dependency

When to use what:

  • dataclass: Standard library, most use cases
  • NamedTuple: Need tuple behavior, immutability
  • attrs: Need validators, converters, or advanced features

Real-World Example

from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional
 
@dataclass
class Task:
    title: str
    description: str
    priority: int = 1
    completed: bool = False
    created_at: datetime = field(default_factory=datetime.now)
    completed_at: Optional[datetime] = None
    tags: list[str] = field(default_factory=list)
 
    def complete(self):
        self.completed = True
        self.completed_at = datetime.now()
 
    def __post_init__(self):
        if not 1 <= self.priority <= 5:
            raise ValueError("Priority must be 1-5")
 
# Usage
task = Task("Write docs", "Document the API", priority=2, tags=["docs"])
task.complete()

Quick Reference

from dataclasses import dataclass, field, asdict, astuple
 
@dataclass(
    frozen=False,    # Immutable if True
    order=False,     # Generate __lt__, __le__, etc.
    slots=False,     # Use __slots__ (3.10+)
    kw_only=False,   # All fields keyword-only (3.10+)
)
class MyClass:
    required: str                              # Required field
    with_default: int = 10                     # Default value
    mutable_default: list = field(default_factory=list)
    excluded: str = field(init=False)          # Not in __init__
    hidden: str = field(default="", repr=False)  # Not in __repr__; needs a default here
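The order flag in the reference above is the one option not demonstrated earlier. A minimal sketch, assuming a hypothetical Version class: with order=True, instances compare like tuples of their fields in definition order, skipping any field marked compare=False.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Version:  # hypothetical example class
    major: int
    minor: int
    label: str = field(default="", compare=False)  # ignored when comparing


releases = [Version(1, 10), Version(1, 2, "lts"), Version(0, 9)]
ordered = sorted(releases)  # sorts by (major, minor)

for v in ordered:
    print(v)
```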

Dataclasses strike the right balance between simplicity and power. Use them whenever you need a class that's primarily about storing data.
