Both create structured data classes. Here's when to use each.

Quick Comparison

Feature        dataclasses            Pydantic
Built-in       ✓                      ✗ (install required)
Validation     ✗                      ✓
JSON parsing   Manual                 Built-in
Performance    Faster                 Slower
Complexity     Simple                 Feature-rich

Dataclasses

Built into Python 3.7+. Simple and fast. (The int | None syntax below needs Python 3.10+; use Optional[int] on older versions.)

from dataclasses import dataclass, field
 
@dataclass
class User:
    name: str
    email: str
    age: int | None = None
    tags: list[str] = field(default_factory=list)
 
# Create instances
user = User(name="Owen", email="owen@example.com")
 
# Access attributes
print(user.name)  # "Owen"
 
# Auto-generated methods
print(user)  # User(name='Owen', email='owen@example.com', age=None, tags=[])
user == User(name="Owen", email="owen@example.com")  # True

No validation

# Dataclasses don't validate
user = User(name=123, email=None, age="not a number")
# No error! Types are just hints
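If you want lightweight checks without pulling in Pydantic, one common pattern is to validate in __post_init__, which the generated __init__ calls automatically. A minimal sketch (the specific checks are illustrative):

```python
from dataclasses import dataclass

@dataclass
class ValidatedUser:
    name: str
    email: str

    def __post_init__(self):
        # Runs right after the generated __init__
        if not isinstance(self.name, str):
            raise TypeError("name must be a str")
        if "@" not in self.email:
            raise ValueError("email must contain '@'")
```

This only covers the checks you write by hand; it won't coerce types or validate nested structures.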

JSON serialization

from dataclasses import asdict
import json
 
user_dict = asdict(user)
json_str = json.dumps(user_dict)
 
# From JSON
data = json.loads(json_str)
user = User(**data)
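One caveat with this round trip: asdict recurses into nested dataclasses, but ** unpacking does not rebuild them, so nested fields come back as plain dicts (Address and Profile here are illustrative, not from the example above):

```python
from dataclasses import asdict, dataclass
import json

@dataclass
class Address:
    city: str

@dataclass
class Profile:
    name: str
    address: Address

p = Profile(name="Owen", address=Address(city="Oslo"))
data = json.loads(json.dumps(asdict(p)))  # asdict flattens Address to a dict

restored = Profile(**data)
print(type(restored.address))  # <class 'dict'>, not Address
```

For nested structures you have to reconstruct each level yourself, which is exactly the kind of boilerplate Pydantic removes.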

Pydantic

External library. Validation and parsing built-in.

# EmailStr requires the optional email-validator package: pip install "pydantic[email]"
from pydantic import BaseModel, EmailStr, ValidationError
 
class User(BaseModel):
    name: str
    email: EmailStr
    age: int | None = None
    tags: list[str] = []
 
# Create with validation
user = User(name="Owen", email="owen@example.com")
 
# Validation errors
try:
    user = User(name=123, email="not-an-email")
except ValidationError as e:
    print(e)  # Detailed error messages

Type coercion

# Pydantic converts types when possible
user = User(name="Owen", email="owen@example.com", age="25")
print(user.age)  # 25 (int, not str)
print(type(user.age))  # <class 'int'>
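If you don't want that coercion, Pydantic v2 also has a strict mode, where a string is rejected for an int field instead of converted. A sketch using ConfigDict(strict=True):

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class StrictUser(BaseModel):
    model_config = ConfigDict(strict=True)
    name: str
    age: int

StrictUser(name="Owen", age=25)  # fine

try:
    StrictUser(name="Owen", age="25")  # str is not coerced in strict mode
except ValidationError as e:
    print(e)
```

Strict mode can also be applied per-field via Field(strict=True) if you only want it in a few places.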

JSON built-in

# To JSON
json_str = user.model_dump_json()
 
# From JSON
user = User.model_validate_json('{"name": "Owen", "email": "owen@example.com"}')
 
# From dict
user = User.model_validate({"name": "Owen", "email": "owen@example.com"})
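Nested structures are validated recursively too — nested dicts become model instances, not plain dicts (Address and Profile are illustrative):

```python
from pydantic import BaseModel

class Address(BaseModel):
    city: str

class Profile(BaseModel):
    name: str
    address: Address

# The inner dict is validated and converted to an Address instance
p = Profile.model_validate({"name": "Owen", "address": {"city": "Oslo"}})
print(type(p.address))  # Address, not dict
```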

Custom validation

from pydantic import field_validator
 
class User(BaseModel):
    name: str
    age: int
    
    @field_validator("age")
    @classmethod
    def age_must_be_positive(cls, v):
        if v < 0:
            raise ValueError("Age must be positive")
        return v
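Restating that model with the imports it needs, a quick check that the validator actually fires:

```python
from pydantic import BaseModel, ValidationError, field_validator

class User(BaseModel):
    name: str
    age: int

    @field_validator("age")
    @classmethod
    def age_must_be_positive(cls, v):
        if v < 0:
            raise ValueError("Age must be positive")
        return v

try:
    User(name="Owen", age=-5)
except ValidationError as e:
    print(e)  # mentions "Age must be positive"
```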

When to Use Each

Use dataclasses when:

  • Internal data structures
  • No external input
  • Performance matters
  • Want minimal dependencies
  • Simple data containers
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
 
@dataclass
class Config:
    debug: bool
    log_level: str

Use Pydantic when:

  • Parsing API requests/responses
  • Reading config files
  • Validating user input
  • Need type coercion
  • Complex nested structures
from typing import Any, Literal
from pydantic import BaseModel

class APIRequest(BaseModel):
    user_id: int
    action: Literal["create", "update", "delete"]
    data: dict[str, Any]
 
class Settings(BaseModel):
    database_url: str
    api_key: str
    debug: bool = False

Performance

Dataclasses are faster for creation and access:

# Rough figures on CPython; exact numbers vary by machine and version
# Dataclass: ~0.5 μs per instance
# Pydantic: ~5 μs per instance (with validation)

For hot paths with trusted data, dataclasses win. For untrusted input, Pydantic's validation is worth the cost.
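To get numbers for your own setup, a timeit micro-benchmark like this works (a sketch, not a rigorous benchmark; DCUser and PUser are illustrative):

```python
import timeit
from dataclasses import dataclass
from pydantic import BaseModel

@dataclass
class DCUser:
    name: str
    age: int

class PUser(BaseModel):
    name: str
    age: int

# Time instance creation for each, averaged over many iterations
n = 100_000
dc = timeit.timeit(lambda: DCUser(name="Owen", age=30), number=n)
pyd = timeit.timeit(lambda: PUser(name="Owen", age=30), number=n)
print(f"dataclass: {dc / n * 1e6:.2f} μs/instance")
print(f"pydantic:  {pyd / n * 1e6:.2f} μs/instance")
```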

Migration

From dataclass to Pydantic:

# Before
@dataclass
class User:
    name: str
    email: str
 
# After
class User(BaseModel):
    name: str
    email: str

Usually just change the decorator/base class. Add validators as needed.

My Rules

  1. Default to dataclasses for internal data
  2. Use Pydantic at boundaries (APIs, configs, files)
  3. Don't over-validate — trust internal data
  4. Pick one per layer — don't mix unnecessarily

Both are good tools. Use the right one for the job.
