Dataclasses and Pydantic both create structured data classes. Here's when to use each.
Quick Comparison
| Feature | dataclasses | Pydantic |
|---|---|---|
| Built-in | ✓ | ✗ (install required) |
| Validation | ✗ | ✓ |
| JSON parsing | Manual | Built-in |
| Performance | Faster | Slower |
| Complexity | Simple | Feature-rich |
Dataclasses
Built into Python 3.7+. Simple and fast.
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    email: str
    age: int | None = None  # the `X | None` syntax needs Python 3.10+
    tags: list[str] = field(default_factory=list)

# Create instances
user = User(name="Owen", email="owen@example.com")

# Access attributes
print(user.name)  # "Owen"

# Auto-generated methods
print(user)  # User(name='Owen', email='owen@example.com', age=None, tags=[])
user == User(name="Owen", email="owen@example.com")  # True

No validation
# Dataclasses don't validate
user = User(name=123, email=None, age="not a number")
# No error! Types are just hints

JSON serialization
from dataclasses import asdict
import json

user_dict = asdict(user)
json_str = json.dumps(user_dict)

# From JSON
data = json.loads(json_str)
user = User(**data)

Pydantic
External library (`pip install pydantic`). Validation and parsing built-in. The examples below use the Pydantic v2 API.
# EmailStr additionally requires the email-validator package
from pydantic import BaseModel, EmailStr, ValidationError

class User(BaseModel):
    name: str
    email: EmailStr
    age: int | None = None
    tags: list[str] = []

# Create with validation
user = User(name="Owen", email="owen@example.com")

# Validation errors
try:
    user = User(name=123, email="not-an-email")
except ValidationError as e:
    print(e)  # Detailed error messages

Type coercion
# Pydantic converts types when possible
user = User(name="Owen", email="owen@example.com", age="25")
print(user.age)  # 25 (int, not str)
print(type(user.age))  # <class 'int'>

JSON built-in
# To JSON
json_str = user.model_dump_json()

# From JSON
user = User.model_validate_json('{"name": "Owen", "email": "owen@example.com"}')

# From dict
user = User.model_validate({"name": "Owen", "email": "owen@example.com"})

Custom validation
from pydantic import BaseModel, field_validator

class User(BaseModel):
    name: str
    age: int

    @field_validator("age")
    @classmethod
    def age_must_be_non_negative(cls, v):
        # Zero is allowed; only negative ages are rejected
        if v < 0:
            raise ValueError("Age must be non-negative")
        return v

When to Use Each
Use dataclasses when:
- The data structures are internal
- There is no external input
- Performance matters
- You want minimal dependencies
- You need simple data containers
@dataclass
class Point:
    x: float
    y: float

@dataclass
class Config:
    debug: bool
    log_level: str

Use Pydantic when:
- You parse API requests/responses
- You read config files
- You validate user input
- You need type coercion
- You have complex nested structures
from typing import Any, Literal

from pydantic import BaseModel

class APIRequest(BaseModel):
    user_id: int
    action: Literal["create", "update", "delete"]
    data: dict[str, Any]

class Settings(BaseModel):
    database_url: str
    api_key: str
    debug: bool = False

Performance
Dataclasses are faster for creation and access (approximate figures that vary by machine and library version):
# Dataclass: ~0.5 μs per instance
# Pydantic: ~5 μs per instance (with validation)

For hot paths with trusted data, dataclasses win. For untrusted input, Pydantic's validation is worth the cost.
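Numbers like these are easy to check on your own machine. Here is a minimal sketch using the standard-library `timeit`. The `PointDC` and `PointPD` classes and the `time_per_call` helper are illustrative names, not part of either library, and the Pydantic measurement only runs if Pydantic is installed.

```python
import timeit
from dataclasses import dataclass


@dataclass
class PointDC:
    x: float
    y: float


def time_per_call(fn, number=100_000):
    """Average seconds per call of `fn` over `number` runs."""
    return timeit.timeit(fn, number=number) / number


# Always measure the dataclass; it has no third-party dependency.
results = {"dataclass": time_per_call(lambda: PointDC(x=1.0, y=2.0))}

try:
    from pydantic import BaseModel
except ImportError:
    BaseModel = None  # Pydantic not installed; skip that measurement

if BaseModel is not None:
    class PointPD(BaseModel):
        x: float
        y: float

    results["pydantic"] = time_per_call(lambda: PointPD(x=1.0, y=2.0))

for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.2f} us per instance")
```

Treat the output as a rough guide only: the gap depends on the Python and Pydantic versions and on how many fields and validators the model has.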
Migration
From dataclass to Pydantic:
# Before
@dataclass
class User:
    name: str
    email: str

# After
class User(BaseModel):
    name: str
    email: str

Usually just change the decorator/base class. Add validators as needed.
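A few call sites may still need attention after the swap. The sketch below assumes Pydantic v2 and uses hypothetical `UserDC` and `UserModel` classes to show three differences I'd check for: positional arguments, default factories, and dict conversion.

```python
from dataclasses import asdict, dataclass, field

from pydantic import BaseModel, Field


@dataclass
class UserDC:
    name: str
    tags: list[str] = field(default_factory=list)


class UserModel(BaseModel):
    name: str
    # dataclasses.field(default_factory=...) becomes pydantic.Field(default_factory=...)
    tags: list[str] = Field(default_factory=list)


# 1. Dataclasses accept positional arguments; Pydantic models are keyword-only.
dc_user = UserDC("Owen")
model_user = UserModel(name="Owen")

positional_rejected = False
try:
    UserModel("Owen")
except TypeError:
    positional_rejected = True
print(positional_rejected)  # True

# 2. asdict(obj) becomes obj.model_dump(); both produce the same plain dict here.
print(asdict(dc_user) == model_user.model_dump())  # True
```

Searching the codebase for positional constructor calls and `asdict(...)` usages before migrating a class catches most of these.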
My Rules
- Default to dataclasses for internal data
- Use Pydantic at boundaries (APIs, configs, files)
- Don't over-validate — trust internal data
- Pick one per layer — don't mix unnecessarily
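One way these rules can fit together, as a sketch rather than a prescription: validate once with Pydantic at the boundary, then hand a plain dataclass to the rest of the code. `SignupRequest`, `Member`, and `parse_signup` are hypothetical names, and the code assumes Pydantic v2.

```python
from dataclasses import dataclass

from pydantic import BaseModel


class SignupRequest(BaseModel):
    """Boundary model: validates and coerces untrusted input."""
    name: str
    age: int


@dataclass(frozen=True)
class Member:
    """Internal type: plain, cheap to create, no re-validation."""
    name: str
    age: int


def parse_signup(raw: dict) -> Member:
    # Raises pydantic.ValidationError if `raw` is malformed.
    request = SignupRequest.model_validate(raw)
    return Member(name=request.name, age=request.age)


member = parse_signup({"name": "Owen", "age": "25"})
print(member)  # Member(name='Owen', age=25)
```

Everything downstream of `parse_signup` can trust `Member` without validating it again.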
Both are good tools. Use the right one for the job.