Why dataclasses fit domain objects
Domain objects are the small, focused types that represent concepts in your problem space: Money, EmailAddress, OrderLine, CustomerId, TimeRange, and so on. In Python, you can always model these with plain classes, but you quickly end up writing repetitive code: an __init__, comparisons, a readable __repr__, and sometimes defensive copying or immutability rules. dataclasses reduce that boilerplate while keeping you in “normal Python” (no framework required) and letting you express intent through field definitions and a small set of configuration flags.
For clean domain objects, the key idea is: use dataclasses to generate the mechanical parts (construction, representation, comparison), and write explicit domain logic for invariants and behaviors. A dataclass should not be “just a bag of fields”; it should be a type that protects its own validity and provides operations that make sense in the domain.
Choosing the right dataclass options
frozen: prefer immutability for value objects
Many domain types are value objects: their identity is defined by their attributes, and they are safe to share. For these, @dataclass(frozen=True) is a strong default. It prevents accidental mutation and makes instances hashable (when all fields are hashable), which is useful for using them as dictionary keys or set elements.
```python
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class Money:
    amount: int  # store minor units (e.g., cents)
    currency: str

    def __post_init__(self) -> None:
        if self.amount < 0:
            raise ValueError("Money.amount must be non-negative")
        if len(self.currency) != 3 or not self.currency.isalpha():
            raise ValueError("Money.currency must be a 3-letter code")
        object.__setattr__(self, "currency", self.currency.upper())
```

With frozen=True, you cannot assign to fields normally, but you can still normalize values in __post_init__ via object.__setattr__. This is a common pattern for enforcing canonical forms (uppercasing currency codes, trimming whitespace, normalizing phone numbers).
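As a quick illustration of what frozen=True provides, the sketch below uses a stripped-down Money (validation omitted for brevity; the amounts and labels are made up) to show that assignment is rejected and that equal values behave as a single dictionary key.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Money:
    amount: int  # minor units, e.g. cents
    currency: str


price = Money(500, "USD")

# Assigning to a field of a frozen instance raises FrozenInstanceError.
try:
    price.amount = 600
    mutated = True
except FrozenInstanceError:
    mutated = False

# Equal values hash equally, so a fresh but equal instance finds the same key.
totals = {Money(500, "USD"): "coffee"}
lookup = totals[Money(500, "USD")]
```

FrozenInstanceError is a subclass of AttributeError, so existing handlers for AttributeError will also catch it.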
slots: reduce accidental attributes and memory footprint
slots=True (available since Python 3.10) prevents adding new attributes at runtime and can reduce memory usage. It also helps keep domain objects “closed” to ad-hoc fields, which is often desirable for correctness.
eq and order: define comparisons intentionally
Dataclasses generate __eq__ by default based on fields. For value objects, that is usually correct. For entities (objects with identity that can change over time), field-based equality can be wrong: two different entities might temporarily share the same attributes. In those cases, either implement equality yourself or model identity explicitly and compare by identity.
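To make the mismatch concrete, here is a small sketch with a hypothetical two-field Customer. With the generated __eq__, the same customer record before and after a rename no longer compares equal, which is rarely what an entity should mean by equality.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    name: str


# Hypothetical data: one customer, loaded before and after a name change.
before = Customer("c-1", "Alex Kim")
after = Customer("c-1", "Alexandra Kim")

# Generated equality compares every field, so the rename breaks equality
# even though both objects describe the same customer.
same_by_fields = before == after
same_by_identity = before.customer_id == after.customer_id
```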
```python
from dataclasses import dataclass


@dataclass(slots=True)
class Customer:
    customer_id: str
    name: str
    email: str

    # Identity-based equality: only customer_id matters.
    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Customer):
            return NotImplemented
        return self.customer_id == other.customer_id
```

Alternatively, keep the dataclass-generated equality but ensure the identity field is the only field participating in comparison by marking other fields with compare=False.
```python
from dataclasses import dataclass, field


@dataclass(slots=True)
class Customer:
    customer_id: str
    name: str = field(compare=False)
    email: str = field(compare=False)
```

Step-by-step: building a clean value object with invariants
This section walks through a practical pattern you can reuse: define fields, enforce invariants in __post_init__, and add domain methods that keep the object valid.
Step 1: start with a minimal dataclass
```python
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class EmailAddress:
    value: str
```

At this point, EmailAddress("not an email") is allowed. The type exists, but it does not protect itself.
Step 2: enforce invariants and normalize
Domain objects should be hard to misuse. Add validation and canonicalization in __post_init__. Keep it lightweight; you can use a simple rule set that matches your domain needs.
```python
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class EmailAddress:
    value: str

    def __post_init__(self) -> None:
        v = self.value.strip()
        if "@" not in v:
            raise ValueError("Invalid email address")
        local, _, domain = v.partition("@")
        if not local or not domain or "." not in domain:
            raise ValueError("Invalid email address")
        # Frozen instances need object.__setattr__ to store the normalized value.
        object.__setattr__(self, "value", v.lower())
```

Now the object guarantees a basic invariant: it always contains a normalized email string.
Step 3: add domain-friendly behavior
Instead of scattering string operations across the codebase, add methods that express domain intent.
```python
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class EmailAddress:
    value: str

    def __post_init__(self) -> None:
        v = self.value.strip()
        if "@" not in v:
            raise ValueError("Invalid email address")
        local, _, domain = v.partition("@")
        if not local or not domain or "." not in domain:
            raise ValueError("Invalid email address")
        object.__setattr__(self, "value", v.lower())

    @property
    def domain(self) -> str:
        return self.value.split("@", 1)[1]

    def is_corporate(self, corporate_domain: str) -> bool:
        return self.domain == corporate_domain.lower()
```

The rest of the system can now depend on EmailAddress rather than raw strings, reducing repeated checks and edge cases.
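A short usage sketch may help show what callers gain. It uses a condensed copy of the class (the is_corporate method is left out) and made-up addresses: messy input is normalized, equal addresses compare equal regardless of case, and invalid input never becomes an EmailAddress.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmailAddress:
    value: str

    def __post_init__(self) -> None:
        v = self.value.strip()
        local, _, domain = v.partition("@")
        if not local or not domain or "." not in domain:
            raise ValueError("Invalid email address")
        object.__setattr__(self, "value", v.lower())

    @property
    def domain(self) -> str:
        return self.value.split("@", 1)[1]


addr = EmailAddress("  Ada@Example.COM ")
normalized = addr.value  # "ada@example.com"
domain = addr.domain     # "example.com"

# Normalization makes differently-cased inputs equal.
same = EmailAddress("ADA@example.com") == addr

# Invalid input is rejected at construction time.
try:
    EmailAddress("not an email")
    rejected = False
except ValueError:
    rejected = True
```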
Entities with dataclasses: identity, mutation, and invariants
Entities often change over time (status, address, assigned agent). For these, frozen=True may be inappropriate. You can still use dataclasses to generate the initializer and representation, while carefully controlling mutation through methods that enforce rules.
Encapsulate state changes through methods
A common anti-pattern is to expose mutable fields and let any caller assign to them. A cleaner approach is to keep fields “public” (Python does not enforce privacy) but treat them as internal and provide explicit methods for state transitions. You can also use properties if you want stricter control.
```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(slots=True)
class Order:
    order_id: str
    created_at: datetime
    status: str = field(default="draft")
    paid_at: datetime | None = field(default=None)

    def mark_paid(self, when: datetime) -> None:
        if self.status == "cancelled":
            raise ValueError("Cannot pay a cancelled order")
        if self.paid_at is not None:
            raise ValueError("Order is already paid")
        if when < self.created_at:
            raise ValueError("paid_at cannot be earlier than created_at")
        self.status = "paid"
        self.paid_at = when

    def cancel(self) -> None:
        if self.status == "paid":
            raise ValueError("Cannot cancel a paid order")
        self.status = "cancelled"
```

The dataclass provides a clean constructor and readable representation, while the methods ensure transitions remain valid.
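The property-based alternative mentioned earlier could look roughly like the following. Ticket is a hypothetical entity invented for this sketch: the state lives in an underscore-prefixed field, a read-only property exposes it, and the only supported way to change it is a method.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:  # hypothetical entity, used only for illustration
    ticket_id: str
    _status: str = field(default="open", repr=False)

    @property
    def status(self) -> str:
        return self._status

    def close(self) -> None:
        if self._status == "closed":
            raise ValueError("Ticket is already closed")
        self._status = "closed"


t = Ticket("t-1")
t.close()

# The property has no setter, so direct assignment fails.
try:
    t.status = "open"
    reopened = True
except AttributeError:
    reopened = False
```

This is still convention rather than enforcement: a caller can write to t._status directly, and the constructor accepts _status as a keyword argument.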
Use __post_init__ to validate initial state
Even for mutable entities, validate that the object starts in a consistent state.
```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(slots=True)
class Order:
    order_id: str
    created_at: datetime
    status: str = field(default="draft")
    paid_at: datetime | None = field(default=None)

    def __post_init__(self) -> None:
        allowed = {"draft", "paid", "cancelled"}
        if self.status not in allowed:
            raise ValueError(f"Invalid status: {self.status}")
        # status and paid_at must agree with each other.
        if self.status == "paid" and self.paid_at is None:
            raise ValueError("paid_at must be set when status is paid")
        if self.status != "paid" and self.paid_at is not None:
            raise ValueError("paid_at must be None unless status is paid")
```

Modeling collections safely: default factories and defensive design
Domain objects often contain collections: order lines, tags, applied discounts. In dataclasses, never use a mutable object as a default value directly. Use default_factory so each instance gets its own list/dict/set.
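A brief sketch of both halves of that rule, using hypothetical class names: dataclasses refuse a list default when the class is created, and default_factory gives every instance a fresh list.

```python
from dataclasses import dataclass, field

# A bare list default is rejected when the class is defined.
try:
    @dataclass
    class BadWishlist:
        item_skus: list[str] = []

    bad_default_allowed = True
except ValueError:
    bad_default_allowed = False


@dataclass
class Wishlist:
    item_skus: list[str] = field(default_factory=list)


a, b = Wishlist(), Wishlist()
a.item_skus.append("sku-1")

# b is unaffected because each instance received its own list.
lists_are_separate = b.item_skus == []
```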
```python
from dataclasses import dataclass, field


@dataclass(slots=True)
class Cart:
    cart_id: str
    item_skus: list[str] = field(default_factory=list)
```

Beyond avoiding shared defaults, consider how callers can mutate collections. If you want stronger protection, expose tuples instead of lists, or provide methods that manage updates.
```python
from dataclasses import dataclass, field


@dataclass(slots=True)
class Cart:
    cart_id: str
    _item_skus: list[str] = field(default_factory=list, repr=False)

    def add(self, sku: str) -> None:
        if not sku:
            raise ValueError("sku must be non-empty")
        self._item_skus.append(sku)

    def remove(self, sku: str) -> None:
        self._item_skus.remove(sku)

    @property
    def item_skus(self) -> tuple[str, ...]:
        # Return an immutable snapshot so callers cannot edit the internal list.
        return tuple(self._item_skus)
```

This pattern keeps mutation inside the class, where you can enforce invariants (no duplicates, maximum size, allowed SKUs) without relying on callers.
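As one example of such an invariant, the sketch below adds a no-duplicates rule to add. The rule itself is an assumption chosen for illustration, not a requirement stated above. It also shows that the tuple returned by item_skus is a snapshot that does not track later changes.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    cart_id: str
    _item_skus: list[str] = field(default_factory=list, repr=False)

    def add(self, sku: str) -> None:
        if not sku:
            raise ValueError("sku must be non-empty")
        if sku in self._item_skus:  # assumed rule: no duplicate SKUs
            raise ValueError(f"{sku} is already in the cart")
        self._item_skus.append(sku)

    @property
    def item_skus(self) -> tuple[str, ...]:
        return tuple(self._item_skus)


cart = Cart("cart-1")
cart.add("sku-1")
snapshot = cart.item_skus  # tuple copy taken now
cart.add("sku-2")

try:
    cart.add("sku-1")
    duplicate_accepted = True
except ValueError:
    duplicate_accepted = False
```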
Composing domain objects: nested dataclasses
Dataclasses compose naturally: a domain object can contain other domain objects. This encourages small, reusable types and keeps validation close to the data it protects.
```python
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class Address:
    line1: str
    city: str
    postal_code: str
    country_code: str

    def __post_init__(self) -> None:
        if not self.line1.strip():
            raise ValueError("Address.line1 is required")
        if not self.city.strip():
            raise ValueError("Address.city is required")
        if len(self.country_code) != 2:
            raise ValueError("country_code must be 2 letters")
        object.__setattr__(self, "country_code", self.country_code.upper())


# EmailAddress is the value object built in the step-by-step section above.
@dataclass(slots=True)
class Customer:
    customer_id: str
    name: str
    email: EmailAddress
    shipping_address: Address
```

Notice how Customer does not need to re-validate email or address rules; it relies on those types to be correct by construction.
Controlling what appears in repr and comparisons
Domain objects often contain sensitive or noisy fields (password hashes, tokens, large blobs). Dataclasses let you exclude fields from repr and from comparisons using field(repr=False) and field(compare=False).
```python
from dataclasses import dataclass, field


@dataclass(slots=True)
class ApiCredentials:
    client_id: str
    client_secret: str = field(repr=False)

    def __post_init__(self) -> None:
        if not self.client_id:
            raise ValueError("client_id is required")
        if len(self.client_secret) < 16:
            raise ValueError("client_secret is too short")
```

This keeps logs and debugging output safer by default.
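To see the effect, a minimal sketch with validation omitted and made-up credential values:

```python
from dataclasses import dataclass, field


@dataclass
class ApiCredentials:
    client_id: str
    client_secret: str = field(repr=False)


creds = ApiCredentials("svc-billing", "s3cr3t-value-0123456789")

# The generated repr lists client_id only; the secret is left out.
text = repr(creds)
```

Note that repr=False only affects the generated __repr__. The secret remains a normal attribute, so it can still leak through other paths such as dataclasses.asdict or manual string formatting.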
Creating derived values: computed fields and caching
Sometimes a domain object needs a derived value (for example, a normalized key, a display label, or a precomputed total). With dataclasses, you can compute it in __post_init__ and store it in a field marked init=False. This is useful when the derived value is expensive or you want it to participate in equality/hashing in a controlled way.
```python
from dataclasses import dataclass, field


@dataclass(frozen=True, slots=True)
class ProductCode:
    raw: str
    normalized: str = field(init=False)

    def __post_init__(self) -> None:
        r = self.raw.strip()
        if not r:
            raise ValueError("ProductCode.raw is required")
        n = r.upper().replace("-", "")
        object.__setattr__(self, "raw", r)
        object.__setattr__(self, "normalized", n)
```

Because the object is frozen, the computed field is stable. You can choose whether equality should use raw, normalized, or both by setting compare flags appropriately.
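One possible choice, sketched here as an assumption rather than a recommendation, is to exclude raw from comparison so that equality and hashing depend on normalized alone. Two differently formatted inputs then count as the same product code.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProductCode:
    raw: str = field(compare=False)      # kept for display; ignored by == and hash()
    normalized: str = field(init=False)  # drives == and hash()

    def __post_init__(self) -> None:
        r = self.raw.strip()
        if not r:
            raise ValueError("ProductCode.raw is required")
        object.__setattr__(self, "raw", r)
        object.__setattr__(self, "normalized", r.upper().replace("-", ""))


a = ProductCode("ab-123")
b = ProductCode(" AB123 ")

codes_equal = a == b     # both normalize to "AB123"
unique = len({a, b})     # equal objects collapse to one set element
```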
Replacing values in immutable objects: dataclasses.replace
When using frozen dataclasses, you cannot mutate fields. Instead, create a new instance with a small change using dataclasses.replace. This is especially useful for value objects that evolve through transformations (apply discount, change time zone, adjust rounding) while keeping the original instance intact.
```python
from dataclasses import dataclass, replace


@dataclass(frozen=True, slots=True)
class Percentage:
    value: int  # 0..100

    def __post_init__(self) -> None:
        if not (0 <= self.value <= 100):
            raise ValueError("Percentage must be between 0 and 100")

    def clamp(self, min_value: int, max_value: int) -> "Percentage":
        v = min(max(self.value, min_value), max_value)
        return replace(self, value=v)
```

This keeps transformations explicit and testable: each method returns a new valid instance.
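The reason the result is always valid is that replace builds the new object through the normal constructor, so __post_init__ runs again. A condensed sketch:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Percentage:
    value: int  # 0..100

    def __post_init__(self) -> None:
        if not (0 <= self.value <= 100):
            raise ValueError("Percentage must be between 0 and 100")


half = Percentage(50)
quarter = replace(half, value=25)  # new instance; half is unchanged

# Validation runs again, so replace cannot produce an invalid value.
try:
    replace(half, value=150)
    out_of_range_allowed = True
except ValueError:
    out_of_range_allowed = False
```

One caveat worth knowing: replace raises ValueError if you pass a field declared with init=False, so computed fields cannot be overridden this way.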
Interoperability: converting to dictionaries without leaking internals
Domain objects often need to cross boundaries: persistence, messaging, or API responses. Dataclasses provide dataclasses.asdict, which recursively converts nested dataclasses and collections. That is convenient, but it can be too permissive: it includes every field, even internal ones you intended to hide, and it copies everything.
A practical approach for clean domain objects is to define explicit conversion methods that return exactly what you want to expose. This keeps boundary decisions out of the core dataclass mechanics.
```python
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class Address:
    line1: str
    city: str
    postal_code: str
    country_code: str

    def to_record(self) -> dict[str, str]:
        return {
            "line1": self.line1,
            "city": self.city,
            "postal_code": self.postal_code,
            "country_code": self.country_code,
        }
```

For entities with internal mutable collections or private fields, explicit conversion methods are even more valuable because they prevent accidental leakage of internal state.
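The difference is easy to see on a type with a private field. The sketch below reuses the Cart shape from earlier with a hypothetical to_record method: asdict exposes the underscore-prefixed field under its internal name, while the explicit method controls both the keys and the contents.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Cart:
    cart_id: str
    _item_skus: list[str] = field(default_factory=list, repr=False)

    def to_record(self) -> dict[str, object]:
        # Hypothetical boundary format: public key name, copied list.
        return {"cart_id": self.cart_id, "item_skus": list(self._item_skus)}


cart = Cart("cart-1", ["sku-1"])

raw = asdict(cart)          # includes the internal "_item_skus" key
record = cart.to_record()   # exposes only the chosen public keys
```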
Testing domain objects built with dataclasses
Dataclasses make tests simpler because construction and equality are straightforward. Focus tests on invariants and behavior: invalid inputs raise errors, normalization happens, and domain methods enforce rules.
- Test that invalid construction fails (e.g., empty email, negative money).
- Test canonicalization (currency uppercased, whitespace trimmed).
- Test that state transitions are guarded (cannot pay twice, cannot cancel after payment).
- Test that equality matches domain meaning (value objects compare by value; entities compare by identity).
```python
from datetime import datetime, timedelta, timezone

import pytest

# Money and Order are the classes defined earlier in this article.


def test_money_currency_is_normalized():
    m = Money(amount=100, currency="usd")
    assert m.currency == "USD"


def test_order_cannot_be_paid_twice():
    # Use one fixed, timezone-aware timestamp; datetime.utcnow() is
    # deprecated since Python 3.12.
    now = datetime.now(timezone.utc)
    o = Order(order_id="o1", created_at=now)
    o.mark_paid(now)
    with pytest.raises(ValueError):
        o.mark_paid(now + timedelta(seconds=1))
```

Practical checklist for clean dataclass-based domain objects
- Use frozen=True for value objects; prefer immutability when possible.
- Use slots=True to prevent accidental attributes and reduce memory overhead.
- Validate and normalize in __post_init__; keep objects valid by construction.
- For entities, define identity-based equality (custom __eq__ or compare=False on non-identity fields).
- Use default_factory for mutable defaults; consider exposing immutable views (tuples) of internal collections.
- Hide sensitive fields from repr with repr=False.
- Prefer explicit conversion methods (to_record, to_dict) over blanket asdict when crossing boundaries.
- Keep domain behavior in methods; avoid letting callers mutate fields directly in ways that bypass invariants.