Dataclasses for Clean Domain Objects

Chapter 3

Estimated reading time: 12 minutes

Why dataclasses fit domain objects

Domain objects are the small, focused types that represent concepts in your problem space: Money, EmailAddress, OrderLine, CustomerId, TimeRange, and so on. In Python, you can always model these with plain classes, but you quickly end up writing repetitive code: an __init__, comparisons, a readable __repr__, and sometimes defensive copying or immutability rules. The standard-library dataclasses module reduces that boilerplate while keeping you in “normal Python” (no framework required), and it lets you express intent through field definitions and a small set of configuration flags.

For clean domain objects, the key idea is: use dataclasses to generate the mechanical parts (construction, representation, comparison), and write explicit domain logic for invariants and behaviors. A dataclass should not be “just a bag of fields”; it should be a type that protects its own validity and provides operations that make sense in the domain.

Choosing the right dataclass options

`frozen`: prefer immutability for value objects

Many domain types are value objects: their identity is defined by their attributes, and they are safe to share. For these, @dataclass(frozen=True) is a strong default. It prevents accidental mutation and, together with the default eq=True, generates a __hash__ so that instances are hashable (provided every field is hashable). That makes them usable as dictionary keys or set elements.

from dataclasses import dataclass

@dataclass(frozen=True, slots=True)
class Money:
    amount: int  # store minor units (e.g., cents)
    currency: str

    def __post_init__(self) -> None:
        if self.amount < 0:
            raise ValueError("Money.amount must be non-negative")
        if len(self.currency) != 3 or not self.currency.isalpha():
            raise ValueError("Money.currency must be a 3-letter code")
        object.__setattr__(self, "currency", self.currency.upper())

With frozen=True, you cannot assign to fields normally, but you can still normalize values in __post_init__ via object.__setattr__. This is a common pattern for enforcing canonical forms (uppercasing currency codes, trimming whitespace, normalizing phone numbers).
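
The two effects of frozen=True described above can be seen in a few lines. This is a minimal sketch; the Point type is hypothetical and exists only for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Point:  # hypothetical value object, not part of the chapter's domain
    x: int
    y: int

p = Point(1, 2)

# Normal assignment to a field of a frozen instance raises FrozenInstanceError.
try:
    p.x = 10
    mutated = True
except FrozenInstanceError:
    mutated = False

# Equal values hash equally, so an equal instance finds the same dict entry.
labels = {Point(1, 2): "start"}
label = labels[Point(1, 2)]
```

FrozenInstanceError is a subclass of AttributeError, so existing handlers for AttributeError also catch it.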

`slots`: reduce accidental attributes and memory footprint

slots=True (available since Python 3.10) generates __slots__ for the class. That prevents adding undeclared attributes at runtime and can reduce per-instance memory usage. It also helps keep domain objects “closed” to ad-hoc fields, which is often desirable for correctness.

`eq` and `order`: define comparisons intentionally

Dataclasses generate __eq__ by default based on fields. For value objects, that is usually correct. For entities (objects with identity that can change over time), field-based equality can be wrong: two different entities might temporarily share the same attributes. In those cases, either implement equality yourself or model identity explicitly and compare by identity.

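from dataclasses import dataclass

@dataclass(slots=True)
class Customer:
    customer_id: str
    name: str
    email: str

    # An explicit __eq__ is kept: @dataclass does not overwrite it.
    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Customer):
            return NotImplemented
        return self.customer_id == other.customer_id

    # Defining __eq__ alone makes the class unhashable, so hash by identity too.
    # This assumes customer_id never changes after creation.
    def __hash__(self) -> int:
        return hash(self.customer_id)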

Alternatively, keep the dataclass-generated equality but ensure the identity field is the only field participating in comparison by marking other fields with compare=False.

from dataclasses import dataclass, field

@dataclass(slots=True)
class Customer:
    customer_id: str
    name: str = field(compare=False)
    email: str = field(compare=False)
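
For value objects with a natural ordering, order=True generates __lt__, __le__, __gt__, and __ge__, which compare instances as tuples of their fields in declaration order. The sketch below uses a hypothetical Version type; note that the field order decides the sort order.

```python
from dataclasses import dataclass

@dataclass(frozen=True, order=True)
class Version:  # hypothetical value object for illustration
    major: int  # compared first
    minor: int  # compared second

versions = [Version(1, 10), Version(1, 2), Version(0, 9)]

# sorted() relies on the generated __lt__, which compares (major, minor).
ordered = sorted(versions)
newest = max(versions)
```

Leave order at its default of False for entities: sorting customers by whatever field happens to come first is rarely what the domain means.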

Step-by-step: building a clean value object with invariants

This section walks through a practical pattern you can reuse: define fields, enforce invariants in __post_init__, and add domain methods that keep the object valid.

Step 1: start with a minimal dataclass

from dataclasses import dataclass

@dataclass(frozen=True, slots=True)
class EmailAddress:
    value: str

At this point, EmailAddress("not an email") is allowed. The type exists, but it does not protect itself.

Step 2: enforce invariants and normalize

Domain objects should be hard to misuse. Add validation and canonicalization in __post_init__. Keep it lightweight; you can use a simple rule set that matches your domain needs.

from dataclasses import dataclass

@dataclass(frozen=True, slots=True)
class EmailAddress:
    value: str

    def __post_init__(self) -> None:
        v = self.value.strip()
        if "@" not in v:
            raise ValueError("Invalid email address")
        local, _, domain = v.partition("@")
        if not local or not domain or "." not in domain:
            raise ValueError("Invalid email address")
        # Store the canonical form: trimmed and lowercased.
        object.__setattr__(self, "value", v.lower())

Now the object guarantees a basic invariant: it always contains a normalized email string.

Step 3: add domain-friendly behavior

Instead of scattering string operations across the codebase, add methods that express domain intent.

from dataclasses import dataclass

@dataclass(frozen=True, slots=True)
class EmailAddress:
    value: str

    def __post_init__(self) -> None:
        v = self.value.strip()
        if "@" not in v:
            raise ValueError("Invalid email address")
        local, _, domain = v.partition("@")
        if not local or not domain or "." not in domain:
            raise ValueError("Invalid email address")
        object.__setattr__(self, "value", v.lower())

    @property
    def domain(self) -> str:
        # Everything after the first "@", matching the partition() above.
        return self.value.split("@", 1)[1]

    def is_corporate(self, corporate_domain: str) -> bool:
        return self.domain == corporate_domain.lower()

The rest of the system can now depend on EmailAddress rather than raw strings, reducing repeated checks and edge cases.

Entities with dataclasses: identity, mutation, and invariants

Entities often change over time (status, address, assigned agent). For these, frozen=True may be inappropriate. You can still use dataclasses to generate the initializer and representation, while carefully controlling mutation through methods that enforce rules.

Encapsulate state changes through methods

A common anti-pattern is to expose mutable fields and let any caller assign to them. A cleaner approach is to keep fields “public” (Python does not enforce privacy) but treat them as internal and provide explicit methods for state transitions. You can also use properties if you want stricter control.

from dataclasses import dataclass, field
from datetime import datetime

@dataclass(slots=True)
class Order:
    order_id: str
    created_at: datetime
    status: str = field(default="draft")
    paid_at: datetime | None = field(default=None)

    def mark_paid(self, when: datetime) -> None:
        if self.status == "cancelled":
            raise ValueError("Cannot pay a cancelled order")
        if self.paid_at is not None:
            raise ValueError("Order is already paid")
        if when < self.created_at:
            raise ValueError("paid_at cannot be earlier than created_at")
        self.status = "paid"
        self.paid_at = when

    def cancel(self) -> None:
        if self.status == "paid":
            raise ValueError("Cannot cancel a paid order")
        self.status = "cancelled"

The dataclass provides a clean constructor and readable representation, while the methods ensure transitions remain valid.

Use `__post_init__` to validate initial state

Even for mutable entities, validate that the object starts in a consistent state.

from dataclasses import dataclass, field
from datetime import datetime

@dataclass(slots=True)
class Order:
    order_id: str
    created_at: datetime
    status: str = field(default="draft")
    paid_at: datetime | None = field(default=None)

    def __post_init__(self) -> None:
        allowed = {"draft", "paid", "cancelled"}
        if self.status not in allowed:
            raise ValueError(f"Invalid status: {self.status}")
        if self.status == "paid" and self.paid_at is None:
            raise ValueError("paid_at must be set when status is paid")
        if self.status != "paid" and self.paid_at is not None:
            raise ValueError("paid_at must be None unless status is paid")

Modeling collections safely: default factories and defensive design

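Domain objects often contain collections: order lines, tags, applied discounts. In dataclasses, never use a mutable object as a default value directly: a list, dict, or set default raises ValueError when the class is defined, and any other mutable default would be shared by every instance. Use default_factory so each instance gets its own list/dict/set.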

from dataclasses import dataclass, field

@dataclass(slots=True)
class Cart:
    cart_id: str
    item_skus: list[str] = field(default_factory=list)

Beyond avoiding shared defaults, consider how callers can mutate collections. If you want stronger protection, expose tuples instead of lists, or provide methods that manage updates.

from dataclasses import dataclass, field

@dataclass(slots=True)
class Cart:
    cart_id: str
    _item_skus: list[str] = field(default_factory=list, repr=False)

    def add(self, sku: str) -> None:
        if not sku:
            raise ValueError("sku must be non-empty")
        self._item_skus.append(sku)

    def remove(self, sku: str) -> None:
        # list.remove raises ValueError if the sku is not in the cart.
        self._item_skus.remove(sku)

    @property
    def item_skus(self) -> tuple[str, ...]:
        # Return an immutable snapshot so callers cannot edit the list.
        return tuple(self._item_skus)

This pattern keeps mutation inside the class, where you can enforce invariants (no duplicates, maximum size, allowed SKUs) without relying on callers.

Composing domain objects: nested dataclasses

Dataclasses compose naturally: a domain object can contain other domain objects. This encourages small, reusable types and keeps validation close to the data it protects.

from dataclasses import dataclass

@dataclass(frozen=True, slots=True)
class Address:
    line1: str
    city: str
    postal_code: str
    country_code: str

    def __post_init__(self) -> None:
        if not self.line1.strip():
            raise ValueError("Address.line1 is required")
        if not self.city.strip():
            raise ValueError("Address.city is required")
        if len(self.country_code) != 2:
            raise ValueError("country_code must be 2 letters")
        object.__setattr__(self, "country_code", self.country_code.upper())

# EmailAddress is the value object built in the step-by-step section.
@dataclass(slots=True)
class Customer:
    customer_id: str
    name: str
    email: EmailAddress
    shipping_address: Address

Notice how Customer does not need to re-validate email or address rules; it relies on those types to be correct by construction.

Controlling what appears in `repr` and comparisons

Domain objects often contain sensitive or noisy fields (password hashes, tokens, large blobs). Dataclasses let you exclude fields from repr and from comparisons using field(repr=False) and field(compare=False).

from dataclasses import dataclass, field

@dataclass(slots=True)
class ApiCredentials:
    client_id: str
    client_secret: str = field(repr=False)

    def __post_init__(self) -> None:
        if not self.client_id:
            raise ValueError("client_id is required")
        if len(self.client_secret) < 16:
            raise ValueError("client_secret is too short")

This keeps logs and debugging output safer by default.

Creating derived values: computed fields and caching

Sometimes a domain object needs a derived value (for example, a normalized key, a display label, or a precomputed total). With dataclasses, you can compute it in __post_init__ and store it in a field marked init=False. This is useful when the derived value is expensive or you want it to participate in equality/hashing in a controlled way.

from dataclasses import dataclass, field

@dataclass(frozen=True, slots=True)
class ProductCode:
    raw: str
    normalized: str = field(init=False)  # derived, not a constructor argument

    def __post_init__(self) -> None:
        r = self.raw.strip()
        if not r:
            raise ValueError("ProductCode.raw is required")
        n = r.upper().replace("-", "")
        object.__setattr__(self, "raw", r)
        object.__setattr__(self, "normalized", n)

Because the object is frozen, the computed field is stable. You can choose whether equality should use raw, normalized, or both by setting compare flags appropriately.

Replacing values in immutable objects: `dataclasses.replace`

When using frozen dataclasses, you cannot mutate fields. Instead, create a new instance with a small change using dataclasses.replace. This is especially useful for value objects that evolve through transformations (apply discount, change time zone, adjust rounding) while keeping the original instance intact.

from dataclasses import dataclass, replace

@dataclass(frozen=True, slots=True)
class Percentage:
    value: int  # 0..100

    def __post_init__(self) -> None:
        if not (0 <= self.value <= 100):
            raise ValueError("Percentage must be between 0 and 100")

    def clamp(self, min_value: int, max_value: int) -> "Percentage":
        v = min(max(self.value, min_value), max_value)
        # replace() builds a new instance, so __post_init__ validates v again.
        return replace(self, value=v)

This keeps transformations explicit and testable: each method returns a new valid instance.

Interoperability: converting to dictionaries without leaking internals

Domain objects often need to cross boundaries: persistence, messaging, or API responses. Dataclasses provide dataclasses.asdict, but it recursively converts nested dataclasses and collections, which can be convenient but also too permissive (it may include internal fields you intended to hide, and it copies everything).

A practical approach for clean domain objects is to define explicit conversion methods that return exactly what you want to expose. This keeps boundary decisions out of the core dataclass mechanics.

from dataclasses import dataclass

@dataclass(frozen=True, slots=True)
class Address:
    line1: str
    city: str
    postal_code: str
    country_code: str

    def to_record(self) -> dict[str, str]:
        # List the exposed keys explicitly instead of relying on asdict().
        return {
            "line1": self.line1,
            "city": self.city,
            "postal_code": self.postal_code,
            "country_code": self.country_code,
        }

For entities with internal mutable collections or private fields, explicit conversion methods are even more valuable because they prevent accidental leakage of internal state.
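
The leak is easy to reproduce. In this sketch the Account entity and its field names are hypothetical; the point is that asdict ignores both the underscore convention and repr=False.

```python
from dataclasses import asdict, dataclass, field

@dataclass
class Account:  # hypothetical entity for illustration
    account_id: str
    password_hash: str = field(repr=False)
    _audit_notes: list[str] = field(default_factory=list, repr=False)

    def to_record(self) -> dict[str, str]:
        # Expose only what the boundary is allowed to see.
        return {"account_id": self.account_id}

acct = Account("a1", "hash-value")

leaky = asdict(acct)      # every field, including the hidden ones
safe = acct.to_record()   # only the chosen keys
```

asdict also copies nested containers, so the list in the leaky dictionary is a new object rather than the entity's own list.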

Testing domain objects built with dataclasses

Dataclasses make tests simpler because construction and equality are straightforward. Focus tests on invariants and behavior: invalid inputs raise errors, normalization happens, and domain methods enforce rules.

Test that invalid construction fails (e.g., empty email, negative money).
Test canonicalization (currency uppercased, whitespace trimmed).
Test that state transitions are guarded (cannot pay twice, cannot cancel after payment).
Test that equality matches domain meaning (value objects compare by value; entities compare by identity).

import pytest
from datetime import datetime, timedelta, timezone

# Money and Order are the classes defined earlier in this chapter.

def test_money_currency_is_normalized():
    m = Money(amount=100, currency="usd")
    assert m.currency == "USD"

def test_order_cannot_be_paid_twice():
    # datetime.utcnow() is deprecated since Python 3.12; use an aware timestamp.
    created = datetime.now(timezone.utc)
    o = Order(order_id="o1", created_at=created)
    o.mark_paid(created)
    with pytest.raises(ValueError):
        o.mark_paid(created + timedelta(seconds=1))

Practical checklist for clean dataclass-based domain objects

Use frozen=True for value objects; prefer immutability when possible.
Use slots=True to prevent accidental attributes and reduce memory overhead.
Validate and normalize in __post_init__; keep objects valid by construction.
For entities, define identity-based equality (custom __eq__ or compare=False on non-identity fields).
Use default_factory for mutable defaults; consider exposing immutable views (tuples) of internal collections.
Hide sensitive fields from repr with repr=False.
Prefer explicit conversion methods (to_record, to_dict) over blanket asdict when crossing boundaries.
Keep domain behavior in methods; avoid letting callers mutate fields directly in ways that bypass invariants.

Clean domain objects should protect their own validity. Use dataclasses for construction and representation, then validate and normalize in __post_init__ and add methods that enforce domain rules and safe state transitions.

Python Data Modeling in Practice: Dataclasses, Pydantic, and Type Hints
