Enforcing Invariants with Post-Initialization and Validators

Chapter 4

Estimated reading time: 11 minutes

What “invariants” mean in code

An invariant is a rule that must always be true for an object to be considered valid. Unlike “nice-to-have” validation (e.g., warning the user), an invariant is a hard constraint: if it is violated, the object should not exist in that state. Enforcing invariants early—at object creation time—prevents invalid states from leaking into the rest of your program, where they become harder to debug and more expensive to fix.

Typical invariants include: numeric ranges (quantity > 0), relationships between fields (start_date < end_date), normalization rules (emails lowercased, whitespace trimmed), and structural constraints (a list must not be empty, IDs must match a pattern). In Python, you can enforce invariants using post-initialization hooks (common with dataclasses) and validators (common with Pydantic). The key idea is the same: validate and normalize once, right after inputs are received, so all later code can rely on the object being consistent.

Post-initialization enforcement with dataclasses

When you use a dataclass, the generated __init__ assigns fields directly. That means you need a place to check relationships between fields and to normalize values after assignment. Dataclasses provide __post_init__ for exactly this purpose: it runs immediately after the auto-generated __init__ completes.

Pattern: validate, normalize, then freeze assumptions

A practical pattern for __post_init__ is:

Validate each field’s basic constraints (range, emptiness, format).
Validate cross-field constraints (e.g., min <= max).
Normalize values (strip whitespace, canonicalize case, convert types if you accept multiple input forms).
Raise exceptions immediately on violations (commonly ValueError or TypeError).

Normalization is important: it reduces the number of representations of the “same” value. For example, if you store emails in lowercase and without surrounding whitespace, equality checks and dictionary keys behave predictably.
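As a small illustration of that point, the sketch below compares a dictionary keyed by raw email strings with one keyed by normalized strings. The helper name normalize_email and the sample addresses are hypothetical, chosen only for this example.

```python
def normalize_email(raw: str) -> str:
    """Return the canonical form: surrounding whitespace removed, lowercased."""
    return raw.strip().lower()


# Without normalization, two spellings of the "same" address are distinct keys.
raw_index = {"Ada@Example.com ": 1, "ada@example.com": 2}

# With normalization, both spellings collapse into a single key.
canonical_index: dict[str, int] = {}
for raw in ["Ada@Example.com ", "ada@example.com"]:
    key = normalize_email(raw)
    canonical_index[key] = canonical_index.get(key, 0) + 1
```

Here raw_index ends up with two entries while canonical_index has one, which is the predictable behavior the canonical form is meant to provide.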

Example: money-like value object with invariants

Suppose you want a money value whose amount must be non-negative and whose currency is a three-letter, ISO-style uppercase code. You also want to accept currency codes in mixed case or with stray whitespace, but store them normalized.

from dataclasses import dataclass

@dataclass(frozen=True)
class Money:
    amount: int  # store as minor units (e.g., cents)
    currency: str

    def __post_init__(self) -> None:
        if not isinstance(self.amount, int):
            raise TypeError("amount must be int minor units")
        if self.amount < 0:
            raise ValueError("amount must be non-negative")

        if not isinstance(self.currency, str):
            raise TypeError("currency must be str")
        cur = self.currency.strip().upper()
        # isalpha() alone also accepts non-ASCII letters, so require ASCII too
        if len(cur) != 3 or not (cur.isascii() and cur.isalpha()):
            raise ValueError("currency must be a 3-letter code")

        # Because frozen=True, we must use object.__setattr__ to normalize
        object.__setattr__(self, "currency", cur)

What happens, step by step:

The generated __init__ assigns amount and currency.
__post_init__ checks types and constraints.
currency is normalized to uppercase and stripped.
If any rule fails, object creation fails immediately.

Notice the use of frozen=True. Freezing is not required for invariants, but it helps maintain them by preventing later mutation. If you do freeze, normalization inside __post_init__ requires object.__setattr__.
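The following sketch shows what freezing buys you in practice, using a trimmed-down copy of Money (type checks omitted for brevity). It assumes the standard library behavior that assigning to a field of a frozen dataclass raises FrozenInstanceError, and that dataclasses.replace builds the copy through __init__, so __post_init__ runs again on the new values.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Money:
    amount: int
    currency: str

    def __post_init__(self) -> None:
        if self.amount < 0:
            raise ValueError("amount must be non-negative")
        # Frozen instances block normal assignment, so bypass it deliberately.
        object.__setattr__(self, "currency", self.currency.strip().upper())


price = Money(amount=1999, currency=" eur ")

try:
    price.amount = -5  # ordinary assignment is rejected on a frozen instance
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

# "Changing" a value means building a new object, which is validated again.
repriced = replace(price, amount=2499)
```

Note that frozen=True guards against accidental mutation only; code that calls object.__setattr__ on purpose can still bypass it.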

Example: cross-field invariant (date range)

Cross-field invariants are where __post_init__ shines: you can only validate them after all fields are assigned.

from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class DateRange:
    start: date
    end: date

    def __post_init__(self) -> None:
        if self.start > self.end:
            raise ValueError("start must be on or before end")

This ensures that any DateRange instance is always logically ordered. Downstream code can safely assume start <= end without re-checking.
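To make "downstream code can safely assume" concrete, here is a sketch of two helpers that rely on the ordering guarantee. The functions length_in_days and overlaps are hypothetical additions for illustration, not part of the original example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    start: date
    end: date

    def __post_init__(self) -> None:
        if self.start > self.end:
            raise ValueError("start must be on or before end")


def length_in_days(r: DateRange) -> int:
    # Never negative, because start <= end holds for every instance.
    return (r.end - r.start).days


def overlaps(a: DateRange, b: DateRange) -> bool:
    # This compact test is only correct when both ranges are ordered.
    return a.start <= b.end and b.start <= a.end


january = DateRange(date(2026, 1, 1), date(2026, 1, 31))
mid_month = DateRange(date(2026, 1, 15), date(2026, 2, 15))
february = DateRange(date(2026, 2, 1), date(2026, 2, 28))
```

Neither helper re-checks the ordering; that responsibility sits entirely in the constructor.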

Example: derived fields and invariants

Sometimes you want to compute a derived field (like a normalized slug) and ensure it stays consistent with the source field. A derived field can be set in __post_init__ and treated as part of the invariant set.

from dataclasses import dataclass, field
import re

_slug_re = re.compile(r"[^a-z0-9]+")

@dataclass(frozen=True)
class ProductName:
    raw: str
    slug: str = field(init=False)

    def __post_init__(self) -> None:
        if not isinstance(self.raw, str):
            raise TypeError("raw must be str")
        cleaned = self.raw.strip()
        if not cleaned:
            raise ValueError("raw must not be empty")

        slug = _slug_re.sub("-", cleaned.lower()).strip("-")
        if not slug:
            raise ValueError("raw must contain alphanumerics")

        object.__setattr__(self, "raw", cleaned)
        object.__setattr__(self, "slug", slug)

Here, the invariant is: raw is non-empty after trimming, and slug is a deterministic normalization of raw. Because slug is computed, you avoid bugs where callers forget to compute it or compute it differently.
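A short usage sketch, with a compact copy of ProductName (type and emptiness checks folded into the slug check for brevity). It illustrates two consequences of field(init=False): callers cannot supply their own slug, and two differently spaced inputs produce equal objects. The sample product name is made up.

```python
import re
from dataclasses import dataclass, field

_slug_re = re.compile(r"[^a-z0-9]+")


@dataclass(frozen=True)
class ProductName:
    raw: str
    slug: str = field(init=False)  # excluded from the generated __init__

    def __post_init__(self) -> None:
        cleaned = self.raw.strip()
        slug = _slug_re.sub("-", cleaned.lower()).strip("-")
        if not slug:
            raise ValueError("raw must contain alphanumerics")
        object.__setattr__(self, "raw", cleaned)
        object.__setattr__(self, "slug", slug)


name = ProductName("  Blue Widget (XL) ")

try:
    ProductName(raw="Blue Widget", slug="anything-else")
    slug_override_accepted = True
except TypeError:
    # __init__ has no slug parameter, so the call itself is invalid.
    slug_override_accepted = False
```

Because both raw and slug are stored in canonical form, ProductName("Blue Widget (XL)") compares equal to name.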

Practical checklist for dataclass invariants

Prefer raising ValueError for invalid values and TypeError for wrong types.
Normalize inputs in __post_init__ so the stored representation is canonical.
Use frozen=True when feasible to prevent invariant-breaking mutations.
Keep __post_init__ focused: validation and normalization only. If you need I/O or database checks, do them outside object construction.
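The last checklist item can be sketched as follows. The names Username, UsernameTakenError, and register_username are hypothetical, and the is_taken callable stands in for whatever database or service lookup a real application would use.

```python
from dataclasses import dataclass
from typing import Callable


class UsernameTakenError(Exception):
    """Raised by the registration step, not by the value object."""


@dataclass(frozen=True)
class Username:
    value: str

    def __post_init__(self) -> None:
        # Pure, deterministic checks only: no I/O happens here.
        cleaned = self.value.strip().lower()
        if not (3 <= len(cleaned) <= 20):
            raise ValueError("username must be 3 to 20 characters")
        object.__setattr__(self, "value", cleaned)


def register_username(raw: str, is_taken: Callable[[str], bool]) -> Username:
    username = Username(raw)  # construction can fail only for validity reasons
    if is_taken(username.value):  # the external lookup stays outside the object
        raise UsernameTakenError(f"username {username.value!r} is already taken")
    return username
```

With this split, constructing a Username in a test needs no database, and a lookup failure cannot be mistaken for an invalid value.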

Validators in Pydantic: enforcing invariants at the boundary

Pydantic models are commonly used where input data is messy: API payloads, environment variables, CLI arguments, and data read from files. Pydantic’s core strength is that it combines parsing (coercion) with validation. Instead of manually converting strings to numbers or dates and then validating, you declare rules and let the model enforce them.

Pydantic’s validator decorators differ by major version (v1 uses @validator and @root_validator; v2 uses @field_validator and @model_validator), but the invariant mindset is the same: define what “valid” means, and ensure invalid inputs fail fast with actionable error messages.

Field-level vs model-level validation

Two categories of invariants map naturally to two validator scopes:

Field-level: constraints on a single field (e.g., non-empty string, positive int). These can often be expressed with built-in constrained types or field metadata, plus optional custom validators.
Model-level: constraints involving multiple fields (e.g., start <= end, “if is_gift then gift_message must be present”). These require a validator that sees the whole model.

Example: step-by-step with Pydantic v2 validators

The following example shows a booking request with several invariants:

guest_email is normalized to lowercase and stripped.
nights must be positive.
check_in must be before check_out.

from datetime import date
from pydantic import BaseModel, field_validator, model_validator

class Booking(BaseModel):
    guest_email: str
    check_in: date
    check_out: date
    nights: int

    @field_validator("guest_email")
    @classmethod
    def normalize_email(cls, v: str) -> str:
        v2 = v.strip().lower()
        if "@" not in v2:
            raise ValueError("guest_email must contain '@'")
        return v2

    @field_validator("nights")
    @classmethod
    def nights_must_be_positive(cls, v: int) -> int:
        if v <= 0:
            raise ValueError("nights must be > 0")
        return v

    @model_validator(mode="after")
    def check_dates(self):
        if self.check_in >= self.check_out:
            raise ValueError("check_in must be before check_out")
        # Optional: ensure nights matches the date difference
        expected = (self.check_out - self.check_in).days
        if self.nights != expected:
            raise ValueError(f"nights must equal {expected} for given dates")
        return self

What happens, step by step, when you create a Booking:

Pydantic parses each input into its declared type (for example, an ISO date string into a date).
Each field’s validators then run on the parsed value, allowing normalization and checks.
If every field passes, the model validator runs on the assembled instance, enabling cross-field invariants.
If any check fails, Pydantic raises a ValidationError containing structured details about which fields failed and why.

This is particularly useful at system boundaries: you can return precise error messages to API clients or log them for debugging.
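As a sketch of what those structured details look like, the example below uses a smaller, hypothetical Stay model (not the Booking model above) and collects the location and message of each failure. It assumes Pydantic v2, where ValidationError.errors() returns a list of dictionaries with keys such as "loc", "msg", and "type".

```python
from datetime import date

from pydantic import BaseModel, Field, ValidationError, field_validator


class Stay(BaseModel):
    guest_email: str
    check_in: date
    nights: int = Field(gt=0)

    @field_validator("guest_email")
    @classmethod
    def normalize_email(cls, v: str) -> str:
        v2 = v.strip().lower()
        if "@" not in v2:
            raise ValueError("guest_email must contain '@'")
        return v2


problems = []
try:
    Stay(guest_email="not-an-email", check_in="2026-01-01", nights=0)
except ValidationError as exc:
    # Failures from different fields are reported together in one exception.
    problems = [(err["loc"], err["type"], err["msg"]) for err in exc.errors()]
```

Each entry names the failing field in its loc tuple, so an API layer could map the list directly onto a per-field error response.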

Using built-in constraints to reduce custom code

Many invariants can be expressed without custom validators by using Pydantic’s constrained types and field metadata. This reduces boilerplate and makes rules more declarative.

from pydantic import BaseModel, Field

class Pagination(BaseModel):
    page: int = Field(ge=1)
    page_size: int = Field(ge=1, le=200)

Here, the invariants are embedded in the field definitions: page must be at least 1, and page_size must be between 1 and 200. If you later change the limits, you do it in one place.
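A brief usage sketch of these declarative limits, assuming Pydantic v2 in its default (lax) mode, where a numeric string such as "2" is coerced to an int before the bounds are checked.

```python
from pydantic import BaseModel, Field, ValidationError


class Pagination(BaseModel):
    page: int = Field(ge=1)
    page_size: int = Field(ge=1, le=200)


# A numeric string is parsed first, then checked against the bounds.
ok = Pagination(page="2", page_size=50)

rejected_fields = []
try:
    Pagination(page=0, page_size=500)
except ValidationError as exc:
    # Both out-of-range fields are reported in a single exception.
    rejected_fields = [err["loc"][0] for err in exc.errors()]
```

No custom validator code is involved; the bounds live entirely in the field definitions.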

Example: conditional invariant (if A then B)

Conditional invariants are common in real payloads. For example: if a user selects a delivery method of "pickup", then address must be absent; if "delivery", then address must be present.

from typing import Literal, Optional
from pydantic import BaseModel, model_validator

class OrderFulfillment(BaseModel):
    method: Literal["pickup", "delivery"]
    address: Optional[str] = None

    @model_validator(mode="after")
    def enforce_method_rules(self):
        if self.method == "delivery":
            if not self.address or not self.address.strip():
                raise ValueError("address is required for delivery")
        else:
            if self.address is not None:
                raise ValueError("address must be omitted for pickup")
        return self

This validator ensures the model cannot exist in an ambiguous state. Downstream code can switch on method and safely assume the presence/absence of address.
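The sketch below shows one way downstream code might lean on that guarantee. The function shipping_label and its output strings are hypothetical; the model is repeated so the example is self-contained.

```python
from typing import Literal, Optional

from pydantic import BaseModel, model_validator


class OrderFulfillment(BaseModel):
    method: Literal["pickup", "delivery"]
    address: Optional[str] = None

    @model_validator(mode="after")
    def enforce_method_rules(self):
        if self.method == "delivery":
            if not self.address or not self.address.strip():
                raise ValueError("address is required for delivery")
        elif self.address is not None:
            raise ValueError("address must be omitted for pickup")
        return self


def shipping_label(f: OrderFulfillment) -> str:
    if f.method == "pickup":
        return "Hold at counter"
    # For "delivery", the validator already guaranteed a non-blank address.
    assert f.address is not None  # narrows Optional[str] for type checkers
    return f"Ship to: {f.address.strip()}"
```

The function contains no defensive branches for "delivery without address" or "pickup with address", because neither state can be constructed.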

Choosing where to enforce invariants: post-init vs validators

In practice, you often use both approaches in the same codebase, but at different layers:

Dataclass post-init is ideal when you already have correctly typed values and want a lightweight, dependency-free way to guarantee object consistency. It’s also great for small “value objects” where normalization and cross-field checks are simple.
Pydantic validators are ideal when you need parsing plus validation, especially when inputs may be strings, missing fields, or have extra fields. They shine at boundaries where you want rich error reporting.

A useful mental model is: Pydantic handles “turning messy input into clean Python values,” then your internal objects (dataclasses or other domain objects) rely on invariants to remain correct.

Step-by-step workflow: enforce invariants from input to internal objects

The following workflow avoids duplicated checks while keeping responsibilities clear:

Step 1: Validate and normalize incoming data with Pydantic

Use a Pydantic model to parse external input and enforce boundary-level invariants (required fields, basic formats, cross-field consistency that depends only on the payload).

from datetime import date
from pydantic import BaseModel, field_validator

class CreateDiscountPayload(BaseModel):
    code: str
    percent_off: int
    starts_on: date
    ends_on: date

    @field_validator("code")
    @classmethod
    def normalize_code(cls, v: str) -> str:
        v2 = v.strip().upper()
        if not v2:
            raise ValueError("code must not be empty")
        return v2

Step 2: Convert to an internal object that enforces deeper invariants

Then construct an internal object that enforces invariants that should always hold regardless of where the data came from (API, tests, batch jobs). This object can be a dataclass with __post_init__.

from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Discount:
    code: str
    percent_off: int
    starts_on: date
    ends_on: date

    def __post_init__(self) -> None:
        if not (1 <= self.percent_off <= 90):
            raise ValueError("percent_off must be between 1 and 90")
        if self.starts_on >= self.ends_on:
            raise ValueError("starts_on must be before ends_on")

Step 3: Wire them together explicitly

def create_discount(payload_dict: dict) -> Discount:
    payload = CreateDiscountPayload.model_validate(payload_dict)
    return Discount(
        code=payload.code,
        percent_off=payload.percent_off,
        starts_on=payload.starts_on,
        ends_on=payload.ends_on,
    )

This gives you two layers of protection:

Boundary validation with detailed error reporting (Pydantic).
Internal invariant enforcement that remains true even if objects are created in other ways (dataclass post-init).
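A caller may want to tell the two layers apart when reporting failures. The sketch below is one possible arrangement; try_create_discount and its status strings are hypothetical. It relies on the fact that, in Pydantic v2, ValidationError is a subclass of ValueError, so the more specific except clause has to come first.

```python
from dataclasses import dataclass
from datetime import date

from pydantic import BaseModel, ValidationError


class CreateDiscountPayload(BaseModel):
    code: str
    percent_off: int
    starts_on: date
    ends_on: date


@dataclass(frozen=True)
class Discount:
    code: str
    percent_off: int
    starts_on: date
    ends_on: date

    def __post_init__(self) -> None:
        if not (1 <= self.percent_off <= 90):
            raise ValueError("percent_off must be between 1 and 90")
        if self.starts_on >= self.ends_on:
            raise ValueError("starts_on must be before ends_on")


def try_create_discount(payload_dict: dict) -> tuple[str, object]:
    try:
        payload = CreateDiscountPayload.model_validate(payload_dict)
        discount = Discount(
            code=payload.code,
            percent_off=payload.percent_off,
            starts_on=payload.starts_on,
            ends_on=payload.ends_on,
        )
    except ValidationError as exc:  # boundary layer: malformed payload
        return ("bad_payload", exc.errors())
    except ValueError as exc:  # internal layer: well-formed but rule-breaking
        return ("rule_violation", str(exc))
    return ("ok", discount)
```

If the except clauses were swapped, the ValueError branch would also catch Pydantic's errors and the distinction would be lost.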

Common pitfalls and how to avoid them

Pitfall: validating too late

If you allow objects to be created in invalid states and validate only when you “use” them, bugs become time bombs. Prefer validating at construction time. Both __post_init__ and Pydantic validators support this by design.

Pitfall: mixing normalization with business actions

Keep invariants focused on correctness of state, not on performing actions. For example, don’t send emails, write files, or call external services inside validators or __post_init__. Those operations can fail for reasons unrelated to validity and will make object construction unpredictable.

Pitfall: inconsistent canonical forms

If you normalize in some places but not others, you can end up with multiple representations of the same concept (e.g., " USD " vs "usd"). Decide on canonical storage forms and enforce them consistently at the point of creation.
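One way to keep canonical forms consistent is to route every entry point through a single normalization function. The sketch below assumes a hypothetical canonical_currency helper shared by the value object and by a lookup function; the SUPPORTED set is made up for illustration.

```python
from dataclasses import dataclass


def canonical_currency(raw: str) -> str:
    """Single source of truth for the stored form of a currency code."""
    code = raw.strip().upper()
    if len(code) != 3 or not (code.isascii() and code.isalpha()):
        raise ValueError("currency must be a 3-letter code")
    return code


@dataclass(frozen=True)
class Money:
    amount: int
    currency: str

    def __post_init__(self) -> None:
        object.__setattr__(self, "currency", canonical_currency(self.currency))


SUPPORTED = {"USD", "EUR"}  # illustrative values only


def is_supported(raw_code: str) -> bool:
    # Lookups normalize with the same helper the constructor uses.
    return canonical_currency(raw_code) in SUPPORTED
```

Because both paths share the helper, " usd " and "USD" cannot drift into separate representations.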

Pitfall: raising generic errors without context

For dataclasses, include clear messages in ValueError/TypeError so failures are actionable. For Pydantic, prefer validators that raise ValueError with specific messages; Pydantic will attach them to the correct field or model error location.

Testing invariants effectively

Invariants are easy to test because they are deterministic: given inputs, object creation should either succeed with normalized values or fail with a specific error.

Dataclass invariant tests

import pytest
from datetime import date

# Assumes DateRange and Money from the earlier examples are importable here.

def test_date_range_rejects_inverted_dates():
    with pytest.raises(ValueError, match="start must be on or before end"):
        DateRange(start=date(2026, 1, 10), end=date(2026, 1, 5))

def test_money_normalizes_currency():
    m = Money(amount=100, currency=" usd ")
    assert m.currency == "USD"

Pydantic validator tests

import pytest
from pydantic import ValidationError

# Assumes Booking from the earlier example is importable here.

def test_booking_rejects_bad_email():
    with pytest.raises(ValidationError):
        Booking(
            guest_email="not-an-email",
            check_in="2026-01-01",
            check_out="2026-01-02",
            nights=1,
        )

def test_booking_normalizes_email():
    b = Booking(
        guest_email="  USER@Example.com ",
        check_in="2026-01-01",
        check_out="2026-01-03",
        nights=2,
    )
    assert b.guest_email == "user@example.com"

These tests document your invariants in executable form. When requirements change, failing tests point directly to the rule that needs updating.

Now answer the exercise about the content:

Why is it recommended to enforce invariants during object creation using dataclass __post_init__ or Pydantic validators?

Invariants are hard constraints. Enforcing them at creation time validates and normalizes once, so later code can rely on consistent objects and errors appear immediately instead of leaking invalid state downstream.