What “invariants” mean in code
An invariant is a rule that must always be true for an object to be considered valid. Unlike “nice-to-have” validation (e.g., warning the user), an invariant is a hard constraint: if it is violated, the object should not exist in that state. Enforcing invariants early—at object creation time—prevents invalid states from leaking into the rest of your program, where they become harder to debug and more expensive to fix.
Typical invariants include: numeric ranges (quantity > 0), relationships between fields (start_date < end_date), normalization rules (emails lowercased, whitespace trimmed), and structural constraints (a list must not be empty, IDs must match a pattern). In Python, you can enforce invariants using post-initialization hooks (common with dataclasses) and validators (common with Pydantic). The key idea is the same: validate and normalize once, right after inputs are received, so all later code can rely on the object being consistent.
Post-initialization enforcement with dataclasses
When you use a dataclass, the generated __init__ assigns fields directly. That means you need a place to check relationships between fields and to normalize values after assignment. Dataclasses provide __post_init__ for exactly this purpose: it runs immediately after the auto-generated __init__ completes.
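As a minimal sketch of that ordering, consider a hypothetical Quantity class (not part of the examples that follow) that enforces the quantity > 0 rule mentioned earlier. The class name and error message are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Quantity:
    """Hypothetical example class: a quantity that must be positive."""

    value: int

    def __post_init__(self) -> None:
        # Runs right after the generated __init__ has assigned self.value.
        if self.value <= 0:
            raise ValueError(f"value must be > 0, got {self.value}")


ok = Quantity(3)  # passes the check

try:
    Quantity(0)  # the check fails, so no instance is returned
except ValueError as exc:
    error_message = str(exc)
```

Because the exception propagates out of the constructor call, the caller never receives a Quantity that violates the rule.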
Pattern: validate, normalize, then freeze assumptions
A practical pattern for __post_init__ is:
- Validate each field’s basic constraints (range, emptiness, format).
- Validate cross-field constraints (e.g., min <= max).
- Normalize values (strip whitespace, canonicalize case, convert types if you accept multiple input forms).
- Raise exceptions immediately on violations (commonly ValueError or TypeError).
Normalization is important: it reduces the number of representations of the “same” value. For example, if you store emails in lowercase and without surrounding whitespace, equality checks and dictionary keys behave predictably.
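The effect on dictionary keys can be sketched as follows. The normalize_email helper and the sample addresses are hypothetical and serve only to illustrate the point.

```python
def normalize_email(raw: str) -> str:
    """Hypothetical helper: trim surrounding whitespace and lowercase."""
    return raw.strip().lower()


# Without normalization, the "same" address produces two distinct keys.
raw_logins = {}
raw_logins["Ada@Example.com"] = 1
raw_logins[" ada@example.com "] = 2

# With normalization, both spellings collapse to one canonical key.
canonical_logins = {}
for address in ["Ada@Example.com", " ada@example.com "]:
    key = normalize_email(address)
    canonical_logins[key] = canonical_logins.get(key, 0) + 1
```

Here raw_logins ends up with two entries for one person, while canonical_logins has a single entry counting both logins.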
Example: money-like value object with invariants
Suppose you want an amount that must be non-negative and have a currency code in ISO-like uppercase form. You also want to accept currency codes in mixed case but store them normalized.
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    amount: int  # store as minor units (e.g., cents)
    currency: str

    def __post_init__(self) -> None:
        if not isinstance(self.amount, int):
            raise TypeError("amount must be int minor units")
        if self.amount < 0:
            raise ValueError("amount must be non-negative")
        if not isinstance(self.currency, str):
            raise TypeError("currency must be str")
        cur = self.currency.strip().upper()
        if len(cur) != 3 or not cur.isalpha():
            raise ValueError("currency must be a 3-letter code")
        # Because frozen=True, we must use object.__setattr__ to normalize
        object.__setattr__(self, "currency", cur)
```

Step-by-step what happens:
- The generated __init__ assigns amount and currency.
- __post_init__ checks types and constraints.
- currency is normalized to uppercase and stripped.
- If any rule fails, object creation fails immediately.
Notice the use of frozen=True. Freezing is not required for invariants, but it helps maintain them by preventing later mutation. If you do freeze, normalization inside __post_init__ requires object.__setattr__.
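To illustrate what freezing buys you, here is a small sketch using a hypothetical Percentage class (an assumption for this example only). Assigning to a field of a frozen dataclass raises dataclasses.FrozenInstanceError.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Percentage:
    """Hypothetical example class: a whole-number percentage in 0..100."""

    value: int

    def __post_init__(self) -> None:
        if not 0 <= self.value <= 100:
            raise ValueError("value must be between 0 and 100")


p = Percentage(40)

try:
    p.value = 250  # would break the invariant if it were allowed
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False
```

Freezing guards against accidental mutation rather than determined misuse: object.__setattr__ still bypasses it, which is exactly why it works inside __post_init__.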
Example: cross-field invariant (date range)
Cross-field invariants are where __post_init__ shines: you can only validate them after all fields are assigned.
```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    start: date
    end: date

    def __post_init__(self) -> None:
        if self.start > self.end:
            raise ValueError("start must be on or before end")
```

This ensures that any DateRange instance is always logically ordered. Downstream code can safely assume start <= end without re-checking.
Example: derived fields and invariants
Sometimes you want to compute a derived field (like a normalized slug) and ensure it stays consistent with the source field. A derived field can be set in __post_init__ and treated as part of the invariant set.
```python
from dataclasses import dataclass, field
import re

_slug_re = re.compile(r"[^a-z0-9]+")


@dataclass(frozen=True)
class ProductName:
    raw: str
    slug: str = field(init=False)

    def __post_init__(self) -> None:
        if not isinstance(self.raw, str):
            raise TypeError("raw must be str")
        cleaned = self.raw.strip()
        if not cleaned:
            raise ValueError("raw must not be empty")
        slug = _slug_re.sub("-", cleaned.lower()).strip("-")
        if not slug:
            raise ValueError("raw must contain alphanumerics")
        object.__setattr__(self, "raw", cleaned)
        object.__setattr__(self, "slug", slug)
```

Here, the invariant is: raw is non-empty after trimming, and slug is a deterministic normalization of raw. Because slug is computed, you avoid bugs where callers forget to compute it or compute it differently.
Practical checklist for dataclass invariants
- Prefer raising ValueError for invalid values and TypeError for wrong types.
- Normalize inputs in __post_init__ so the stored representation is canonical.
- Use frozen=True when feasible to prevent invariant-breaking mutations.
- Keep __post_init__ focused: validation and normalization only. If you need I/O or database checks, do them outside object construction.
Validators in Pydantic: enforcing invariants at the boundary
Pydantic models are commonly used where input data is messy: API payloads, environment variables, CLI arguments, and data read from files. Pydantic’s core strength is that it combines parsing (coercion) with validation. Instead of manually converting strings to numbers or dates and then validating, you declare rules and let the model enforce them.
Pydantic offers multiple validator styles (depending on version), but the invariant mindset is the same: define what “valid” means, and ensure invalid inputs fail fast with actionable error messages.
Field-level vs model-level validation
Two categories of invariants map naturally to two validator scopes:
- Field-level: constraints on a single field (e.g., non-empty string, positive int). These can often be expressed with built-in constrained types or field metadata, plus optional custom validators.
- Model-level: constraints involving multiple fields (e.g., start <= end, “if is_gift then gift_message must be present”). These require a validator that sees the whole model.
Example: step-by-step with Pydantic v2 validators
The following example shows a booking request with several invariants:
- guest_email is normalized to lowercase and stripped.
- nights must be positive.
- check_in must be before check_out.
```python
from datetime import date

from pydantic import BaseModel, field_validator, model_validator


class Booking(BaseModel):
    guest_email: str
    check_in: date
    check_out: date
    nights: int

    @field_validator("guest_email")
    @classmethod
    def normalize_email(cls, v: str) -> str:
        v2 = v.strip().lower()
        if "@" not in v2:
            raise ValueError("guest_email must contain '@'")
        return v2

    @field_validator("nights")
    @classmethod
    def nights_must_be_positive(cls, v: int) -> int:
        if v <= 0:
            raise ValueError("nights must be > 0")
        return v

    @model_validator(mode="after")
    def check_dates(self):
        if self.check_in >= self.check_out:
            raise ValueError("check_in must be before check_out")
        # Optional: ensure nights matches the date difference
        expected = (self.check_out - self.check_in).days
        if self.nights != expected:
            raise ValueError(f"nights must equal {expected} for given dates")
        return self
```

Step-by-step what happens when you create Booking:
- Pydantic parses inputs into the declared types (e.g., strings into date if possible).
- Field validators run for their fields, allowing normalization and checks.
- The model validator runs after fields are set, enabling cross-field invariants.
- If any invariant fails, Pydantic raises a validation error containing structured details about which fields failed and why.
This is particularly useful at system boundaries: you can return precise error messages to API clients or log them for debugging.
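As a sketch of what those structured details look like in Pydantic v2, the example below uses a hypothetical Signup model (an assumption for illustration, separate from Booking) and reads the list returned by ValidationError.errors().

```python
from pydantic import BaseModel, ValidationError, field_validator


class Signup(BaseModel):
    """Hypothetical payload model used only for this illustration."""

    email: str
    age: int

    @field_validator("email")
    @classmethod
    def email_must_contain_at(cls, v: str) -> str:
        if "@" not in v:
            raise ValueError("email must contain '@'")
        return v.strip().lower()


try:
    Signup(email="not-an-email", age="abc")
except ValidationError as exc:
    # errors() returns one dict per failure, with keys such as
    # "loc" (where), "msg" (why), and "type" (error kind).
    failures = {err["loc"]: err["type"] for err in exc.errors()}
```

Both failures are reported in a single exception, keyed by field location, so an API layer can map each one back to the offending input field.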
Using built-in constraints to reduce custom code
Many invariants can be expressed without custom validators by using Pydantic’s constrained types and field metadata. This reduces boilerplate and makes rules more declarative.
```python
from pydantic import BaseModel, Field


class Pagination(BaseModel):
    page: int = Field(ge=1)
    page_size: int = Field(ge=1, le=200)
```

Here, the invariants are embedded in the field definitions: page must be at least 1, and page_size must be between 1 and 200. If you later change the limits, you do it in one place.
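A short usage sketch, assuming Pydantic v2 in its default (non-strict) mode, shows the declarative constraints accepting a coercible value and rejecting out-of-range ones. The input values are made up for illustration.

```python
from pydantic import BaseModel, Field, ValidationError


class Pagination(BaseModel):
    page: int = Field(ge=1)
    page_size: int = Field(ge=1, le=200)


# In the default (lax) mode a numeric string is coerced to int
# before the ge/le constraints are checked.
first = Pagination(page="2", page_size=50)

try:
    Pagination(page=0, page_size=500)
except ValidationError as exc:
    # Collect the names of the fields that violated their constraints.
    rejected_fields = sorted(err["loc"][0] for err in exc.errors())
```

Both violations are reported together, one for page (below the minimum) and one for page_size (above the maximum).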
Example: conditional invariant (if A then B)
Conditional invariants are common in real payloads. For example: if a user selects a delivery method of "pickup", then address must be absent; if "delivery", then address must be present.
```python
from typing import Literal, Optional

from pydantic import BaseModel, model_validator


class OrderFulfillment(BaseModel):
    method: Literal["pickup", "delivery"]
    address: Optional[str] = None

    @model_validator(mode="after")
    def enforce_method_rules(self):
        if self.method == "delivery":
            if not self.address or not self.address.strip():
                raise ValueError("address is required for delivery")
        else:
            if self.address is not None:
                raise ValueError("address must be omitted for pickup")
        return self
```

This validator ensures the model cannot exist in an ambiguous state. Downstream code can switch on method and safely assume the presence/absence of address.
Choosing where to enforce invariants: post-init vs validators
In practice, you often use both approaches in the same codebase, but at different layers:
- Dataclass post-init is ideal when you already have correctly typed values and want a lightweight, dependency-free way to guarantee object consistency. It’s also great for small “value objects” where normalization and cross-field checks are simple.
- Pydantic validators are ideal when you need parsing plus validation, especially when inputs may be strings, missing fields, or have extra fields. They shine at boundaries where you want rich error reporting.
A useful mental model is: Pydantic handles “turning messy input into clean Python values,” then your internal objects (dataclasses or other domain objects) rely on invariants to remain correct.
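The boundary behavior for missing and extra fields can be sketched like this, assuming Pydantic v2. StrictPayload and its fields are hypothetical. By default Pydantic ignores unknown keys, so rejecting them requires opting in via extra="forbid".

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class StrictPayload(BaseModel):
    """Hypothetical boundary model that rejects unknown keys."""

    model_config = ConfigDict(extra="forbid")

    name: str
    quantity: int


try:
    # "quantity" is missing and "colour" is not a declared field.
    StrictPayload.model_validate({"name": "widget", "colour": "red"})
except ValidationError as exc:
    problems = {err["loc"][0]: err["type"] for err in exc.errors()}
```

Whether to forbid extra keys is a design choice: it catches client typos early, at the cost of rejecting payloads from clients that send fields you have not declared yet.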
Step-by-step workflow: enforce invariants from input to internal objects
The following workflow avoids duplicated checks while keeping responsibilities clear:
Step 1: Validate and normalize incoming data with Pydantic
Use a Pydantic model to parse external input and enforce boundary-level invariants (required fields, basic formats, cross-field consistency that depends only on the payload).
```python
from datetime import date

from pydantic import BaseModel, field_validator


class CreateDiscountPayload(BaseModel):
    code: str
    percent_off: int
    starts_on: date
    ends_on: date

    @field_validator("code")
    @classmethod
    def normalize_code(cls, v: str) -> str:
        v2 = v.strip().upper()
        if not v2:
            raise ValueError("code must not be empty")
        return v2
```

Step 2: Convert to an internal object that enforces deeper invariants
Then construct an internal object that enforces invariants that should always hold regardless of where the data came from (API, tests, batch jobs). This object can be a dataclass with __post_init__.
```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Discount:
    code: str
    percent_off: int
    starts_on: date
    ends_on: date

    def __post_init__(self) -> None:
        if not (1 <= self.percent_off <= 90):
            raise ValueError("percent_off must be between 1 and 90")
        if self.starts_on >= self.ends_on:
            raise ValueError("starts_on must be before ends_on")
```

Step 3: Wire them together explicitly
```python
def create_discount(payload_dict: dict) -> Discount:
    payload = CreateDiscountPayload.model_validate(payload_dict)
    return Discount(
        code=payload.code,
        percent_off=payload.percent_off,
        starts_on=payload.starts_on,
        ends_on=payload.ends_on,
    )
```

This gives you two layers of protection:
- Boundary validation with detailed error reporting (Pydantic).
- Internal invariant enforcement that remains true even if objects are created in other ways (dataclass post-init).
Common pitfalls and how to avoid them
Pitfall: validating too late
If you allow objects to be created in invalid states and validate only when you “use” them, bugs become time bombs. Prefer validating at construction time. Both __post_init__ and Pydantic validators support this by design.
Pitfall: mixing normalization with business actions
Keep invariants focused on correctness of state, not on performing actions. For example, don’t send emails, write files, or call external services inside validators or __post_init__. Those operations can fail for reasons unrelated to validity and will make object construction unpredictable.
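One way to keep that separation is sketched below. Username, ensure_available, and the taken set are hypothetical stand-ins. In a real system the availability check would query a database or service, which is exactly why it stays out of __post_init__.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Username:
    """Hypothetical value object: construction checks state only."""

    value: str

    def __post_init__(self) -> None:
        cleaned = self.value.strip().lower()
        if not cleaned:
            raise ValueError("username must not be empty")
        object.__setattr__(self, "value", cleaned)


def ensure_available(username: Username, taken: set[str]) -> None:
    """Separate step standing in for a database or service lookup."""
    if username.value in taken:
        raise LookupError(f"username {username.value!r} is already taken")


name = Username("  Ada ")  # pure validation, cannot fail because of I/O
ensure_available(name, taken={"bob"})  # external check happens afterwards
```

Constructing Username stays deterministic, while the step that can fail for environmental reasons is an explicit call that the caller can retry or handle separately.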
Pitfall: inconsistent canonical forms
If you normalize in some places but not others, you can end up with multiple representations of the same concept (e.g., " USD " vs "usd"). Decide on canonical storage forms and enforce them consistently at the point of creation.
Pitfall: raising generic errors without context
For dataclasses, include clear messages in ValueError/TypeError so failures are actionable. For Pydantic, prefer validators that raise ValueError with specific messages; Pydantic will attach them to the correct field or model error location.
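For the dataclass case, a message that names the field, the rule, and the offending value might look like the sketch below. The Port class and its wording are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical example class: a TCP/UDP port number."""

    number: int

    def __post_init__(self) -> None:
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(self.number, bool) or not isinstance(self.number, int):
            raise TypeError(
                f"number must be int, got {type(self.number).__name__}"
            )
        if not 1 <= self.number <= 65535:
            raise ValueError(
                f"number must be between 1 and 65535, got {self.number}"
            )


try:
    Port(70000)
except ValueError as exc:
    message = str(exc)
```

Including the received value means a log line or traceback is often enough to diagnose the failure without reproducing it.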
Testing invariants effectively
Invariants are easy to test because they are deterministic: given the same inputs, object creation either succeeds with normalized values or fails with a specific error.
Dataclass invariant tests
```python
import pytest
from datetime import date

# DateRange and Money are the dataclasses defined in the earlier examples;
# import them from wherever they live in your project.


def test_date_range_rejects_inverted_dates():
    with pytest.raises(ValueError, match="start must be on or before end"):
        DateRange(start=date(2026, 1, 10), end=date(2026, 1, 5))


def test_money_normalizes_currency():
    m = Money(amount=100, currency=" usd ")
    assert m.currency == "USD"
```

Pydantic validator tests
```python
import pytest
from pydantic import ValidationError

# Booking is the Pydantic model defined in the earlier example;
# import it from wherever it lives in your project.


def test_booking_rejects_bad_email():
    with pytest.raises(ValidationError):
        Booking(guest_email="not-an-email", check_in="2026-01-01", check_out="2026-01-02", nights=1)


def test_booking_normalizes_email():
    b = Booking(guest_email=" USER@Example.com ", check_in="2026-01-01", check_out="2026-01-03", nights=2)
    assert b.guest_email == "user@example.com"
```

These tests document your invariants in executable form. When requirements change, failing tests point directly to the rule that needs updating.
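When one invariant has several boundary cases, a table-driven test keeps them together. The sketch below uses pytest.mark.parametrize with a hypothetical Tag class. The class and the chosen cases are assumptions for illustration.

```python
from dataclasses import dataclass

import pytest


@dataclass(frozen=True)
class Tag:
    """Hypothetical class under test: label is trimmed and lowercased."""

    label: str

    def __post_init__(self) -> None:
        cleaned = self.label.strip().lower()
        if not cleaned:
            raise ValueError("label must not be empty")
        object.__setattr__(self, "label", cleaned)


@pytest.mark.parametrize("raw", ["", "   ", "\t\n"])
def test_tag_rejects_blank_labels(raw: str) -> None:
    with pytest.raises(ValueError, match="label must not be empty"):
        Tag(raw)


@pytest.mark.parametrize(
    ("raw", "expected"),
    [("Python", "python"), ("  Web ", "web")],
)
def test_tag_normalizes_label(raw: str, expected: str) -> None:
    assert Tag(raw).label == expected
```

Each tuple in the parameter list runs as its own test case, so a failure report names the exact input that broke the rule.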