What “Domain Data Modeling” Means in Practice
Domain data modeling is the act of defining the data structures that represent the business concepts your software cares about, along with the rules that make those structures valid. The “domain” is not your database schema and not your API payloads; it is the conceptual layer where you name things the way the business names them (Order, Invoice, Subscription, Shipment, Money, EmailAddress) and encode the invariants that must always hold (an Order must have at least one line item; a Money value must have a currency; a percentage must be between 0 and 1 or 0 and 100 depending on your convention).
In Python projects, domain models often end up being used by multiple layers: services, background jobs, web handlers, and persistence adapters. That reuse is powerful, but it also creates pressure to make the domain model satisfy everyone’s needs. This chapter focuses on goals (what your domain model should optimize for) and boundaries (what your domain model should not be responsible for).
Core Goals of a Domain Data Model
Goal 1: Capture Business Meaning, Not Technical Convenience
A domain model should make illegal states hard to represent and legal states easy to represent. That usually means preferring domain-specific types and structures over generic dictionaries and loosely typed primitives.
- Prefer Money(amount, currency) over a bare float.
- Prefer EmailAddress over a plain str when email rules matter.
- Prefer OrderId over an int when IDs must not be mixed across aggregates.
This goal is about clarity and correctness. If a developer can accidentally pass a customer_id where an order_id is expected, the model is not communicating meaning strongly enough.
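As a minimal sketch of this goal, distinct ID types and a small value object might look like the following. The names CustomerId and find_order are hypothetical, and the email check is deliberately simplistic; it is an assumption for illustration, not a complete validation rule.

```python
from dataclasses import dataclass
from typing import NewType

# Distinct ID types: a static type checker (such as mypy) flags mixing
# them, although at runtime both are plain strings.
CustomerId = NewType("CustomerId", str)
OrderId = NewType("OrderId", str)


@dataclass(frozen=True)
class EmailAddress:
    value: str

    def __post_init__(self) -> None:
        # Simplistic rule for illustration only; real email rules
        # depend on your business requirements.
        local, sep, domain = self.value.partition("@")
        if not (sep and local and "." in domain):
            raise ValueError(f"Invalid email address: {self.value!r}")


def find_order(order_id: OrderId) -> str:
    # Hypothetical lookup that only accepts an OrderId.
    return f"looking up order {order_id}"


print(find_order(OrderId("ord-1")))
# A type checker would reject: find_order(CustomerId("cust-1"))
```

NewType adds no runtime checks, so this protection only applies when the project runs a type checker.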
Goal 2: Encode Invariants Close to the Data
Invariants are rules that must always be true for an object to be considered valid. The domain model is the best place to encode them because it is the shared representation used across the system. If invariants live only in a web endpoint or only in a database constraint, other code paths can bypass them.
Examples of invariants:
- An Order must have at least one OrderLine.
- A Discount cannot be negative.
- A Shipment cannot be created for an order that is not paid (depending on your business rules).
Encoding invariants does not necessarily mean “validate everything everywhere.” It means choosing a small set of critical rules that define what the object is, and ensuring those rules are enforced at creation and at state transitions.
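A rough sketch of enforcing one rule at creation and another at the point of use could look like this. The Discount class and its apply_to method are assumptions made for illustration; amounts are plain Decimal values to keep the example short.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Discount:
    amount: Decimal

    def __post_init__(self) -> None:
        # Invariant enforced at creation: a discount is never negative.
        if self.amount < 0:
            raise ValueError("Discount cannot be negative")

    def apply_to(self, subtotal: Decimal) -> Decimal:
        # Rule enforced where the discount is used: it may not
        # exceed the subtotal it is applied to.
        if self.amount > subtotal:
            raise ValueError("Discount cannot exceed subtotal")
        return subtotal - self.amount


print(Discount(Decimal("5")).apply_to(Decimal("20")))
```

Because the checks live on the type itself, any code path that builds or applies a Discount goes through them.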
Goal 3: Provide Stable Interfaces for the Rest of the System
A domain model should be stable even when external representations change. Your API may rename fields, your database may be normalized differently, and your message bus payloads may evolve. If your domain model is tightly coupled to those representations, every external change forces a cascade of internal changes.
Stability comes from designing domain objects around business language and behavior rather than around transport or storage formats.
Goal 4: Support Behavior and State Transitions (Not Just Data)
Many teams start with “data-only” models and later discover that business rules are scattered across services. A domain model can include methods that represent meaningful operations: order.add_line(...), order.pay(...), subscription.cancel(...). These methods become the single place where state transitions are validated and applied.
Even if you keep domain objects mostly immutable, you can still model behavior by returning new instances (functional style) or by using explicit command methods that mutate in controlled ways.
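The functional style can be sketched with a frozen dataclass and dataclasses.replace. The Subscription fields and its two states are assumptions chosen for this illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Subscription:
    id: str
    status: str = "active"  # assumed states: active, canceled

    def cancel(self) -> "Subscription":
        # Validate the transition, then return a new instance
        # instead of mutating this one.
        if self.status != "active":
            raise ValueError("Only active subscriptions can be canceled")
        return replace(self, status="canceled")


active = Subscription(id="sub-1")
canceled = active.cancel()
print(active.status, canceled.status)
```

The original object is left untouched, which makes it safe to share across code paths; the cost is that callers must keep the returned instance.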
Goal 5: Make Testing and Reasoning Easy
A good domain model is easy to instantiate in tests, easy to validate, and easy to reason about. If creating an Order requires a database connection, an HTTP call, and a dozen optional fields, your model is not bounded correctly.
Testing-friendly models typically:
- Have explicit constructors or factory functions with minimal required inputs.
- Separate domain rules from infrastructure concerns.
- Use deterministic behavior (no hidden time or random dependencies unless injected).
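One way to keep time-dependent behavior deterministic is to pass the clock in as a parameter. The sketch below uses a hypothetical Trial type and a Clock alias; both are assumptions for illustration rather than part of the order example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable

# A clock is any zero-argument callable returning the current time.
Clock = Callable[[], datetime]


def utc_now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass(frozen=True)
class Trial:
    started_at: datetime
    length: timedelta

    def is_expired(self, clock: Clock = utc_now) -> bool:
        # Time is read through the injected clock, so tests can pin it.
        return clock() >= self.started_at + self.length


def fixed_clock() -> datetime:
    # Test clock pinned to a known instant.
    return datetime(2024, 1, 10, tzinfo=timezone.utc)


trial = Trial(
    started_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    length=timedelta(days=14),
)
print(trial.is_expired(fixed_clock))
```

Production code can rely on the default clock, while tests supply a fixed one and need no patching.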
Boundaries: What the Domain Model Should Not Own
Boundary 1: Transport Formats (API/JSON) Are Not the Domain
API payloads are optimized for clients and network concerns: backward compatibility, field naming conventions, optional fields, and partial updates. Domain objects are optimized for correctness and business meaning. Mixing these concerns leads to models that accept too many “maybe” values and become permissive in ways that violate invariants.
Practical boundary: keep a mapping layer between API schemas and domain objects. The mapping layer can handle optional fields, defaults, and versioning, then produce a valid domain object (or a structured error).
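A mapping function that returns either a domain value or a structured error might be sketched as follows. The MappingError type, the money_from_payload name, and the "USD" default are all assumptions for this example; Money is reduced to a bare stand-in.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import Any, Dict, List, Optional, Union


@dataclass(frozen=True)
class Money:
    # Bare stand-in for a domain value object.
    amount: Decimal
    currency: str


@dataclass(frozen=True)
class MappingError:
    errors: List[str]


def money_from_payload(payload: Dict[str, Any]) -> Union[Money, MappingError]:
    errors: List[str] = []

    # Edge concern: apply a default and normalize before the domain sees it.
    currency = str(payload.get("currency", "USD")).strip().upper()
    if len(currency) != 3:
        errors.append("currency: expected a 3-letter code")

    amount: Optional[Decimal]
    try:
        amount = Decimal(str(payload.get("amount", "")).strip())
    except InvalidOperation:
        amount = None
    if amount is None or not amount.is_finite():
        errors.append("amount: expected a decimal number")

    if errors or amount is None:
        return MappingError(errors)
    return Money(amount, currency)


print(money_from_payload({"amount": "9.99", "currency": " usd "}))
print(money_from_payload({"amount": "abc", "currency": "dollars"}))
```

Collecting all problems into one error object lets the API layer report them together, instead of failing on the first exception raised by the domain.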
Boundary 2: Persistence Schemas Are Not the Domain
Database tables are optimized for querying, indexing, normalization/denormalization, and migration constraints. Domain objects are optimized for expressing business rules. If your domain model mirrors your tables one-to-one, you may end up with anemic models that leak persistence details (foreign keys, join tables, surrogate keys) into business logic.
Practical boundary: treat persistence as an adapter. Repositories (or data access layers) translate between domain objects and persistence records.
Boundary 3: Framework Concerns Should Not Leak In
Frameworks often impose base classes, decorators, or implicit behaviors. When those concerns leak into the domain model, you make the domain harder to reuse and test. For example, requiring a web framework request object to create a domain entity is a boundary violation.
Practical boundary: domain objects should be plain Python objects with minimal dependencies. Keep framework-specific code at the edges.
Boundary 4: Cross-Cutting Operational Concerns
Logging, metrics, tracing, caching, retries, and rate limiting are important, but they are not domain rules. If your domain object methods log directly or emit metrics, you couple business logic to operational tooling.
Practical boundary: emit domain events or return results that application services can observe and instrument.
Choosing the Right Modeling Scope: Entities, Value Objects, and Aggregates
Entities: Identity and Lifecycle
An entity is something with a stable identity over time. Its attributes may change, but it remains “the same thing.” Orders, customers, and subscriptions are typical entities. The key modeling question is: what makes this entity the same across time and across system boundaries? That answer becomes its identity.
Boundary implication: entities often have lifecycle rules (created, paid, shipped, canceled). Those transitions belong in the domain model, not scattered across endpoints.
Value Objects: Meaning by Value
A value object is defined by its attributes, not by identity. Two Money(10, 'USD') values are interchangeable. Value objects are excellent for encoding invariants and preventing primitive obsession.
Boundary implication: value objects should be small, immutable (when practical), and easy to validate.
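Value semantics can be checked directly with a frozen dataclass. This is a small sketch with a reduced Money type that omits validation:

```python
from dataclasses import FrozenInstanceError, dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str


a = Money(Decimal("10"), "USD")
b = Money(Decimal("10"), "USD")

print(a == b)       # True: compared by value
print(a is b)       # False: two distinct objects
print(len({a, b}))  # 1: frozen dataclasses are hashable, so duplicates collapse

try:
    a.amount = Decimal("20")  # type: ignore[misc]
except FrozenInstanceError:
    print("Money is immutable")
```

The frozen=True flag supplies both the immutability and the hash, which is why value objects built this way can be used as dictionary keys and set members.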
Aggregates: Consistency Boundaries
An aggregate is a cluster of entities and value objects that must be consistent together. The aggregate root is the entry point for modifications. For example, an Order aggregate might include OrderLine items and a ShippingAddress value object. The rule is: if you change anything inside the aggregate, you do it through the root so invariants remain enforced.
Boundary implication: aggregates define transactional boundaries. They help you decide what must be updated atomically and what can be eventually consistent.
Step-by-Step: Define Modeling Goals and Boundaries for a Feature
The following process is a practical way to decide what belongs in your domain model and what belongs at the edges.
Step 1: Write the Business Vocabulary
Start by listing the nouns and verbs used by stakeholders. Keep it short and concrete.
- Nouns: Order, Line Item, Product, Price, Discount, Tax, Payment, Shipment, Address
- Verbs: Place order, Add item, Remove item, Apply discount, Pay, Ship, Cancel
This vocabulary becomes the naming source for your domain classes and methods.
Step 2: Identify Invariants and State Transitions
For each noun, ask: what must always be true? For each verb, ask: what conditions must hold before it can happen?
- Order must have at least one line item before it can be placed.
- Order cannot be paid twice.
- Discount cannot exceed subtotal.
- Shipment requires a shipping address.
These rules are candidates for domain-level enforcement.
Step 3: Decide the Aggregate Boundary
Pick the smallest cluster that must be consistent at all times. A common mistake is making aggregates too large (everything in one giant object) or too small (no object can enforce invariants).
Example decision:
- Order is an aggregate root.
- OrderLine is inside the Order aggregate.
- Product is outside the aggregate (referenced by ProductId), because product data changes independently and you may not want to lock orders when product descriptions change.
Step 4: Define Domain Types for High-Risk Primitives
Look for primitives that carry business meaning and are easy to misuse: IDs, money, percentages, timestamps, email/phone, country codes, quantities.
Promote them into value objects when:
- They have validation rules.
- They have formatting/parsing rules.
- They should not be mixed with similar primitives.
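As an illustration of these criteria, a percentage has a validation rule, a parsing rule, and is easy to confuse with other numbers. The sketch below assumes the convention of storing a fraction between 0 and 1; the Percentage class and its method names are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Percentage:
    # Convention assumed here: stored as a fraction between 0 and 1.
    fraction: Decimal

    def __post_init__(self) -> None:
        # Validation rule.
        if not (Decimal("0") <= self.fraction <= Decimal("1")):
            raise ValueError("Percentage must be between 0 and 1")

    @classmethod
    def from_percent(cls, percent: str) -> "Percentage":
        # Parsing rule: "15" or "15%" means 15 percent.
        return cls(Decimal(percent.rstrip("%")) / Decimal("100"))

    def of(self, amount: Decimal) -> Decimal:
        return amount * self.fraction


rate = Percentage.from_percent("15%")
print(rate.of(Decimal("200")))
```

With the convention fixed inside the type, callers no longer need to remember whether 15 or 0.15 is expected.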
Step 5: Draw the Boundary Lines Explicitly
Write down what the domain model will not do. This is as important as what it will do.
- Domain objects will not parse raw JSON.
- Domain objects will not know database primary keys or ORM sessions.
- Domain objects will not call payment gateways.
- Domain objects will not log or emit metrics directly.
These statements guide architecture and prevent gradual erosion of boundaries.
Practical Example: A Small Order Domain with Explicit Boundaries
The code below illustrates a domain model that focuses on business meaning and invariants, while leaving transport/persistence concerns to other layers. The example uses plain Python with type hints and dataclasses to keep the domain lightweight.

    from __future__ import annotations

    from dataclasses import dataclass, field
    from decimal import Decimal
    from typing import List, NewType

    OrderId = NewType("OrderId", str)
    ProductId = NewType("ProductId", str)


    @dataclass(frozen=True)
    class Money:
        amount: Decimal
        currency: str

        def __post_init__(self) -> None:
            if self.amount < 0:
                raise ValueError("Money amount cannot be negative")
            if len(self.currency) != 3:
                raise ValueError("Currency must be a 3-letter code")

        def __add__(self, other: "Money") -> "Money":
            if self.currency != other.currency:
                raise ValueError("Cannot add Money with different currencies")
            return Money(self.amount + other.amount, self.currency)

        def __mul__(self, n: int) -> "Money":
            if n < 0:
                raise ValueError("Quantity cannot be negative")
            return Money(self.amount * Decimal(n), self.currency)


    @dataclass(frozen=True)
    class OrderLine:
        product_id: ProductId
        unit_price: Money
        quantity: int

        def __post_init__(self) -> None:
            if self.quantity <= 0:
                raise ValueError("Quantity must be positive")

        def line_total(self) -> Money:
            return self.unit_price * self.quantity


    @dataclass
    class Order:
        id: OrderId
        currency: str
        lines: List[OrderLine] = field(default_factory=list)
        status: str = "draft"  # draft, placed, paid, canceled

        def add_line(self, product_id: ProductId, unit_price: Money, quantity: int) -> None:
            if self.status != "draft":
                raise ValueError("Can only modify lines while order is in draft")
            if unit_price.currency != self.currency:
                raise ValueError("Line currency must match order currency")
            self.lines.append(OrderLine(product_id, unit_price, quantity))

        def subtotal(self) -> Money:
            if not self.lines:
                return Money(Decimal("0"), self.currency)
            total = Money(Decimal("0"), self.currency)
            for line in self.lines:
                total = total + line.line_total()
            return total

        def place(self) -> None:
            if self.status != "draft":
                raise ValueError("Only draft orders can be placed")
            if not self.lines:
                raise ValueError("Cannot place an order with no lines")
            self.status = "placed"

        def mark_paid(self) -> None:
            if self.status != "placed":
                raise ValueError("Only placed orders can be paid")
            self.status = "paid"

Notice what is intentionally missing:
- No JSON parsing or serialization.
- No database fields like created_at or updated_at unless they are domain-relevant.
- No ORM base class.
- No payment gateway integration.
This is the boundary in action: the domain model enforces core rules (status transitions, currency consistency, non-empty order) and leaves everything else to adapters.
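One optional refinement, in the spirit of preferring domain types over loose primitives, is to replace the status string with an enumeration and an explicit transition table. This is a sketch only: the OrderStatus and transition names are hypothetical, and the cancel transitions are assumptions, since the example above does not define them.

```python
from enum import Enum
from typing import Dict, Set


class OrderStatus(Enum):
    DRAFT = "draft"
    PLACED = "placed"
    PAID = "paid"
    CANCELED = "canceled"


# Assumed transition rules; the cancel paths in particular would
# depend on your business rules.
ALLOWED_TRANSITIONS: Dict[OrderStatus, Set[OrderStatus]] = {
    OrderStatus.DRAFT: {OrderStatus.PLACED, OrderStatus.CANCELED},
    OrderStatus.PLACED: {OrderStatus.PAID, OrderStatus.CANCELED},
    OrderStatus.PAID: set(),
    OrderStatus.CANCELED: set(),
}


def transition(current: OrderStatus, target: OrderStatus) -> OrderStatus:
    # Reject any move that the table does not explicitly allow.
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Cannot move from {current.value} to {target.value}")
    return target


print(transition(OrderStatus.DRAFT, OrderStatus.PLACED))
```

A table like this keeps the lifecycle readable in one place, while a misspelled status becomes an AttributeError instead of a silently wrong string.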
Where Validation Belongs: Domain vs Edge
Not all validation belongs in the domain model. A useful way to decide is to classify validation into three categories.
Category A: Domain Invariants (Belong in the Domain)
- Non-negative money amounts.
- Order must have at least one line to be placed.
- Cannot pay an order that is canceled.
These rules define what the object is. If they are violated, the object should not exist in that state.
Category B: Input Hygiene (Belongs at the Edge)
- Trimming whitespace from strings.
- Accepting multiple date formats from clients.
- Coercing numeric strings to numbers.
These are about dealing with messy inputs and compatibility. Keep them in API/request parsing layers so the domain stays strict and predictable.
Category C: Policy and Workflow Rules (Often Application Layer)
- Who is allowed to cancel an order (authorization).
- Rate limits for creating orders.
- Which payment provider to use.
These rules may change frequently and may depend on user context or external systems. They often fit better in application services that orchestrate domain objects.
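The split between Category A and Category B can be sketched with a country code. The CountryCode type and the country_code_from_input helper are hypothetical names, and the two-letter rule is an assumed convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountryCode:
    value: str

    def __post_init__(self) -> None:
        # Category A, domain invariant: strict, no cleanup attempted.
        ok = len(self.value) == 2 and self.value.isalpha() and self.value.isupper()
        if not ok:
            raise ValueError("Country code must be two uppercase letters")


def country_code_from_input(raw: str) -> CountryCode:
    # Category B, input hygiene: trim and normalize case at the edge,
    # then let the domain type decide whether the result is valid.
    return CountryCode(raw.strip().upper())


print(country_code_from_input(" de "))
```

The domain type never guesses what the caller meant; only the edge function is allowed to be forgiving.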
Boundary Patterns for Python Projects
Pattern 1: DTOs at the Edge, Domain in the Middle
Use Data Transfer Objects (DTOs) for API inputs/outputs and keep domain objects separate. The DTO layer can be permissive and versioned; the domain layer stays strict.

    from dataclasses import dataclass
    from decimal import Decimal
    from typing import List


    @dataclass
    class CreateOrderLineDTO:
        product_id: str
        unit_price: str  # comes as string from JSON
        quantity: int


    @dataclass
    class CreateOrderDTO:
        order_id: str
        currency: str
        lines: List[CreateOrderLineDTO]


    def dto_to_domain(dto: CreateOrderDTO) -> "Order":
        order = Order(id=OrderId(dto.order_id), currency=dto.currency)
        for line in dto.lines:
            price = Money(Decimal(line.unit_price), dto.currency)
            order.add_line(ProductId(line.product_id), price, line.quantity)
        order.place()
        return order

The mapping function is a boundary: it converts messy/transport-oriented data into strict domain objects.
Pattern 2: Repositories as Persistence Boundaries
A repository hides how domain objects are stored. The domain does not know whether it is stored in PostgreSQL, Redis, or a file.

    from typing import Protocol, Optional


    class OrderRepository(Protocol):
        def get(self, order_id: OrderId) -> Optional[Order]:
            ...

        def save(self, order: Order) -> None:
            ...

This boundary makes it possible to test domain logic without a database and to change persistence without rewriting domain code.
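For tests, a dictionary-backed implementation with the same get/save shape is often enough. The sketch below is self-contained, so it uses a reduced stand-in for Order and plain strings for IDs; InMemoryOrderRepository is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Order:
    # Reduced stand-in for the Order aggregate; only what the sketch needs.
    id: str
    status: str = "draft"


class InMemoryOrderRepository:
    """Test double with the same get/save shape as OrderRepository."""

    def __init__(self) -> None:
        self._orders: Dict[str, Order] = {}

    def get(self, order_id: str) -> Optional[Order]:
        return self._orders.get(order_id)

    def save(self, order: Order) -> None:
        self._orders[order.id] = order


repo = InMemoryOrderRepository()
repo.save(Order(id="ord-1"))
print(repo.get("ord-1"))
print(repo.get("missing"))
```

Because Protocol matching is structural, a class like this does not need to inherit from the repository protocol to satisfy a type checker.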
Pattern 3: Domain Events for Cross-Aggregate Effects
When something happens in the domain that other parts of the system care about (send an email, reserve inventory, create a shipment), consider emitting a domain event rather than calling external services directly from the domain model.

    from dataclasses import dataclass, field


    @dataclass(frozen=True)
    class OrderPlaced:
        order_id: OrderId


    @dataclass
    class Order:
        id: OrderId
        currency: str
        lines: list[OrderLine] = field(default_factory=list)
        status: str = "draft"
        pending_events: list[object] = field(default_factory=list)

        def place(self) -> None:
            if self.status != "draft":
                raise ValueError("Only draft orders can be placed")
            if not self.lines:
                raise ValueError("Cannot place an order with no lines")
            self.status = "placed"
            self.pending_events.append(OrderPlaced(self.id))

The application layer can later publish pending_events to a message bus or handle them synchronously. The domain remains focused on business meaning and invariants.
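Draining the events in the application layer could be sketched like this. The publish_pending function and the callable-based publisher are assumptions for illustration, and Order is reduced to a stand-in without lines so the block runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str


@dataclass
class Order:
    # Reduced stand-in: no lines, only status and events.
    id: str
    status: str = "draft"
    pending_events: List[object] = field(default_factory=list)

    def place(self) -> None:
        self.status = "placed"
        self.pending_events.append(OrderPlaced(self.id))


def publish_pending(order: Order, publish: Callable[[object], None]) -> None:
    # Hand each event to the injected publisher, then clear the list
    # so the same event is not published twice.
    for event in order.pending_events:
        publish(event)
    order.pending_events.clear()


published: List[object] = []  # stands in for a message bus
order = Order(id="ord-1")
order.place()
publish_pending(order, published.append)
print(published)
```

Passing the publisher in as a callable keeps the message bus out of the domain, and a plain list is enough to assert on in tests.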
Common Boundary Mistakes and How to Correct Them
Mistake: “One Model to Rule Them All”
Using the same class for API input, domain logic, and database persistence often leads to a model that is simultaneously too permissive (to accept partial inputs) and too rigid (to satisfy database constraints), while being awkward for business rules.
Correction: split models by responsibility. Keep a strict domain model, and create separate edge schemas for input/output and persistence mapping.
Mistake: Domain Objects That Depend on External Services
If order.pay() calls a payment gateway directly, you cannot test it without network access and you mix business rules with integration logic.
Correction: have order validate that it can be paid and produce an intent (state change and/or event). Let an application service call the gateway and then confirm payment by invoking a domain method like mark_paid().
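A sketch of that flow, assuming a hypothetical PaymentGateway protocol with a single charge method and a reduced Order that tracks a total in cents:

```python
from dataclasses import dataclass
from typing import Protocol


class PaymentGateway(Protocol):
    # Hypothetical interface; a real gateway client will differ.
    def charge(self, order_id: str, amount_cents: int) -> bool:
        ...


@dataclass
class Order:
    id: str
    total_cents: int
    status: str = "placed"

    def ensure_payable(self) -> None:
        if self.status != "placed":
            raise ValueError("Only placed orders can be paid")

    def mark_paid(self) -> None:
        self.ensure_payable()
        self.status = "paid"


def pay_order(order: Order, gateway: PaymentGateway) -> bool:
    # Application service: the domain rule is checked first, the gateway
    # is called second, and the domain confirms the transition last.
    order.ensure_payable()
    if not gateway.charge(order.id, order.total_cents):
        return False  # the order stays "placed"
    order.mark_paid()
    return True


class AlwaysApproves:
    # Fake gateway for tests; no network access needed.
    def charge(self, order_id: str, amount_cents: int) -> bool:
        return True


order = Order(id="ord-1", total_cents=1999)
print(pay_order(order, AlwaysApproves()), order.status)
```

The domain object never sees the gateway, so its rules can be tested with a fake in a few lines.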
Mistake: Over-Validating at Every Layer
Validating the same invariant in the API layer, service layer, and domain layer creates duplication and inconsistent error messages.
Correction: decide which rules are domain invariants and enforce them once in the domain. Keep edge validation for input hygiene and user-friendly error reporting, but avoid duplicating core invariants.
Mistake: Under-Specifying Types
When everything is a str or int, it becomes hard to see what values mean and easy to mix them up.
Correction: introduce small domain types for IDs and value objects for high-risk primitives. Even lightweight wrappers can dramatically reduce bugs.
How Goals and Boundaries Influence Your Choice of Modeling Tools
In Python, you can implement domain models with different tools depending on where the model sits and what it needs to do.
- For domain core: prefer lightweight, dependency-minimal classes (often dataclasses) that encode invariants and behavior.
- For edges (API, config, message payloads): prefer schema-focused models that handle parsing, coercion, and detailed error reporting.
- For persistence: prefer explicit mapping layers or repository adapters that translate between domain objects and storage representations.
The key is not which library you choose, but whether your model’s responsibilities match its layer. If a model is doing parsing, network concerns, and business invariants all at once, boundaries are blurred and maintenance cost rises.