Python Data Modeling in Practice: Dataclasses, Pydantic, and Type Hints

Type Hints as an Executable Design Tool

Chapter 2

Estimated reading time: 14 minutes

Why treat type hints as “executable design”

In many Python codebases, type hints start as documentation: helpful for readers, but not essential to runtime behavior. In practice, type hints can do more: they can act as an executable design tool. “Executable” here means that your design decisions become checkable by tools (type checkers, linters, IDEs, test frameworks) and enforceable by runtime validation when you choose to connect hints to validators (for example via Pydantic or custom checks). The result is a feedback loop where the shape of your data and the rules of your APIs are continuously verified as you code.

Thinking this way changes how you write types. Instead of sprinkling annotations at the end, you use them to design boundaries between components, to communicate invariants, and to make illegal states harder to represent. When your types are precise, you can refactor more confidently, catch integration bugs earlier, and reduce the amount of defensive code you need in the “happy path”.

What type hints can and cannot guarantee

Type hints in Python are not enforced by the interpreter. They are primarily consumed by static analysis tools (e.g., mypy, pyright) and by IDEs. That means type hints can guarantee things only to the extent that your tooling checks them and your team treats type errors as actionable. They also cannot express every runtime constraint (e.g., “string must be a valid ISO date” or “list must be sorted”) without additional runtime checks.

Still, type hints are powerful because they can express a large portion of design intent: what kinds of values flow through functions, what is optional, what is immutable, what keys exist, and what operations are supported. When combined with runtime validation (selectively), they become a practical mechanism for aligning design intent with actual behavior.
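
As a small illustration (the parse_start_date function below is hypothetical), the annotation only guarantees that a str arrives; the "valid ISO date" constraint still needs a runtime check:

from datetime import datetime


def parse_start_date(value: str) -> datetime:
    # The str annotation is verified statically, but "must be a valid ISO date"
    # is a runtime constraint: fromisoformat raises ValueError when it is violated.
    return datetime.fromisoformat(value)


parse_start_date("2024-05-01")      # passes the type checker and succeeds at runtime
# parse_start_date("not-a-date")   # also passes the type checker, but raises ValueError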

Static vs runtime: a useful mental model

  • Static guarantees: “This function never returns None”, “This dict has these keys”, “This object supports this protocol”. Checked by type checkers.
  • Runtime guarantees: “This email is valid”, “This number is positive”, “This payload matches schema”. Enforced by validation libraries or explicit checks.
  • Design leverage: Use static types to make invalid wiring difficult; use runtime validation at boundaries (I/O, external systems) to protect the core.

Designing APIs with types first

Using type hints as a design tool often means writing function signatures before writing function bodies. The signature becomes a contract: it defines what the function expects and what it promises. If you keep signatures small and explicit, you can reason about composition and testability earlier.


Step-by-step: start from a signature

Suppose you are implementing a function that takes raw input (already parsed into Python objects) and produces a normalized representation used by the rest of the system. Start by designing the types:

from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from typing import Mapping, Any

@dataclass(frozen=True)
class NormalizedEvent:
    event_id: str
    occurred_at: datetime
    source: str
    payload: Mapping[str, Any]


def normalize_event(raw: Mapping[str, Any]) -> NormalizedEvent:
    ...

This signature already encodes design decisions: the output is immutable (frozen=True), the timestamp is a datetime (not a string), and the payload is a mapping of arbitrary JSON-like data. Before implementing anything, you can now write tests against the contract and let a type checker validate call sites.

Then implement with explicit conversions and minimal ambiguity:

from datetime import timezone


def normalize_event(raw: Mapping[str, Any]) -> NormalizedEvent:
    event_id = str(raw["id"])
    occurred_at = datetime.fromisoformat(raw["occurred_at"]).astimezone(timezone.utc)
    source = str(raw.get("source", "unknown"))
    payload = raw.get("payload", {})
    if not isinstance(payload, Mapping):
        raise TypeError("payload must be a mapping")
    return NormalizedEvent(
        event_id=event_id,
        occurred_at=occurred_at,
        source=source,
        payload=payload,
    )

The type hints guided the implementation: you converted to the target types and added a runtime check where static typing cannot help (the dynamic shape of raw).

Making illegal states harder to represent

A major benefit of “executable design” is that you can encode constraints in types so that certain invalid states become unrepresentable (or at least harder to create). In Python, you cannot fully prevent invalid runtime values, but you can structure your APIs so that most internal code cannot accidentally produce them.

Use non-Optional types for required fields

If a field is required, do not annotate it as optional. Optional types are contagious: once a value is Optional[T], every consumer must handle None or narrow the type. Reserve Optional for values that are truly absent in valid states.

from dataclasses import dataclass

@dataclass
class UserProfile:
    user_id: str
    display_name: str
    bio: str | None  # genuinely optional

Now any function that requires a display name can accept str and remain simple, while only bio-aware code deals with None.
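
For instance, two small helpers building on UserProfile above (the function names are illustrative):

def banner_for(profile: UserProfile) -> str:
    # display_name is guaranteed to be a str, so no None handling is needed here.
    return f"=== {profile.display_name} ==="


def render_bio(profile: UserProfile) -> str:
    # Only bio-aware code pays the cost of dealing with None.
    return profile.bio if profile.bio is not None else "(no bio)"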

Use Literal to constrain modes and flags

When a parameter accepts a small set of string values, Literal turns a vague “stringly typed” design into a checkable contract.

from typing import Literal

ExportFormat = Literal["csv", "json", "parquet"]


def export_data(format: ExportFormat) -> bytes:
    ...

Now typos like "jsno" are caught by static analysis and IDE autocomplete becomes accurate.
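
A quick sketch of what that looks like at call sites, assuming the export_data signature above:

export_data("json")     # accepted: "json" is one of the Literal members
# export_data("jsno")   # rejected by the type checker; without Literal this typo
#                       # would only surface at runtime, if at all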

Use NewType or small wrappers for identifiers

When multiple IDs are all strings, it is easy to mix them up. NewType creates distinct static types without runtime overhead.

from typing import NewType

UserId = NewType("UserId", str)
OrderId = NewType("OrderId", str)


def load_user(user_id: UserId) -> dict:
    ...

def load_order(order_id: OrderId) -> dict:
    ...

At runtime these are still strings, but type checkers will flag passing an OrderId where a UserId is expected. This is a design win: you are encoding meaning, not just representation.
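
Usage is lightweight: a NewType is applied like a constructor, and mixing the two IDs up becomes a reported error (the example assumes the definitions above):

user_id = UserId("user-42")
order_id = OrderId("order-7")

load_user(user_id)      # accepted
# load_user(order_id)   # flagged by the type checker: OrderId is not UserId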

Protocols: design by behavior, not inheritance

Type hints can express “what an object can do” rather than “what it is”. Protocols (structural subtyping) let you define interfaces that any object can satisfy if it has the right methods/attributes. This is especially useful for dependency injection and testing: you can accept a protocol instead of a concrete class, making your design more flexible.

from datetime import datetime
from typing import Protocol

class Clock(Protocol):
    def now(self) -> datetime:
        ...

class SystemClock:
    def now(self) -> datetime:
        return datetime.now()


def make_timestamped_message(clock: Clock, message: str) -> str:
    return f"{clock.now().isoformat()} {message}"

Any object with a now() -> datetime method can be used, including a fake clock in tests. The protocol is an executable design artifact: it is checked by tools and guides implementers.

Step-by-step: introduce a protocol to decouple a dependency

  • Identify a dependency that is hard to test (time, network, filesystem, random).
  • Extract the minimal behavior you need into a Protocol.
  • Update your function/class to accept the protocol type.
  • Provide a production implementation and a test double, as in the FixedClock sketch below.

class FixedClock:
    def __init__(self, fixed: datetime) -> None:
        self._fixed = fixed

    def now(self) -> datetime:
        return self._fixed

With this, tests become deterministic without changing business logic.
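
A sketch of such a test, reusing FixedClock and make_timestamped_message from above (the test name is illustrative):

from datetime import datetime, timezone


def test_make_timestamped_message() -> None:
    # The fake clock satisfies the Clock protocol structurally; no inheritance needed.
    clock = FixedClock(datetime(2024, 1, 1, tzinfo=timezone.utc))
    assert make_timestamped_message(clock, "started") == "2024-01-01T00:00:00+00:00 started"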

Typed collections and generics to prevent shape drift

Many bugs come from “shape drift”: a list that sometimes contains strings and sometimes dicts, a mapping with inconsistent value types, or a function that returns different shapes depending on conditions. Generics and typed collections help you lock down these shapes.

Prefer Sequence/Mapping over list/dict in parameters

When you only need read-only behavior, accept Sequence[T] or Mapping[K, V]. This communicates intent and allows more inputs (tuples, other sequence types) while keeping the contract clear.

from typing import Sequence


def total_lengths(items: Sequence[str]) -> int:
    return sum(len(s) for s in items)

Use TypedDict for dict-like records

When you pass around dicts with known keys, TypedDict makes the design explicit and checkable without requiring a class.

from typing import TypedDict

class RawUser(TypedDict):
    id: str
    email: str
    is_active: bool


def send_welcome(user: RawUser) -> None:
    if user["is_active"]:
        ...

This reduces “mystery keys” and makes refactors safer: renaming a key becomes a type-checkable change.
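
Construction and access are both checked against the declared keys, assuming the RawUser and send_welcome definitions above:

user: RawUser = {"id": "u-1", "email": "ada@example.com", "is_active": True}
send_welcome(user)

# A type checker flags misspelled or missing keys, e.g.:
# bad_user: RawUser = {"id": "u-1", "emial": "ada@example.com", "is_active": True}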

Use type variables to preserve relationships

Sometimes the important design constraint is a relationship between input and output types. Type variables let you express that relationship so tools can verify it.

from typing import TypeVar, Iterable, Callable

T = TypeVar("T")


def first_match(items: Iterable[T], predicate: Callable[[T], bool]) -> T | None:
    for item in items:
        if predicate(item):
            return item
    return None

The function is generic: it works for any T, and the return type matches the input element type. This is executable design: the signature explains the behavior and prevents accidental loss of type information.
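
A usage sketch (building on first_match above) shows the relationship in action: the inferred return type follows the element type of the input.

numbers = [3, 14, 15, 92]
first_big = first_match(numbers, lambda n: n > 10)         # inferred as int | None

names = ["ada", "grace", "barbara"]
first_g = first_match(names, lambda s: s.startswith("g"))  # inferred as str | None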

Type narrowing: designing control flow that proves safety

Type checkers can narrow types based on control flow. If you write code that makes checks explicit, you get safer code with less casting. This encourages a design where validation and branching are clear and localized.

Use guard clauses to narrow Optional

def format_bio(bio: str | None) -> str:
    if bio is None:
        return "(no bio)"
    # here bio is narrowed to str
    return bio.strip()

Use isinstance checks to narrow unions

from typing import Union

Value = Union[int, str]


def normalize_value(v: Value) -> str:
    if isinstance(v, int):
        return str(v)
    return v.strip()

When you design data as a union, design the control flow that handles each case explicitly. This makes the code readable and type-checkable.

User-defined type guards for domain-specific checks

Sometimes you need a custom predicate that narrows a type. TypeGuard lets you express that a function not only returns bool, but also proves a narrower type when true.

from typing import Any, TypeGuard


def is_str_list(x: Any) -> TypeGuard[list[str]]:
    return isinstance(x, list) and all(isinstance(i, str) for i in x)


def join_if_str_list(x: Any) -> str | None:
    if not is_str_list(x):
        return None
    # here x is list[str]
    return ",".join(x)

This is a strong example of “executable design”: your predicate becomes a reusable proof for the type checker and a runtime check for safety.

Connecting type hints to runtime validation intentionally

Type hints alone do not validate external input. If you treat type hints as design, you decide where runtime validation is needed and keep it near boundaries. Pydantic models are a common way to do this: they can parse and validate inputs while being type-annotated for downstream code. The key design move is to validate once at the boundary and then pass strongly typed objects inward.

Step-by-step: validate boundary input and keep the core typed

Imagine you receive a dict from an HTTP request or message queue. Validate it into a typed model, then operate on the typed model.

from datetime import datetime
from pydantic import BaseModel, Field

class EventIn(BaseModel):
    id: str
    occurred_at: datetime
    source: str = "unknown"
    payload: dict = Field(default_factory=dict)


def handle_event(data: dict) -> None:
    event = EventIn.model_validate(data)
    # from here on, event.occurred_at is a datetime, etc.
    process_event(event)


def process_event(event: EventIn) -> None:
    ...

The type hints on EventIn are now both static documentation and runtime-enforced constraints. This reduces scattered checks and makes the rest of the code simpler.

Be explicit about “trusted” vs “untrusted” data

A practical pattern is to name types to reflect trust level. For example, EventIn (untrusted input) becomes Event (validated internal representation). Even if both are Pydantic models or dataclasses, the naming signals the design boundary and helps reviewers spot missing validation.
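
One way this could look, reusing the EventIn model from the previous example; the internal Event dataclass and the to_internal helper are illustrative names, not prescribed:

from dataclasses import dataclass
from datetime import datetime
from typing import Any, Mapping


@dataclass(frozen=True)
class Event:
    event_id: str
    occurred_at: datetime
    source: str
    payload: Mapping[str, Any]


def to_internal(event_in: EventIn) -> Event:
    # The single place where untrusted input becomes a trusted internal object.
    return Event(
        event_id=event_in.id,
        occurred_at=event_in.occurred_at,
        source=event_in.source,
        payload=event_in.payload,
    )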

Types as refactoring constraints

When you refactor, type hints can act like a safety harness. If you change a return type, rename a field, or alter a function contract, a type checker can point you to every affected call site. This is “executable design” in the maintenance phase: the design is not just written down, it is continuously checked.

Step-by-step: refactor with a type checker as a guide

  • Make the intended design change in the type signature first (e.g., change return type, rename a parameter, tighten a union).
  • Run the type checker to find all mismatches.
  • Update call sites and tests until the type checker is clean.
  • Only then consider removing temporary compatibility code.

Example: you decide a function should never return None. Update the signature and fix all call sites that assumed optionality.

def find_user_email(user_id: str) -> str:
    ...

If the implementation can fail, you now need a design decision: raise an exception, return a sentinel value, or return a result type. The type signature forces you to choose.
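
For example, a sketch of the exception-raising choice, with a hypothetical in-memory lookup standing in for real storage:

_EMAILS: dict[str, str] = {"u-1": "ada@example.com"}


class UserNotFoundError(LookupError):
    """Raised when no email exists for the given user id."""


def find_user_email(user_id: str) -> str:
    email = _EMAILS.get(user_id)
    if email is None:
        # The non-Optional return type forces an explicit failure path here.
        raise UserNotFoundError(user_id)
    return email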

Result types: making failure explicit

Python often uses exceptions for failures, but sometimes you want failures to be explicit in the type system, especially when failures are expected and should be handled. You can model this with unions or small result dataclasses. The design goal is to prevent silent failure and to make handling mandatory at call sites.

Union-based result

from dataclasses import dataclass

@dataclass(frozen=True)
class Ok:
    value: int

@dataclass(frozen=True)
class Err:
    message: str

Result = Ok | Err


def parse_int(s: str) -> Result:
    try:
        return Ok(int(s))
    except ValueError:
        return Err(f"Not an int: {s!r}")


def use_value(text: str) -> int:
    result = parse_int(text)
    if isinstance(result, Err):
        raise ValueError(result.message)
    return result.value

Here, the type forces you to consider both cases. This is especially useful in pipelines where you want to accumulate errors rather than raise immediately.
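
A small pipeline sketch, reusing Ok, Err, and parse_int from above (parse_all is an illustrative name):

def parse_all(texts: list[str]) -> tuple[list[int], list[str]]:
    # Collect successes and error messages instead of stopping at the first failure.
    values: list[int] = []
    errors: list[str] = []
    for text in texts:
        result = parse_int(text)
        if isinstance(result, Ok):
            values.append(result.value)
        else:
            errors.append(result.message)
    return values, errors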

Over-annotation vs design signal: choosing what to type

If you annotate everything indiscriminately, types become noise. As a design tool, type hints should highlight decisions that matter: public interfaces, boundaries, and tricky transformations. Internal local variables often do not need explicit annotations if they are inferred correctly, but sometimes adding an annotation clarifies intent or catches a subtle bug.

Prefer annotating interfaces

  • Function signatures (parameters and return types).
  • Public methods and properties.
  • Dataclass/Pydantic model fields.
  • Module-level constants.

Annotate locals when it prevents ambiguity

from typing import Any


def load_config(raw: dict[str, Any]) -> dict[str, str]:
    # Without this annotation, you might accidentally return non-str values.
    config: dict[str, str] = {}
    for k, v in raw.items():
        if isinstance(v, str):
            config[k] = v
    return config

The local annotation makes the intended output shape explicit and helps a type checker flag accidental assignments.

Designing with gradual typing in mind

Most real projects are partially typed. Treating type hints as executable design does not require perfection; it requires a strategy. Gradual typing works best when you prioritize high-leverage areas and tighten types over time.

Step-by-step: introduce types strategically

  • Start with the most-used modules and the most-called functions.
  • Annotate boundary layers (parsing, adapters, serialization) to reduce untyped data entering the core.
  • Replace Any with more precise types where it causes bugs or confusion.
  • Use protocols and typed dicts to avoid large refactors when you cannot introduce classes.

A practical rule is: allow Any at the edges when necessary, but do not let it leak inward. If a function returns Any, everything downstream becomes harder to reason about.
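
One way to apply the rule at a boundary (load_settings is a hypothetical helper; json.loads is typed as returning Any):

import json
from pathlib import Path
from typing import Any


def load_settings(path: str) -> dict[str, str]:
    raw: Any = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(raw, dict):
        raise TypeError("settings file must contain a JSON object")
    # Convert to a precise type here so Any does not leak into the core.
    settings: dict[str, str] = {}
    for key, value in raw.items():
        if isinstance(value, str):
            settings[str(key)] = value
    return settings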

Common design pitfalls and how types reveal them

“Boolean trap” parameters

Functions that take multiple booleans are hard to read and easy to misuse. Types can push you toward clearer designs by replacing booleans with Literal modes or small enums.

from typing import Literal

Mode = Literal["strict", "lenient"]


def parse_record(text: str, mode: Mode) -> dict:
    ...

Overly broad unions

A union like str | int | float | dict often indicates unclear design. If you see this, consider whether you need separate functions, a protocol, or a normalized internal type. Type hints make this smell visible early.

Returning different shapes based on flags

If a function returns different types depending on a flag, call sites become complicated. Consider splitting into two functions or using overloads to make the design explicit.

from typing import overload, Literal

@overload
def get_value(key: str, *, default: None = None) -> str | None: ...

@overload
def get_value(key: str, *, default: str) -> str: ...


def get_value(key: str, *, default: str | None = None) -> str | None:
    ...

Overloads let the type checker understand the relationship between inputs and outputs, turning a subtle contract into an executable one.

Now answer the exercise about the content:

When treating type hints as an executable design tool, where should runtime validation be applied to best protect the core of the system?

Answer: Static types are checked by tools, not the interpreter. Runtime validation is most effective at boundaries (I/O, external payloads) to convert untrusted input into trusted, typed objects so the internal core can stay simpler and safer.

Next chapter

Dataclasses for Clean Domain Objects
