Why error modeling matters
Validation is not only about rejecting bad inputs; it is also about communicating what went wrong in a way that helps the caller fix it. In practice, the “caller” might be a REST client, a CLI user, a batch job, or another internal service. If your system returns vague messages like “invalid request” or throws raw exceptions, you create friction: debugging takes longer, support tickets increase, and clients implement brittle parsing of error strings.
Error modeling is the deliberate design of error types, error payloads, and validation feedback so that failures are predictable, machine-readable, and actionable. The goal is to make failures as well-structured as successes: consistent codes, stable fields, and clear mapping to the input that caused the problem.
This chapter focuses on designing error models and validation feedback in Python systems that use dataclasses, Pydantic, and type hints. We will not re-explain validation basics; instead, we will design the shape of errors, how to aggregate them, and how to present them across boundaries (HTTP, CLI, internal APIs) without leaking implementation details.
Principles for validation feedback
1) Errors should be structured, not stringly-typed
A human-readable message is useful, but it should not be the primary contract. Prefer stable identifiers (error codes) and structured fields (path, expected, actual, constraints) so clients can reliably react.
- Good: code="too_short", path=["user", "email"], min_length=5
- Risky: message="Email is too short" (clients start parsing text)
2) Errors should point to a location
When validating nested input, the consumer needs to know which field failed. Use a path representation that works for objects and arrays. A common approach is a list of segments, where each segment is a string key or integer index.
- Example path: ["items", 2, "quantity"]
- Alternative: JSON Pointer string: "/items/2/quantity"
Pick one and standardize it. Lists are easy to build and transform; JSON Pointer is easy to display and aligns with RFC 6901.
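If you standardize on segment lists internally, rendering a JSON Pointer for display is mechanical. A minimal sketch, with the illustrative name to_json_pointer:

from typing import Sequence, Union

def to_json_pointer(path: Sequence[Union[str, int]]) -> str:
    # Per RFC 6901, "~" becomes "~0" and "/" becomes "~1" inside a token.
    tokens = [str(seg).replace("~", "~0").replace("/", "~1") for seg in path]
    return "".join("/" + token for token in tokens)

For example, to_json_pointer(["items", 2, "quantity"]) returns "/items/2/quantity", and an empty path yields "" (the whole document).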
3) Support multiple errors at once
Fail-fast is fine for internal invariants, but for user-facing input you often want to report all issues in one response. This reduces the “fix one error, resubmit, discover next error” loop.
Design your error model to hold a list of issues, not just one.
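Concretely, that means validation code accumulates issues and returns them together instead of stopping at the first failure. A minimal sketch using plain dicts (the structured ValidationIssue type is introduced later in this chapter; validate_signup is a hypothetical example):

def validate_signup(data: dict) -> list[dict]:
    # Collect every problem; an empty list means the input is valid.
    issues: list[dict] = []
    if not data.get("email"):
        issues.append({"code": "required", "path": ["email"]})
    elif len(data["email"]) < 5:
        issues.append({"code": "too_short", "path": ["email"], "details": {"min_length": 5}})
    if not data.get("name"):
        issues.append({"code": "required", "path": ["name"]})
    return issues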
4) Separate developer diagnostics from user-facing messaging
Operational debugging needs stack traces, exception types, and internal context. Users need concise and safe messages. Your error payload can include both, but keep them separated and ensure sensitive details are not exposed outside trusted boundaries.
- User message: safe, actionable, no secrets
- Debug info: behind a feature flag, only in logs, or only in internal responses
5) Make error codes stable and versionable
Error codes become part of your API contract. Treat them like public identifiers: stable naming, documented meaning, and careful changes. If you must change semantics, introduce a new code rather than reusing an old one.
A practical error model for validation
Start with a small set of reusable types. The following example uses dataclasses to define a transport-agnostic error representation. You can serialize it to JSON for HTTP, print it for CLI, or attach it to logs.
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Any, Mapping, Sequence, Union
PathSegment = Union[str, int]
@dataclass(frozen=True)
class ValidationIssue:
code: str
message: str
path: Sequence[PathSegment] = field(default_factory=tuple)
details: Mapping[str, Any] = field(default_factory=dict)
@dataclass(frozen=True)
class ValidationErrorReport(Exception):
issues: Sequence[ValidationIssue]
def __str__(self) -> str:
# Keep __str__ readable for logs/CLI; do not rely on it as a contract.
return f"Validation failed with {len(self.issues)} issue(s)"Key design choices:
- ValidationIssue is the atomic unit: one problem at one location.
- code is stable and machine-readable.
- message is human-readable and can be localized later.
- details holds structured metadata (min/max, allowed values, pattern, etc.).
- ValidationErrorReport aggregates issues and can be raised as an exception in internal flows.
Designing a code system
Define a small vocabulary of codes and reuse them across fields and models. A typical set for input validation:
- required
- type_mismatch
- invalid_format
- too_short / too_long
- out_of_range
- not_allowed
- not_unique
- conflict
- invalid_state (cross-field rule)
Keep codes generic; put specifics in details. For example, instead of email_too_short, use too_short with details {"min_length": 6} and a path pointing to the email field.
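As a concrete illustration using the ValidationIssue dataclass defined above (check_email_length is a hypothetical helper):

def check_email_length(email: str, min_length: int = 6) -> list[ValidationIssue]:
    # Generic code; the field is identified by the path, the limit by details.
    if len(email) < min_length:
        return [
            ValidationIssue(
                code="too_short",
                message=f"Must be at least {min_length} characters",
                path=("email",),
                details={"min_length": min_length},
            )
        ]
    return []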
Step-by-step: mapping validation failures to your error model
Step 1: Decide where validation happens and what it returns
Even if you validate in multiple places, you should normalize failures into a single representation at the boundary where you respond to callers. That normalization layer is where you convert framework-specific errors (Pydantic errors, custom exceptions, database constraint violations) into ValidationIssue objects.
Step 2: Normalize paths
Different sources represent paths differently. Pydantic uses tuples like ("items", 2, "quantity"). A JSON Schema validator might use JSON Pointer. Normalize them into your chosen format (here: list/tuple of segments).
Step 3: Normalize codes
Frameworks often have many granular error types. Map them into your stable code set. Preserve original information in details if useful.
Step 4: Provide safe messages
Messages should be actionable and avoid internal jargon. If you need localization, treat code + details as the source of truth and generate messages at the edge.
Example: converting Pydantic errors to a stable report
Pydantic exposes a structured list of errors. The exact shape differs slightly between major versions, but the idea is consistent: each error contains a location, a type, and a message. The adapter below focuses on the common fields and maps them into our model.
from typing import Any
try:
# Pydantic v2
from pydantic import ValidationError as PydanticValidationError
except Exception: # pragma: no cover
PydanticValidationError = Exception # type: ignore
CODE_MAP = {
"missing": "required",
"string_too_short": "too_short",
"string_too_long": "too_long",
"int_parsing": "type_mismatch",
"float_parsing": "type_mismatch",
"value_error": "invalid_format",
}
def _normalize_loc(loc: Any) -> tuple[PathSegment, ...]:
if loc is None:
return ()
if isinstance(loc, (list, tuple)):
return tuple(loc)
return (str(loc),)
def pydantic_to_report(err: PydanticValidationError) -> ValidationErrorReport:
issues: list[ValidationIssue] = []
# v2: err.errors() returns list[dict]
for e in err.errors():
loc = _normalize_loc(e.get("loc"))
pyd_type = e.get("type", "value_error")
code = CODE_MAP.get(pyd_type, "invalid")
details = dict(e.get("ctx") or {})
details["pydantic_type"] = pyd_type
issues.append(
ValidationIssue(
code=code,
message=e.get("msg", "Invalid value"),
path=loc,
details=details,
)
)
    return ValidationErrorReport(issues=tuple(issues))

Notes:
- The mapping table is intentionally small. Expand it based on the error types you actually see.
- We store the original Pydantic type in details for debugging and analytics.
- We do not expose raw exception strings as the primary contract.
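To see the adapter in context, here is a usage sketch assuming Pydantic v2 is installed (the Order and OrderItem models are illustrative):

from pydantic import BaseModel

class OrderItem(BaseModel):
    sku: str
    quantity: int

class Order(BaseModel):
    items: list[OrderItem]

try:
    Order.model_validate({"items": [{"sku": "A1", "quantity": 2}, {"quantity": 3}]})
except PydanticValidationError as exc:
    report = pydantic_to_report(exc)
    for issue in report.issues:
        print(list(issue.path), issue.code)  # ['items', 1, 'sku'] required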
Designing HTTP error payloads
For an HTTP API, you typically want a top-level envelope with a stable shape. A common pattern:
- error: a short category (e.g., validation_error)
- issues: list of field-level issues
- request_id: correlation id for support/logs
Example JSON shape (shown as Python dict for clarity):
def report_to_http_payload(report: ValidationErrorReport, request_id: str | None = None) -> dict:
return {
"error": "validation_error",
"request_id": request_id,
"issues": [
{
"code": i.code,
"message": i.message,
"path": list(i.path),
"details": dict(i.details),
}
for i in report.issues
],
    }

Design tips (a framework integration sketch follows the list):
- Keep issues always present (even if empty) to simplify clients.
- Use consistent HTTP status codes (often 400 for syntactic/field validation, 422 for semantic validation, depending on your conventions).
- Do not include stack traces or internal exception names in the payload.
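If you serve the payload over HTTP, the boundary handler can stay thin. A minimal sketch assuming FastAPI; the handler name and the x-request-id header are illustrative:

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()

@app.exception_handler(ValidationErrorReport)
async def handle_validation_report(request: Request, exc: ValidationErrorReport) -> JSONResponse:
    # Reuse whatever correlation id your middleware or gateway assigned.
    request_id = request.headers.get("x-request-id")
    return JSONResponse(status_code=400, content=report_to_http_payload(exc, request_id=request_id))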
CLI and batch feedback design
For CLI tools, the same ValidationErrorReport can be rendered differently. The key is to keep it readable and to show paths clearly.
def format_path(path: Sequence[PathSegment]) -> str:
if not path:
return "<root>"
out = []
for seg in path:
if isinstance(seg, int):
out.append(f"[{seg}]")
else:
if out:
out.append(".")
out.append(seg)
return "".join(out)
def report_to_cli_text(report: ValidationErrorReport) -> str:
lines = []
for issue in report.issues:
lines.append(f"- {format_path(issue.path)}: {issue.code} ({issue.message})")
return "\n".join(lines)Batch jobs often need both human-readable output and machine-readable logs. You can log the JSON payload while printing a concise summary to stderr.
Cross-field and business-rule validation: modeling global issues
Some validations are not tied to a single field: “end_date must be after start_date”, “either email or phone is required”, “shipping address required when delivery_method=shipping”. These are still validation issues, but the path might be:
- the root path [] to indicate a global issue, or
- a synthetic path like ["__root__"], or
- multiple issues, one per involved field, plus a global summary
A practical approach is to emit:
- one global issue with code invalid_state and details listing involved fields, and
- optionally field-level issues to highlight where the user should look
def date_order_issues(start: str | None, end: str | None) -> list[ValidationIssue]:
if start and end and end <= start:
return [
ValidationIssue(
code="invalid_state",
message="end_date must be after start_date",
path=(),
details={"fields": ["start_date", "end_date"]},
)
]
    return []

Handling collections: index-specific errors
When validating lists, index-specific paths are essential. If the third item has a problem, the client should be able to highlight exactly that row.
Example issue path: ["items", 2, "sku"]. This supports UI patterns like “scroll to row 3 and highlight SKU”.
When items can be reordered, consider including a stable identifier in details (e.g., {"item_id": "..."}) so clients can match errors even if indices change.
Errors from external systems: database constraints and uniqueness
Not all validation failures come from input parsing. Some are discovered when interacting with external systems:
- Unique constraint violation (email already exists)
- Foreign key violation (referenced object missing)
- Optimistic concurrency conflict
These should still be expressed in your stable error model. The key is to avoid leaking vendor-specific messages (e.g., raw SQL error text). Map them to codes like not_unique, not_found, or conflict, and attach safe details.
def unique_violation(field: str, value: str) -> ValidationIssue:
return ValidationIssue(
code="not_unique",
message=f"{field} must be unique",
path=(field,),
details={"value": value},
    )

Exception taxonomy: when to raise vs return
Internally, you may choose between returning a ValidationErrorReport (as a value) or raising it (as an exception). The design choice should be consistent per layer:
- At boundaries (HTTP handlers, CLI entrypoints): catch exceptions and convert to payloads/exit codes.
- In application services: raising can be convenient to abort flows and bubble up a report.
- In domain logic: prefer precise exceptions for invariants, then translate them into validation issues at the boundary if they are user-correctable.
A useful pattern is to treat “user-correctable” problems as validation issues, and “programmer/operational” problems as internal errors. If a failure is not actionable by the caller, it should not be presented as a validation issue.
Step-by-step: building a validation feedback pipeline
Step 1: Define your stable error schema
Choose fields and naming conventions. Decide on path format, code vocabulary, and whether details is allowed to contain arbitrary keys.
Step 2: Implement adapters from common sources
Create small functions that convert:
- Pydantic validation errors
- Custom domain exceptions that represent user-correctable problems
- Database constraint errors
Each adapter should output ValidationErrorReport or a list of ValidationIssue.
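As an example of the last adapter, a sketch assuming SQLAlchemy; detecting which constraint fired is driver-specific, so the constraint-to-path mapping here is illustrative:

from sqlalchemy.exc import IntegrityError

# Map constraint names (as defined in your schema) to field paths.
CONSTRAINT_FIELDS = {"uq_users_email": ("email",)}

def integrity_error_to_report(exc: IntegrityError) -> ValidationErrorReport:
    raw = str(exc.orig)  # vendor-specific text; used only for matching, never exposed
    for constraint, path in CONSTRAINT_FIELDS.items():
        if constraint in raw:
            issue = ValidationIssue(
                code="not_unique",
                message="Value must be unique",
                path=path,
                details={"constraint": constraint},
            )
            return ValidationErrorReport(issues=(issue,))
    # Unknown constraint: report a generic conflict without leaking the raw error.
    return ValidationErrorReport(
        issues=(ValidationIssue(code="conflict", message="Conflicting data"),)
    )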
Step 3: Implement renderers for each channel
Renderers convert the stable report into:
- HTTP JSON payload + status code
- CLI text + exit code
- Structured logs (JSON)
Keep renderers dumb: no business logic, only formatting and policy (e.g., hide details in production).
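For the logging renderer, a minimal sketch; the include_details flag stands in for whatever policy mechanism decides what production logs may contain:

def report_to_log_record(report: ValidationErrorReport, include_details: bool = True) -> dict:
    # Flat, queryable structure for structured (JSON) logging backends.
    return {
        "event": "validation_failed",
        "issue_count": len(report.issues),
        "issues": [
            {
                "code": i.code,
                "path": list(i.path),
                **({"details": dict(i.details)} if include_details else {}),
            }
            for i in report.issues
        ],
    }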
Step 4: Add correlation and observability hooks
Include a request id in HTTP responses and logs. Consider adding an error_id or hashing the set of codes for analytics. Avoid logging raw user input if it may contain secrets; log paths and codes instead.
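A small sketch of the hashing idea; fingerprinting only codes and paths keeps user values out of analytics (format_path is the CLI helper shown earlier):

import hashlib

def report_fingerprint(report: ValidationErrorReport) -> str:
    # Sort so that issue order does not change the fingerprint.
    parts = sorted(f"{issue.code}@{format_path(issue.path)}" for issue in report.issues)
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:16]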
Designing for localization and UX
If you anticipate multiple languages or different UX surfaces, treat message as a presentation layer concern. Two common strategies:
- Server-generated messages: simplest; server returns localized messages based on request locale.
- Client-generated messages: server returns code + details; client maps to localized strings.
Even if you keep server-generated messages, still include stable codes so clients can implement behavior (e.g., highlight fields, show specific help) without parsing text.
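If messages are generated at the edge, a small template catalog keyed by code is usually enough. A sketch; the catalog entries are illustrative:

MESSAGE_TEMPLATES = {
    "required": "This field is required.",
    "too_short": "Must be at least {min_length} characters.",
    "out_of_range": "Must be between {min} and {max}.",
}

def render_message(issue: ValidationIssue) -> str:
    template = MESSAGE_TEMPLATES.get(issue.code)
    if template is None:
        return issue.message  # fall back to the server-provided text
    try:
        return template.format(**issue.details)
    except KeyError:
        return issue.message  # details lack an expected key; stay safe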
Security and privacy considerations
Validation feedback can accidentally leak sensitive information. Common pitfalls:
- Echoing secrets back in error details (passwords, tokens)
- Revealing whether an account exists (“email already registered”) in contexts where that is sensitive
- Returning internal exception messages that include table names or stack traces
Mitigations:
- Redact sensitive fields before putting values into details (a small redaction helper sketch follows this list).
- Use generic messages where necessary, while still providing a code (e.g., conflict).
- Keep debug metadata in logs, not in public responses.
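A small redaction helper sketch; the set of sensitive key names is an assumption to adapt to your domain:

from typing import Any, Mapping

SENSITIVE_KEYS = {"password", "token", "secret", "authorization"}

def redact_details(details: Mapping[str, Any]) -> dict[str, Any]:
    # Replace sensitive values rather than dropping the keys, so clients can
    # still see which fields were considered.
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in details.items()
    }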
Testing error contracts
Error payloads are part of your API contract and deserve tests. Focus on stability:
- Codes do not change unexpectedly
- Paths are correct for nested structures and lists
- Multiple issues are returned when expected
- Details contain expected keys and do not contain sensitive values
def test_pydantic_error_mapping_has_stable_code_and_path():
# Pseudocode: create a model that fails, then map.
try:
raise Exception("simulate")
except Exception:
report = ValidationErrorReport(
issues=(
ValidationIssue(code="required", message="Field required", path=("email",)),
)
)
assert report.issues[0].code == "required"
    assert list(report.issues[0].path) == ["email"]

In real tests, generate actual Pydantic errors and run them through your adapter. Snapshot testing can be effective for full payloads, but be careful to avoid brittle snapshots that include variable messages. Prefer asserting on codes, paths, and key details.
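An example of such tests, assuming Pydantic v2 and pytest; the UserIn model and the _report_from helper exist only for illustration:

import pytest
from pydantic import BaseModel, Field, ValidationError

class UserIn(BaseModel):
    email: str = Field(min_length=5)

def _report_from(data: dict) -> ValidationErrorReport:
    with pytest.raises(ValidationError) as exc_info:
        UserIn.model_validate(data)
    return pydantic_to_report(exc_info.value)

def test_missing_field_maps_to_required():
    report = _report_from({})
    assert report.issues[0].code == "required"
    assert list(report.issues[0].path) == ["email"]

def test_short_value_maps_to_too_short_with_min_length():
    report = _report_from({"email": "a@b"})
    assert report.issues[0].code == "too_short"
    assert report.issues[0].details.get("min_length") == 5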