Why validate and serialize?
Robust APIs treat all external input as untrusted. Validation ensures inputs match expected types and constraints (e.g., integers in range, strings in a format), while serialization ensures outputs are consistently shaped and safe (e.g., datetimes formatted, internal fields excluded). A schema approach centralizes these rules so every endpoint behaves consistently.
In this chapter, we will use a Pydantic-style pattern because it provides strong typing, defaults, nested validation, and clear error details. The same concepts map directly to Marshmallow (fields, load/dump, unknown handling), but the examples below focus on one coherent approach.
Install and choose a schema layer
Add Pydantic to your service:
pip install "pydantic[email]"

The email extra pulls in email-validator, which the EmailStr type requires. We will implement three building blocks:
- Input schemas for query params and JSON bodies
- Output schemas for response serialization
- A validation helper that parses safely and emits consistent validation errors
Schema patterns: types, formats, ranges, required fields
Define reusable schemas
Create a module (e.g., schemas.py) with request/response models. Use strict types to avoid surprising coercions (e.g., rejecting "123" when an integer is required).
from __future__ import annotations
from datetime import datetime
from typing import Any, Dict, Optional
from pydantic import BaseModel, ConfigDict, Field, EmailStr, StrictInt, StrictStr
from pydantic import conint, constr
# Common base: forbid unknown fields by default
class Schema(BaseModel):
    model_config = ConfigDict(extra="forbid")

class AddressIn(Schema):
    line1: StrictStr = Field(min_length=1, max_length=200)
    city: StrictStr = Field(min_length=1, max_length=80)
    postal_code: constr(pattern=r"^[0-9A-Za-z\- ]{3,12}$")

class UserCreateIn(Schema):
    email: EmailStr
    name: StrictStr = Field(min_length=1, max_length=80)
    age: conint(ge=13, le=120)
    address: AddressIn

class UserUpdateIn(Schema):
    # For full updates (PUT): still required fields
    name: StrictStr = Field(min_length=1, max_length=80)
    age: conint(ge=13, le=120)
    address: AddressIn

class UserPatchIn(Schema):
    # For partial updates (PATCH): all optional
    name: Optional[StrictStr] = Field(default=None, min_length=1, max_length=80)
    age: Optional[conint(ge=13, le=120)] = None
    address: Optional[AddressIn] = None

class UserOut(Schema):
    id: StrictInt
    email: EmailStr
    name: StrictStr
    age: StrictInt
    created_at: datetime
    address: Dict[str, Any]
    # Example: allow output to be built from ORM objects/dicts
    model_config = ConfigDict(from_attributes=True)

class UserListQuery(Schema):
    # Query params arrive as strings; we can still parse them, but be explicit.
    # If you want strict rejection of "1" for int, keep StrictInt and convert manually.
    page: conint(ge=1, le=1000) = 1
    per_page: conint(ge=1, le=100) = 20
    sort: constr(pattern=r"^(created_at|email|name)$") = "created_at"
    order: constr(pattern=r"^(asc|desc)$") = "asc"

What this gives you:
- Required fields: fields without defaults are required (e.g., email in UserCreateIn).
- Types: EmailStr, StrictStr, StrictInt.
- Ranges: conint(ge=13, le=120).
- Formats: regex via constr(pattern=...).
- Unknown fields: extra="forbid" rejects unexpected keys.
- Defaults: query params default to page=1, per_page=20, etc.
Consistent validation error responses
Rather than letting schema exceptions bubble up in different shapes, normalize them into one predictable payload. The error-handling chapter covered general error response patterns; here we focus on the validation-specific payload structure and how to produce it from schema errors.
A small validation helper
Create a helper that validates JSON bodies and query params and raises a single custom exception your existing error layer can render consistently.
from typing import Any, Dict, Optional, Type, TypeVar
from pydantic import BaseModel, ValidationError

T = TypeVar("T", bound=BaseModel)

class RequestValidationError(Exception):
    def __init__(self, *, location: str, errors: list[dict[str, Any]]):
        self.location = location  # "query" or "body"
        self.errors = errors
        super().__init__("Request validation failed")

def _normalize_pydantic_errors(err: ValidationError) -> list[dict[str, Any]]:
    normalized: list[dict[str, Any]] = []
    for e in err.errors():
        # e example: {"type": "missing", "loc": ("email",), "msg": "Field required", "input": {...}}
        loc = e.get("loc", ())
        normalized.append(
            {
                "path": "/" + "/".join(str(p) for p in loc),
                "code": e.get("type"),
                "message": e.get("msg"),
            }
        )
    return normalized

def parse_json(model: Type[T], payload: Any) -> T:
    try:
        return model.model_validate(payload)
    except ValidationError as e:
        raise RequestValidationError(location="body", errors=_normalize_pydantic_errors(e))

def parse_query(model: Type[T], args: Dict[str, Any]) -> T:
    # Flask's request.args is a MultiDict; convert to a plain dict.
    # If you need repeated params, handle getlist() explicitly.
    try:
        return model.model_validate(dict(args))
    except ValidationError as e:
        raise RequestValidationError(location="query", errors=_normalize_pydantic_errors(e))

Example error payload
When validation fails, return a consistent structure such as:
{
  "error": {
    "type": "validation_error",
    "location": "body",
    "details": [
      {"path": "/email", "code": "value_error", "message": "value is not a valid email address"},
      {"path": "/age", "code": "greater_than_equal", "message": "Input should be greater than or equal to 13"}
    ]
  }
}

How you map RequestValidationError to an HTTP 400 response depends on your existing error response system; the key is that details is always a list of objects with path, code, and message.
Safe parsing, defaults, and strict/unknown field handling
Safe JSON parsing
When reading JSON, avoid assuming it exists or is an object. Use Flask’s JSON parsing in a safe mode and validate the result.
from flask import request

payload = request.get_json(silent=True)
# payload can be None, a list, a dict, etc.
user_in = parse_json(UserCreateIn, payload)

If payload is None or not a dict, Pydantic will raise a validation error that your helper normalizes.
Defaults for query params
Defaults belong in the schema, not scattered across routes. If page is omitted, UserListQuery.page becomes 1.
q = parse_query(UserListQuery, request.args)
# q.page, q.per_page, q.sort, q.order are now validated and defaulted

Strict vs permissive behavior
Decide how strict you want to be:
- Unknown fields: extra="forbid" rejects unexpected keys (good for catching client bugs and preventing silent acceptance).
- Type coercion: Pydantic can coerce strings to ints; if you want to reject that, use strict types (e.g., StrictInt) or pre-parse query params yourself.
For query params specifically, many APIs accept ?page=2 as a string and parse it as an int. If you want that behavior, keep conint (non-strict). If you want strict rejection, use StrictInt and convert request.args values explicitly before validation.
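If you choose the strict route, one possible pre-parse step looks like this (coerce_int_params is an illustrative helper, not part of any library):

```python
from typing import Any, Mapping

def coerce_int_params(args: Mapping[str, str], int_keys: frozenset[str]) -> dict[str, Any]:
    """Convert selected query-string values to int before strict validation."""
    converted: dict[str, Any] = {}
    for key, value in args.items():
        if key in int_keys and value.lstrip("-").isdigit():
            converted[key] = int(value)
        else:
            # Leave non-numeric values untouched so the schema (not this
            # helper) reports the type error with a proper detail entry.
            converted[key] = value
    return converted

# "?page=2&per_page=50&sort=name" arrives as strings
print(coerce_int_params({"page": "2", "per_page": "50", "sort": "name"},
                        frozenset({"page", "per_page"})))
# -> {'page': 2, 'per_page': 50, 'sort': 'name'}
```

The helper deliberately passes malformed values through unchanged, so a request like ?page=abc still produces a normal validation error detail rather than an unhandled exception.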
Create endpoint example (POST): validate nested JSON and serialize output
Route logic using schemas
from flask import Blueprint, jsonify, request

bp = Blueprint("users", __name__)

@bp.post("/users")
def create_user():
    user_in = parse_json(UserCreateIn, request.get_json(silent=True))
    # Example persistence (pseudo-code):
    # user = User(
    #     email=user_in.email,
    #     name=user_in.name,
    #     age=user_in.age,
    #     address=user_in.address.model_dump(),
    # )
    # db.session.add(user); db.session.commit()
    user = {
        "id": 123,
        "email": user_in.email,
        "name": user_in.name,
        "age": user_in.age,
        "created_at": "2026-01-16T10:00:00Z",
        "address": user_in.address.model_dump(),
    }
    out = UserOut.model_validate(user)
    return jsonify(out.model_dump(mode="json")), 201

Notes:
- address is validated as a nested object; a missing postal_code will produce an error path like /address/postal_code.
- model_dump(mode="json") ensures JSON-friendly serialization (e.g., datetimes become ISO strings).
Update endpoint examples: PUT vs PATCH
Full update (PUT): required fields
@bp.put("/users/<int:user_id>")
def update_user(user_id: int):
    user_in = parse_json(UserUpdateIn, request.get_json(silent=True))
    # Replace all mutable fields (pseudo-code):
    # user = User.query.get_or_404(user_id)
    # user.name = user_in.name
    # user.age = user_in.age
    # user.address = user_in.address.model_dump()
    # db.session.commit()
    updated = {
        "id": user_id,
        "email": "existing@example.com",
        "name": user_in.name,
        "age": user_in.age,
        "created_at": "2026-01-01T09:00:00Z",
        "address": user_in.address.model_dump(),
    }
    return jsonify(UserOut.model_validate(updated).model_dump(mode="json"))

Partial update (PATCH): optional fields only
For PATCH, validate that provided fields are valid, but do not require all fields. Use a schema where fields are optional and then apply only those present.
@bp.patch("/users/<int:user_id>")
def patch_user(user_id: int):
    patch_in = parse_json(UserPatchIn, request.get_json(silent=True))
    changes = patch_in.model_dump(exclude_unset=True)
    # changes contains only keys actually provided by the client
    # Example: {"age": 40} or {"address": {"city": "X", ...}}
    # pseudo-code:
    # user = User.query.get_or_404(user_id)
    # if "name" in changes: user.name = changes["name"]
    # if "age" in changes: user.age = changes["age"]
    # if "address" in changes: user.address = changes["address"]
    # db.session.commit()
    patched = {
        "id": user_id,
        "email": "existing@example.com",
        "name": changes.get("name", "Existing Name"),
        "age": changes.get("age", 30),
        "created_at": "2026-01-01T09:00:00Z",
        "address": changes.get("address", {"line1": "Old", "city": "Old", "postal_code": "000"}),
    }
    return jsonify(UserOut.model_validate(patched).model_dump(mode="json"))

Important detail: exclude_unset=True distinguishes "missing" from "explicitly set to null". If you want to allow clients to clear a field by sending null, keep the field type as Optional[...] and handle None explicitly in your patch application logic.
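The missing-vs-null distinction can be seen in isolation with a tiny model (ProfilePatch is an illustrative, simplified stand-in for UserPatchIn):

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, StrictStr

class ProfilePatch(BaseModel):
    model_config = ConfigDict(extra="forbid")
    nickname: Optional[StrictStr] = None  # client may send null to clear it

omitted = ProfilePatch.model_validate({})                  # field never sent
cleared = ProfilePatch.model_validate({"nickname": None})  # explicit null

# exclude_unset drops fields the client never provided...
print(omitted.model_dump(exclude_unset=True))   # {}
# ...but keeps an explicit null, so patch logic can clear the column
print(cleared.model_dump(exclude_unset=True))   # {'nickname': None}
```

Both payloads validate, but only the explicit null survives exclude_unset=True, which is exactly the signal the patch logic needs.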
Validating query params (filtering, pagination) with consistent defaults
Query params are often a source of subtle bugs (negative page numbers, invalid sort keys). Put the rules in a schema and validate once.
@bp.get("/users")
def list_users():
    q = parse_query(UserListQuery, request.args)
    # pseudo-code:
    # query = User.query
    # query = query.order_by(getattr(User, q.sort).desc() if q.order == "desc" else getattr(User, q.sort).asc())
    # items = query.paginate(page=q.page, per_page=q.per_page)
    payload = {
        "page": q.page,
        "per_page": q.per_page,
        "sort": q.sort,
        "order": q.order,
        "items": [],
    }
    return jsonify(payload)

Strict handling of unknown fields (and when to relax it)
With extra="forbid", sending {"name":"A","age":20,"role":"admin"} to UserCreateIn will fail validation. This is usually desirable for public APIs because:
- Clients learn quickly when they send unsupported fields.
- You avoid silently ignoring typos (e.g., emali instead of email).
- You reduce the risk of accidentally accepting sensitive fields later.
If you have a transitional period where clients may send extra keys, you can relax per-schema:
class LenientSchema(BaseModel):
    model_config = ConfigDict(extra="ignore")

Use this sparingly and intentionally.
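A quick check of the lenient behavior (LenientIn and legacy_field are illustrative names):

```python
from pydantic import BaseModel, ConfigDict

class LenientIn(BaseModel):
    model_config = ConfigDict(extra="ignore")
    name: str

# The unknown legacy_field key is silently dropped instead of rejected
m = LenientIn.model_validate({"name": "A", "legacy_field": "x"})
print(m.model_dump())  # {'name': 'A'}
```

Note that the extra key simply disappears; clients get no feedback, which is why the forbidding default is safer for public APIs.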
Nested objects and error paths
Nested validation is where schema approaches shine. If postal_code is invalid, the normalized error path should point precisely to the nested field.
Example invalid payload:
{
  "email": "a@example.com",
  "name": "A",
  "age": 20,
  "address": {"line1": "X", "city": "Y", "postal_code": "!!!"}
}

Normalized error detail should include:

{"path": "/address/postal_code", "code": "string_pattern_mismatch", "message": "String should match pattern '^[0-9A-Za-z\\- ]{3,12}$'"}

Tests: assert validation error payload structure
Write tests that verify both the HTTP status and the shape of the error payload. This prevents accidental breaking changes in error formatting.
Pytest examples
import pytest

def assert_validation_error(resp, *, location: str):
    assert resp.status_code == 400
    data = resp.get_json()
    assert "error" in data
    err = data["error"]
    assert err["type"] == "validation_error"
    assert err["location"] == location
    assert isinstance(err["details"], list)
    assert all(set(d.keys()) == {"path", "code", "message"} for d in err["details"])

def test_create_user_missing_required_fields(client):
    resp = client.post("/users", json={"name": "A"})
    assert_validation_error(resp, location="body")
    details = resp.get_json()["error"]["details"]
    paths = {d["path"] for d in details}
    assert "/email" in paths
    assert "/age" in paths
    assert "/address" in paths

def test_create_user_rejects_unknown_fields(client):
    resp = client.post(
        "/users",
        json={
            "email": "a@example.com",
            "name": "A",
            "age": 20,
            "address": {"line1": "X", "city": "Y", "postal_code": "12345"},
            "unexpected": "nope",
        },
    )
    assert_validation_error(resp, location="body")
    details = resp.get_json()["error"]["details"]
    assert any(d["path"] == "/unexpected" for d in details)

def test_list_users_invalid_query_params(client):
    resp = client.get("/users?page=0&per_page=999&sort=drop_table")
    assert_validation_error(resp, location="query")
    details = resp.get_json()["error"]["details"]
    paths = {d["path"] for d in details}
    assert "/page" in paths
    assert "/per_page" in paths
    assert "/sort" in paths

def test_patch_user_partial_update_validates_only_provided_fields(client):
    # age too low
    resp = client.patch("/users/1", json={"age": 10})
    assert_validation_error(resp, location="body")
    details = resp.get_json()["error"]["details"]
    assert any(d["path"] == "/age" for d in details)
    # empty payload is acceptable for PATCH in many APIs (no changes)
    resp2 = client.patch("/users/1", json={})
    assert resp2.status_code in (200, 204)

These tests focus on invariants:
- Validation failures always return 400.
- Error payload always includes type, location, and a list of details.
- Each detail includes path, code, and message.
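The client fixture these tests rely on is standard Flask/pytest wiring. A minimal conftest.py sketch follows; create_app here is a hypothetical stand-in for your real application factory, which would register the users blueprint and the validation error handler:

```python
# conftest.py
import pytest
from flask import Flask

def create_app() -> Flask:
    # Hypothetical stand-in for your real application factory
    app = Flask(__name__)
    app.config["TESTING"] = True
    return app

@pytest.fixture
def client():
    # Flask's test client simulates HTTP requests without running a server
    with create_app().test_client() as test_client:
        yield test_client
```

Pytest discovers conftest.py automatically, so every test module in the directory can request client by name.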
Practical step-by-step checklist for adding validation to a new endpoint
1. Define an input schema for the endpoint (body and/or query), including required fields, ranges, and formats.
2. Decide strictness: forbid unknown fields (extra="forbid") and choose strict vs coercing types.
3. Parse safely: use request.get_json(silent=True) and validate via parse_json; validate query via parse_query.
4. Apply partial updates with exclude_unset=True for PATCH.
5. Serialize outputs through an output schema (UserOut) to keep response shapes stable.
6. Add tests that assert error payload structure and key paths for common invalid inputs.