Why payload design matters: flexibility vs. guarantees
In an event-driven database, the event payload is the part that carries business facts: what happened, to whom, and with what attributes. PostgreSQL gives you two strong but different tools for payloads: JSONB (flexible, schema-on-read) and typed columns (strict, schema-on-write). Designing payloads well is not about picking one and ignoring the other; it is about choosing where you need flexibility, where you need guarantees, and how you will evolve the payload over time without breaking consumers.
JSONB is excellent for capturing event-specific attributes that vary by event type, evolve frequently, or are sparse. Typed columns are excellent for attributes that are queried often, must be validated, indexed efficiently, or participate in constraints and joins. A robust design commonly uses a hybrid: a small set of typed “envelope” columns for stable, high-value fields, plus a JSONB “payload” for the rest.
Core patterns for event payloads in PostgreSQL
Pattern A: All-JSONB payloads
In this pattern, the event table has a JSONB column (often named data or payload) that contains most event attributes. You still typically keep some typed columns for routing and filtering (like event_type), but the business fields live in JSONB.
- Pros: fast iteration, easy to add fields, can store nested structures naturally.
- Cons: weaker guarantees unless you add explicit checks, more complex indexing, risk of inconsistent shapes across producers.
Pattern B: Fully typed event tables per event type
Here, each event type (or small family) gets its own table with typed columns. This gives maximum validation and query performance but can lead to many tables and more migration work as payloads evolve.
- Pros: strong constraints, simple SQL, predictable performance.
- Cons: schema churn, more DDL, harder to support “unknown future fields”.
Pattern C: Hybrid envelope + JSONB payload (recommended default)
This pattern uses typed columns for stable, frequently queried fields and JSONB for the variable part. It is often the best balance for event-driven modeling in PostgreSQL.
- Listen to the audio with the screen off.
- Earn a certificate upon completion.
- Over 5000 courses for you to explore!
Download the app
- Pros: strong guarantees where needed, flexible evolution elsewhere, good indexing options.
- Cons: requires discipline: decide what belongs in typed columns vs JSONB, and enforce JSON shape for critical fields.
Designing the envelope: what should be typed?
Even when you store most attributes in JSONB, you should identify a minimal set of typed columns that make the system operable and queryable. The exact set depends on your domain, but the following categories are common:
- Event classification:
event_type(text),event_version(int), sometimesschema_name(text) if you have multiple domains. - Entity targeting:
entity_type(text) andentity_id(uuid/text) when you frequently filter by “all events for X”. - Correlation:
correlation_id,causation_id(uuid/text) to connect events to a request or parent event. - Producer metadata:
producer(text),trace_id(text/uuid) if you integrate with tracing. - Operational flags:
is_redacted(boolean),contains_pii(boolean) if you need compliance controls.
Typed columns are also useful for fields that you will index heavily. Indexing JSONB is possible, but you should not force every common query to extract from JSONB if the field is stable and central to your access patterns.
Designing the JSONB payload: shape, naming, and evolution
Prefer stable top-level keys and consistent types
JSONB makes it easy to add fields, but it also makes it easy to accidentally create multiple shapes for the “same” event type. Establish conventions:
- Use consistent key naming: choose
snake_caseorcamelCaseand stick to it. - Keep stable keys at the top level: for example,
amount,currency,reason. - Use nested objects for grouped data:
customer,address,payment_method. - Keep types stable: do not switch a field from string to number later; add a new field if needed.
Include explicit versioning for payload changes
When payloads evolve, consumers need a way to interpret them. A practical approach is to store an integer event_version as a typed column and treat it as the version of the payload schema for that event_type. When you introduce a breaking change (rename a field, change meaning), increment event_version and support both versions in consumers until migration is complete.
Represent optionality explicitly
In JSON, missing keys and null are different. Decide a rule:
- Missing key means “not applicable / not provided”.
nullmeans “known to be empty” (use sparingly).
This matters for queries and constraints. If you want to enforce that a field exists, check for key existence rather than non-null value.
Step-by-step: implementing a hybrid event table
Step 1: Create the table with typed envelope + JSONB payload
The following example shows a single event table that can store multiple event types, with typed columns for common access patterns and a JSONB payload for event-specific attributes.
CREATE TABLE app_event_log ( event_id uuid PRIMARY KEY, event_type text NOT NULL, event_version integer NOT NULL DEFAULT 1, entity_type text NOT NULL, entity_id uuid NOT NULL, correlation_id uuid, causation_id uuid, producer text NOT NULL, occurred_at timestamptz NOT NULL, payload jsonb NOT NULL, headers jsonb NOT NULL DEFAULT '{}'::jsonb );payload holds business data. headers can hold operational metadata that may vary (for example, request IP, user agent, feature flags). Keeping these separate helps you avoid mixing business facts with transport/operational details.
Step 2: Add basic JSONB sanity constraints
JSONB columns accept any JSON value. For event payloads, you usually want an object at the top level.
ALTER TABLE app_event_log ADD CONSTRAINT payload_is_object CHECK (jsonb_typeof(payload) = 'object');If you also want headers to be an object:
ALTER TABLE app_event_log ADD CONSTRAINT headers_is_object CHECK (jsonb_typeof(headers) = 'object');Step 3: Enforce required keys per event type (targeted checks)
For critical event types, enforce that required keys exist. PostgreSQL’s JSONB operators make this straightforward. Example: for invoice_issued you require invoice_id, amount, and currency.
ALTER TABLE app_event_log ADD CONSTRAINT invoice_issued_required_keys CHECK ( event_type <> 'invoice_issued' OR (payload ? 'invoice_id' AND payload ? 'amount' AND payload ? 'currency') );You can also enforce types. Example: amount should be numeric (in JSON it will be a number), and currency should be a string.
ALTER TABLE app_event_log ADD CONSTRAINT invoice_issued_key_types CHECK ( event_type <> 'invoice_issued' OR ( jsonb_typeof(payload->'amount') = 'number' AND jsonb_typeof(payload->'currency') = 'string' ) );This approach scales when you apply it only to event types where correctness is essential and the schema is stable enough to justify constraints.
Step 4: Promote frequently queried fields into typed columns (generated columns)
A common pain point is repeatedly extracting the same JSONB keys in queries and indexes. PostgreSQL supports generated columns, which can “materialize” extracted values as typed columns. This gives you typed access and indexing while still storing the source of truth in JSONB.
Example: promote payload.invoice_id and payload.amount for invoice_issued events.
ALTER TABLE app_event_log ADD COLUMN invoice_id uuid GENERATED ALWAYS AS ( CASE WHEN event_type = 'invoice_issued' THEN (payload->>'invoice_id')::uuid ELSE NULL END ) STORED, ADD COLUMN invoice_amount numeric GENERATED ALWAYS AS ( CASE WHEN event_type = 'invoice_issued' THEN (payload->>'amount')::numeric ELSE NULL END ) STORED;Now you can index and query these columns efficiently without repeating JSON extraction logic everywhere.
Step 5: Add indexes for both typed and JSONB access
Indexes should reflect your read patterns. With the hybrid approach, you typically combine:
- B-tree indexes on typed columns for equality/range filters.
- GIN indexes on JSONB for containment queries when you need flexible filtering.
Examples:
CREATE INDEX app_event_log_entity_idx ON app_event_log (entity_type, entity_id, occurred_at); CREATE INDEX app_event_log_type_time_idx ON app_event_log (event_type, occurred_at);For JSONB containment queries (e.g., find events where payload contains a key/value pair):
CREATE INDEX app_event_log_payload_gin ON app_event_log USING gin (payload jsonb_path_ops);jsonb_path_ops is efficient for containment (@>) but supports fewer operator classes than the default. If you need broader JSONB querying, use the default GIN operator class:
CREATE INDEX app_event_log_payload_gin_default ON app_event_log USING gin (payload);Choose one based on your actual query operators; avoid creating both unless you have measured the need.
Querying JSONB payloads safely and efficiently
Extracting scalar values
Use ->> to extract text and cast to the desired type. Example: filter invoice events by currency.
SELECT event_id, occurred_at, payload->>'currency' AS currency FROM app_event_log WHERE event_type = 'invoice_issued' AND payload->>'currency' = 'USD';If you cast, guard against missing keys or wrong types by combining with key checks or constraints. Example:
SELECT event_id, (payload->>'amount')::numeric AS amount FROM app_event_log WHERE event_type = 'invoice_issued' AND payload ? 'amount';Containment queries with @>
Containment is powerful for matching partial JSON structures. Example: find events where status is cancelled.
SELECT event_id, occurred_at FROM app_event_log WHERE payload @> '{"status":"cancelled"}'::jsonb;This is where a GIN index on payload can help significantly.
Querying nested structures
Example: payload has customer object with country.
SELECT event_id FROM app_event_log WHERE event_type = 'invoice_issued' AND payload->'customer'->>'country' = 'DE';If this becomes a common filter, consider promoting customer.country to a generated column and indexing it.
Typed columns: when strict schema is the better choice
Typed columns are not only about performance; they are about correctness and maintainability. Prefer typed columns when:
- You need constraints: foreign keys, check constraints, uniqueness, non-null requirements.
- You need consistent numeric semantics: money amounts, quantities, rates, where casting from JSON text repeatedly is risky.
- You need efficient joins: joining on a JSON-extracted value is possible but often slower and harder to index well.
- You need stable reporting: BI queries benefit from typed columns and predictable schemas.
A practical compromise is to keep the canonical value in typed columns and store a JSONB payload for additional context. For example, store amount and currency as typed columns, and keep payload for optional details like discount breakdown or UI-specific metadata.
Step-by-step: designing payloads for a small set of event types
Consider three event types in a billing domain: invoice_issued, payment_received, and invoice_cancelled. The goal is to design payloads that are consistent, evolvable, and queryable.
Step 1: Define minimal required fields per event type
- invoice_issued:
invoice_id,customer_id,amount,currency,due_date - payment_received:
payment_id,invoice_id,amount,currency,method - invoice_cancelled:
invoice_id,reason
Keep identifiers consistently typed in JSON (strings that can be cast to UUID) and keep amounts as JSON numbers.
Step 2: Create constraints for required keys and types
ALTER TABLE app_event_log ADD CONSTRAINT payment_received_required CHECK ( event_type <> 'payment_received' OR ( payload ? 'payment_id' AND payload ? 'invoice_id' AND payload ? 'amount' AND payload ? 'currency' AND payload ? 'method' ) ); ALTER TABLE app_event_log ADD CONSTRAINT payment_received_types CHECK ( event_type <> 'payment_received' OR ( jsonb_typeof(payload->'amount') = 'number' AND jsonb_typeof(payload->'currency') = 'string' AND jsonb_typeof(payload->'method') = 'string' ) );This does not require you to fully formalize every field for every event type, but it prevents the most damaging inconsistencies.
Step 3: Promote cross-event join keys
If you frequently join events by invoice_id across multiple event types, promote it. You can do this with a generated column that extracts invoice_id for the relevant event types.
ALTER TABLE app_event_log ADD COLUMN invoice_id_promoted uuid GENERATED ALWAYS AS ( CASE WHEN payload ? 'invoice_id' THEN (payload->>'invoice_id')::uuid ELSE NULL END ) STORED; CREATE INDEX app_event_log_invoice_id_promoted_idx ON app_event_log (invoice_id_promoted, occurred_at);Now you can efficiently query “all events for invoice X” without caring which event type produced them.
Schema evolution strategies with JSONB and typed columns
Additive changes: add new keys
Adding a new key to JSONB is typically safe for consumers that ignore unknown fields. When you add a key that becomes important for filtering or reporting, consider promoting it to a generated column later, after it stabilizes.
Renames and semantic changes: introduce new keys and deprecate old ones
Renaming a key in-place forces consumers to update in lockstep. A safer approach is:
- Add the new key (e.g.,
total_amount). - Keep the old key (e.g.,
amount) for a transition period. - Optionally increment
event_versionfor new events. - Update consumers to prefer the new key, fallback to the old key.
In PostgreSQL queries, you can coalesce:
SELECT COALESCE(payload->>'total_amount', payload->>'amount') AS amount_text FROM app_event_log WHERE event_type = 'invoice_issued';When to migrate JSONB fields into typed columns
Promoting fields is a common evolution step. A practical workflow:
- Observe: identify fields that appear in many queries or dashboards.
- Stabilize: ensure the field’s meaning and type are stable across producers.
- Promote: add a generated column (or a regular column populated by a backfill) and index it.
- Enforce: add constraints ensuring the JSONB field exists and has the correct type for relevant event types.
If you need to support older events that lack the field, keep the promoted column nullable and handle nulls in queries.
Validation options beyond simple CHECK constraints
JSON Schema validation (application-side or database-side)
PostgreSQL does not include built-in JSON Schema validation, but you can still enforce structure in two ways:
- Application-side validation: validate payloads before insert using a JSON Schema library, then store validated JSONB.
- Database-side validation: use CHECK constraints for critical keys/types and optionally PL/pgSQL triggers for more complex rules.
Database-side validation is valuable when multiple producers write to the same database and you need a single enforcement point.
Trigger-based validation for complex rules
Some rules are hard to express as a single CHECK constraint, such as conditional requirements based on multiple fields. A trigger can validate and raise an error. Keep trigger logic small and deterministic to avoid operational surprises.
CREATE OR REPLACE FUNCTION validate_invoice_issued_payload() RETURNS trigger AS $$ BEGIN IF NEW.event_type = 'invoice_issued' THEN IF NOT (NEW.payload ? 'due_date') THEN RAISE EXCEPTION 'invoice_issued requires due_date'; END IF; IF jsonb_typeof(NEW.payload->'due_date') <> 'string' THEN RAISE EXCEPTION 'due_date must be a string (ISO-8601)'; END IF; END IF; RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER trg_validate_invoice_issued BEFORE INSERT OR UPDATE ON app_event_log FOR EACH ROW EXECUTE FUNCTION validate_invoice_issued_payload();This example enforces presence and type. If you want to validate date format, you can attempt a cast to timestamptz and catch exceptions, but be careful: heavy parsing in triggers can become a bottleneck at high write rates.
Handling arrays and repeated structures in payloads
Events sometimes include arrays, such as line items on an invoice. JSONB is a natural fit, but querying arrays can be more expensive. Example payload:
{ "invoice_id": "3d2f...", "currency": "USD", "items": [ {"sku": "A1", "qty": 2, "unit_price": 10.00}, {"sku": "B2", "qty": 1, "unit_price": 25.00} ]}If you only need the array for downstream processing, keep it in JSONB. If you need frequent analytics (e.g., top SKUs), consider extracting items into a separate relational table keyed by event_id (or by invoice_id), while still storing the original JSONB payload for completeness.
For occasional queries, you can use jsonb_array_elements:
SELECT e.event_id, item->>'sku' AS sku, (item->>'qty')::int AS qty FROM app_event_log e CROSS JOIN LATERAL jsonb_array_elements(e.payload->'items') AS item WHERE e.event_type = 'invoice_issued';Use this sparingly for large datasets; it can expand rows significantly.
Security and privacy considerations in payload design
JSONB payloads can easily accumulate sensitive data because they are flexible. Establish rules about what is allowed in payloads and what must be stored elsewhere (or not stored at all). Practical measures:
- Separate PII into dedicated fields or tables with stricter access controls, and store only references in the event payload.
- Use typed flags like
contains_piito support auditing and selective access. - Consider redaction workflows where
payloadcan be replaced with a redacted form andis_redactedset to true, while keeping the event record.
From a querying perspective, keeping sensitive fields out of JSONB reduces the chance of accidental exposure through ad-hoc JSON extraction.
Practical checklist: choosing JSONB vs typed columns per field
- Query frequency: if you filter/join/group by it often, prefer typed (or generated) columns.
- Validation need: if wrong values would cause incorrect downstream effects, prefer typed + constraints.
- Volatility: if the field changes often or is experimental, keep it in JSONB first.
- Sparsity: if only a small fraction of events have it, JSONB avoids many nullable columns.
- Structure: nested/variable structures fit JSONB; flat stable attributes fit typed columns.
- Indexing: if you need fast lookups, typed columns are simplest; JSONB can work with GIN but requires careful operator choice.