Constraints and Data Quality Rules: Capturing Business Meaning in the Schema

Capítulo 7

Estimated reading time: 10 minutes

+ Exercise

1) Expressing Business Rules with NOT NULL, UNIQUE, and CHECK

Constraints are database-enforced rules that prevent invalid data from being stored. They turn business meaning into guarantees: if the data is in the table, it already satisfies key domain rules. This reduces downstream cleanup, simplifies queries, and makes errors visible at the point of data entry.

NOT NULL: required facts, not “nice to have”

Use NOT NULL when a value must always exist for a row to be meaningful. Treat it as a rule about the domain, not about the UI. If a value can be unknown at creation time, it should usually be nullable (or modeled differently), even if the application often provides it.

Examples tied to business rules

  • An order line must have a quantity: quantity NOT NULL
  • An order must have a status: status NOT NULL
  • A customer must have an email to receive receipts: email NOT NULL (only if the business truly requires it for all customers)
CREATE TABLE order_line (  order_line_id  BIGINT PRIMARY KEY,  order_id       BIGINT NOT NULL,  product_id     BIGINT NOT NULL,  quantity       INTEGER NOT NULL);

UNIQUE: prevent duplicates that break meaning

Use UNIQUE when the business says “there must not be two rows with the same value.” This is different from a primary key: a primary key identifies the row; a unique constraint protects a business identifier or other “must be distinct” attribute.

Examples tied to business rules

Continue in our app.
  • Listen to the audio with the screen off.
  • Earn a certificate upon completion.
  • Over 5000 courses for you to explore!
Or continue reading below...
Download App

Download the app

  • Each customer email can be used by at most one customer account: UNIQUE(email)
  • Each invoice number must be unique: UNIQUE(invoice_number)
CREATE TABLE customer (  customer_id BIGINT PRIMARY KEY,  email       VARCHAR(320) NOT NULL,  full_name   VARCHAR(200) NOT NULL,  CONSTRAINT uq_customer_email UNIQUE (email));

Tip: decide whether uniqueness should be case-sensitive (depends on collation and database). If the business treats Alice@x.com and alice@x.com as the same, you may need a normalized column (e.g., lowercased) and constrain that.

CHECK: validate allowed ranges and sets

CHECK constraints enforce predicates that must be true for every row. They are ideal for numeric ranges (non-negative quantities) and enumerated sets (allowed status values).

Examples tied to business rules

  • Quantity cannot be negative: quantity > 0 (or >= 0 if zero is allowed)
  • Unit price cannot be negative: unit_price >= 0
  • Status must be one of a known set: status IN ('DRAFT','PLACED','PAID','CANCELLED')
CREATE TABLE orders (  order_id     BIGINT PRIMARY KEY,  customer_id  BIGINT NOT NULL,  status       VARCHAR(20) NOT NULL,  created_at   TIMESTAMP NOT NULL,  CONSTRAINT ck_orders_status CHECK (status IN ('DRAFT','PLACED','PAID','CANCELLED')));
CREATE TABLE order_line (  order_line_id BIGINT PRIMARY KEY,  order_id      BIGINT NOT NULL,  product_id    BIGINT NOT NULL,  quantity      INTEGER NOT NULL,  unit_price    NUMERIC(12,2) NOT NULL,  CONSTRAINT ck_order_line_quantity CHECK (quantity > 0),  CONSTRAINT ck_order_line_unit_price CHECK (unit_price >= 0));

Design note: keep CHECK rules stable and explainable. If a rule changes frequently (e.g., promotional price bounds), it may belong in a table-driven rule system rather than a hard-coded check.

2) Default Values: When They Help vs. When They Hide Missing Data

Defaults automatically fill a value when an insert omits the column. They can improve consistency and reduce repetitive application code, but they can also hide missing data by silently inserting a value that looks intentional.

Defaults that usually help

  • Timestamps for creation: if every row should record when it was created, a default avoids “forgot to set it” bugs.
  • Booleans with a clear domain meaning: e.g., is_active defaulting to true for new records.
  • Initial workflow state: if every new order starts as DRAFT unless explicitly created in another state.
CREATE TABLE orders (  order_id    BIGINT PRIMARY KEY,  customer_id BIGINT NOT NULL,  status      VARCHAR(20) NOT NULL DEFAULT 'DRAFT',  created_at  TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,  CONSTRAINT ck_orders_status CHECK (status IN ('DRAFT','PLACED','PAID','CANCELLED')));

Defaults that can hide missing or unknown data

  • Using 0 as a default for a meaningful numeric value: defaulting discount_amount to 0 might be fine if “no discount” is truly the same as 0; but defaulting shipping_cost to 0 could hide “not calculated yet.”
  • Using placeholder strings: defaulting phone to 'N/A' creates data that looks real but is not usable.
  • Defaulting status to a later stage: e.g., defaulting to PAID would be dangerous because it masks missing payment processing.

Practical guideline: use defaults for values that are truly deterministic at insert time. If the correct value depends on a later process, prefer NULL (and possibly a separate “state” column) so missingness is visible.

3) Enforcing Uniqueness Across Multiple Columns (Composite UNIQUE)

Some business rules define uniqueness not by a single attribute but by a combination. A composite unique constraint enforces “no duplicates within a scope.” This is common for line items, memberships, schedules, and many-to-many association tables.

Common patterns

  • One product per order: an order cannot have the same product listed twice as separate lines (unless your domain allows it). Enforce uniqueness on (order_id, product_id).
  • One role per user per project: enforce uniqueness on (project_id, user_id, role) depending on the rule.
  • One seat per event: enforce uniqueness on (event_id, seat_number).
CREATE TABLE order_line (  order_line_id BIGINT PRIMARY KEY,  order_id      BIGINT NOT NULL,  product_id    BIGINT NOT NULL,  quantity      INTEGER NOT NULL,  unit_price    NUMERIC(12,2) NOT NULL,  CONSTRAINT uq_order_line_order_product UNIQUE (order_id, product_id),  CONSTRAINT ck_order_line_quantity CHECK (quantity > 0),  CONSTRAINT ck_order_line_unit_price CHECK (unit_price >= 0));

Step-by-step: deciding a composite unique constraint

  • Step 1: State the business rule in plain language (e.g., “An order can contain a product at most once”).
  • Step 2: Identify the columns that define the scope (order_id) and the item within that scope (product_id).
  • Step 3: Add a UNIQUE(order_id, product_id) constraint.
  • Step 4: Check edge cases: does the business allow duplicate products as separate lines for different pricing? If yes, uniqueness may need to include another column (e.g., price_tier) or be removed.

4) What to Validate in the Database vs. the Application (and How to Document It)

Not every rule belongs in the database. A good division of responsibility keeps the schema robust while avoiding constraints that are hard to maintain or that block valid workflows.

Good candidates for database constraints

  • Always true invariants: quantity non-negative, status in allowed set, required columns.
  • Uniqueness rules: duplicates are almost always harmful and expensive to clean later.
  • Simple cross-field rules within a row: e.g., start_date <= end_date (when both are present).

Often better in the application (or in a dedicated rules layer)

  • Complex conditional rules that change frequently: “VIP customers can exceed credit limit if approved,” “discount depends on campaign.”
  • Rules requiring external context: validating an address via an external service, checking inventory in real time.
  • Multi-row or aggregate constraints: “total order amount must equal sum of lines” is possible with triggers/materialized approaches, but often handled in application logic due to complexity and performance concerns.

Documenting the rationale (so future changes are safe)

For each important rule, capture: (1) the rule statement, (2) where it is enforced, and (3) why. Keep this near the schema (migration comments, schema docs) so it stays current.

Rule: Order status must be one of DRAFT/PLACED/PAID/CANCELLED. Enforced in: database CHECK constraint ck_orders_status. Rationale: invariant domain set; prevents invalid states from any writer (batch jobs, admin tools, imports). Rule: Payment authorization required before moving to PAID. Enforced in: application service layer. Rationale: depends on external payment gateway response and audit workflow; changes with provider policies.

Constraint-Design Lab: Add Constraints and Predict Blocked Bad Data

In this lab, you will take a draft schema and harden it with constraints. Then you will predict which invalid inserts will be rejected.

Draft schema (intentionally under-constrained)

CREATE TABLE customer (  customer_id BIGINT PRIMARY KEY,  email       VARCHAR(320),  full_name   VARCHAR(200)); CREATE TABLE orders (  order_id    BIGINT PRIMARY KEY,  customer_id BIGINT,  status      VARCHAR(20),  created_at  TIMESTAMP); CREATE TABLE order_line (  order_line_id BIGINT PRIMARY KEY,  order_id      BIGINT,  product_id    BIGINT,  quantity      INTEGER,  unit_price    NUMERIC(12,2));

Task A: Add constraints (step-by-step)

Step 1: Mark required columns with NOT NULL

  • customer.email and customer.full_name if your business requires both for every customer record.
  • orders.customer_id, orders.status, orders.created_at.
  • order_line.order_id, order_line.product_id, order_line.quantity, order_line.unit_price.

Step 2: Add uniqueness rules

  • UNIQUE(customer.email) to prevent duplicate accounts by email.
  • UNIQUE(order_line.order_id, order_line.product_id) if one product should appear at most once per order.

Step 3: Add CHECK constraints for domain validity

  • order_line.quantity > 0
  • order_line.unit_price >= 0
  • orders.status IN ('DRAFT','PLACED','PAID','CANCELLED')

Step 4: Add safe defaults (only where deterministic)

  • orders.created_at DEFAULT CURRENT_TIMESTAMP
  • orders.status DEFAULT 'DRAFT' (only if every new order starts in DRAFT)

One possible hardened version

CREATE TABLE customer (  customer_id BIGINT PRIMARY KEY,  email       VARCHAR(320) NOT NULL,  full_name   VARCHAR(200) NOT NULL,  CONSTRAINT uq_customer_email UNIQUE (email)); CREATE TABLE orders (  order_id    BIGINT PRIMARY KEY,  customer_id BIGINT NOT NULL,  status      VARCHAR(20) NOT NULL DEFAULT 'DRAFT',  created_at  TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,  CONSTRAINT ck_orders_status CHECK (status IN ('DRAFT','PLACED','PAID','CANCELLED'))); CREATE TABLE order_line (  order_line_id BIGINT PRIMARY KEY,  order_id      BIGINT NOT NULL,  product_id    BIGINT NOT NULL,  quantity      INTEGER NOT NULL,  unit_price    NUMERIC(12,2) NOT NULL,  CONSTRAINT uq_order_line_order_product UNIQUE (order_id, product_id),  CONSTRAINT ck_order_line_quantity CHECK (quantity > 0),  CONSTRAINT ck_order_line_unit_price CHECK (unit_price >= 0));

Task B: Predict which bad data is blocked

For each insert below, decide whether it will succeed or fail, and which constraint causes the failure.

  • Bad data 1: customer without email
INSERT INTO customer(customer_id, email, full_name) VALUES (1, NULL, 'A. Buyer');
  • Bad data 2: two customers with the same email
INSERT INTO customer(customer_id, email, full_name) VALUES (2, 'a@example.com', 'Alice'); INSERT INTO customer(customer_id, email, full_name) VALUES (3, 'a@example.com', 'Alicia');
  • Bad data 3: order with an invalid status
INSERT INTO orders(order_id, customer_id, status, created_at) VALUES (10, 2, 'SHIPPED', CURRENT_TIMESTAMP);
  • Bad data 4: order line with negative quantity
INSERT INTO order_line(order_line_id, order_id, product_id, quantity, unit_price) VALUES (100, 10, 501, -2, 9.99);
  • Bad data 5: order line with negative unit price
INSERT INTO order_line(order_line_id, order_id, product_id, quantity, unit_price) VALUES (101, 10, 502, 1, -9.99);
  • Bad data 6: duplicate product on the same order (if composite unique is present)
INSERT INTO order_line(order_line_id, order_id, product_id, quantity, unit_price) VALUES (102, 10, 600, 1, 5.00); INSERT INTO order_line(order_line_id, order_id, product_id, quantity, unit_price) VALUES (103, 10, 600, 2, 5.00);

Extension: If your business allows the same product multiple times when the unit price differs, adjust the uniqueness rule and re-evaluate which “bad data” is actually valid. For example, you might remove the composite unique constraint or change it to include a pricing discriminator column.

Now answer the exercise about the content:

Which schema feature best enforces the rule “an order cannot contain the same product more than once as separate lines,” while still allowing many orders to include that product?

You are right! Congratulations, now go to the next page

You missed! Try again.

A composite UNIQUE on (order_id, product_id) prevents duplicate products within the same order while allowing the same product_id across different orders.

Next chapter

Avoiding Common Modeling Mistakes: Practical Fixes for Cleaner Schemas

Arrow Right Icon
Free Ebook cover Database Design Basics: From Requirements to a Clean Schema
78%

Database Design Basics: From Requirements to a Clean Schema

New course

9 pages

Download the app to earn free Certification and listen to the courses in the background, even with the screen off.