Imagine you are asked to design a database for a small online bakery that takes pre-orders for pickup. Stakeholders say: “Customers place orders for pickup slots. We need to know what’s selling, what’s still open, and which orders are paid. Staff should be able to update product availability and prices. Customers can cancel up to 24 hours before pickup.” Your job is to turn these statements into stable data facts the database can store and answer questions about.
1) Identify actors, events, and objects in the domain
Start by separating the world into: (a) who interacts with the system (actors), (b) what happens (events), and (c) what things exist (objects). This helps you avoid jumping straight into tables without understanding the domain.
Actors
- Customer: places orders, pays, cancels
- Staff member: manages products, availability, prices; may mark orders as fulfilled
- Payment provider (optional external actor): confirms payment status
Events
- Order placed (customer submits an order)
- Order paid (payment captured/confirmed)
- Order canceled (customer cancels within policy)
- Pickup slot reserved (order assigned to a pickup time window)
- Product price updated (staff changes price)
- Product availability updated (staff changes stock/availability)
Objects (things the database likely stores)
- Customer
- Order
- Order line item (which products and quantities are in an order)
- Product
- Pickup slot (date/time window, capacity)
- Payment (or at least payment status and reference)
Tip: If you can point to it, name it, or count it, it is often an object. If it “happens,” it is often an event. If it “does” something, it is often an actor.
2) Extract candidate data items from phrases while separating process vs data requirements
Next, scan stakeholder statements and extract candidate data items. A practical technique is to mark nouns (potential entities/attributes) and verbs (relationships/events). Then separate process requirements (workflow rules, UI behavior, notifications) from data requirements (facts to store and retrieve).
Step-by-step extraction from the scenario
Statement A: “Customers place orders for pickup slots.”
- Listen to the audio with the screen off.
- Earn a certificate upon completion.
- Over 5000 courses for you to explore!
Download the app
- Nouns: customers, orders, pickup slots
- Verbs: place, for (assign to)
- Candidate data facts: a customer places an order; an order is assigned to a pickup slot
- Likely data items: Customer identifier; Order identifier; PickupSlot identifier; OrderDateTime; PickupSlotStart/End
Statement B: “We need to know what’s selling.”
- Nouns: what (products), selling
- Verbs: know, selling (derived from orders)
- Data requirement: store order line items (product, quantity, price at time of order) so sales can be calculated
- Process requirement: reporting frequency or dashboard is not a data fact; it’s a delivery requirement
Statement C: “We need to know what’s still open.”
- Ambiguity: “open” could mean pickup slots with remaining capacity, or orders not fulfilled, or products available
- Action: turn “open” into a specific data fact (e.g., pickup slots have capacity and reserved count; orders have status)
Statement D: “Which orders are paid.”
- Nouns: orders, paid
- Verbs: are
- Candidate data facts: an order has a payment status; optionally, payments exist as records with timestamps and references
Statement E: “Staff should be able to update product availability and prices.”
- Nouns: staff, product, availability, prices
- Verbs: update
- Data requirement: store product availability and price (and possibly effective dates)
- Process requirement: “should be able to update” implies permissions/UI; the database may need staff user accounts and audit fields, but the UI itself is not stored
Statement F: “Customers can cancel up to 24 hours before pickup.”
- Nouns: customers, cancel, 24 hours, pickup
- Verbs: can cancel
- Data requirement: store cancellation timestamp and order status; store pickup time to evaluate the rule
- Process/business rule: the 24-hour rule is logic enforced by application/services, but it depends on data facts (pickup time, cancellation time)
Separating process vs data requirements (quick guide)
- Data requirement asks: “What facts must be stored so we can answer questions later?”
- Process requirement asks: “What must the system do, when, and who can do it?”
Examples from the scenario:
- Data: Order has status (Placed/Paid/Canceled/Fulfilled); Order has pickup slot; Product has price; Order line item has quantity.
- Process: Staff can update products; customers can cancel within 24 hours; system shows what’s selling.
3) Create a clear list of questions the database must answer
Requirements become actionable when you express them as questions the database must answer. These questions drive what entities, relationships, and attributes you must store.
Core “must answer” questions for the bakery
- Which customer placed which order?
- Which pickup slot is each order assigned to?
- Which products are included in each order, and in what quantities?
- What price was charged for each product in an order at the time it was placed?
- Which orders are paid, and when was payment confirmed?
- Which orders are canceled, and when were they canceled?
- Which orders are still not fulfilled for a given pickup date?
- How many items of each product were sold in a date range?
- Which pickup slots still have capacity available?
- What is the current availability status of each product?
- What is the current price of each product?
Turn questions into required data facts
Pick one question and list the minimum facts needed.
Question: “Which pickup slots still have capacity available?”
- PickupSlot: date, start time, end time, capacity
- Order: assigned pickup slot
- Order status rules: whether canceled orders count against capacity (this is a business rule you must clarify)
Question: “What price was charged for each product in an order at the time it was placed?”
- OrderLineItem: product, quantity, unit_price_charged
- Order: placed timestamp
- Product: current price (not sufficient alone, because it can change)
Notice how these questions reveal subtle needs (e.g., storing the charged price on the line item, not only the current product price).
4) Define scope boundaries and assumptions
A clean database design starts with a clear boundary: what the database will cover, what it will not cover, and what assumptions you are making. This prevents “requirements creep” from silently changing your schema.
In-scope (for this scenario)
- Customers and their identifying/contact information needed for orders
- Orders, order statuses, and timestamps (placed, paid, canceled, fulfilled)
- Order line items (product, quantity, price charged)
- Products, current price, and availability
- Pickup slots (time windows, capacity)
- Payment tracking at least to the level of payment status and reference
Out-of-scope (explicitly not covered)
- Full accounting/ledger system (invoices, tax filings, refunds accounting)
- Delivery logistics (routing, driver assignments), since this is pickup-only
- Marketing automation (email campaigns, coupons) unless explicitly required
- Recipe/ingredient inventory and production planning
- Customer support ticketing/chat history
Assumptions to document (examples)
- A customer can place multiple orders; each order belongs to exactly one customer.
- An order is assigned to exactly one pickup slot.
- A pickup slot has a fixed capacity (number of orders or total items). Clarify which.
- Payment status is tracked per order (Paid/Unpaid/Refunded). If partial payments exist, that changes the model.
- “Availability” means whether a product can be ordered right now; it is not a detailed stock count unless specified.
- Canceled orders do not count against pickup slot capacity (or they do). Decide and record it.
Checklist: spotting ambiguous terms before they become schema problems
Ambiguous terms cause unstable designs because different stakeholders interpret them differently. Use this checklist to force precision.
- “Account”: Is it a login identity, a billing account, a customer profile, or a payment method?
- “Active”: Active customer (placed an order recently?), active product (available for sale?), active order (not canceled?), active pickup slot (future?)
- “Current price”: Current as of now, or current as of the order date? Do prices have effective dates?
- “Open”: Open order (not fulfilled?), open pickup slot (has capacity?), open product (available?), open day (store open hours?)
- “Available”: Available for ordering, in stock, not sold out for a specific pickup date, or not discontinued?
- “Customer”: A person, a household, or a business? Can multiple contacts exist per customer?
- “Order”: Can an order be edited after placement? Are there drafts/carts?
- “Paid”: Authorized, captured, settled, or simply marked as paid by staff?
- “Cancel”: Soft-cancel (status change) vs delete; do we keep cancellation reason?
- Time phrases (“up to 24 hours before pickup”): Is pickup time the start of the slot? What timezone? What about daylight saving changes?
For each ambiguous term, ask: What exact values can it take? Who sets it? When does it change? What decisions depend on it?
Mini-exercise: rewrite vague requirements into testable statements
Rewrite each vague requirement into a statement that can be verified with data (and therefore informs what must be stored).
Vague requirement 1
Original: “We need to know what’s selling.”
Rewrite (testable): “For any date range, the database must return total quantity sold per product based on non-canceled orders, using order line items.”
Data implications: order line items must store product and quantity; orders must store status and placed date.
Vague requirement 2
Original: “Show current price.”
Rewrite (testable): “The database must store one current price per product, and each order line item must store the unit price charged at the time the order was placed.”
Data implications: Product.current_price; OrderLineItem.unit_price_charged.
Vague requirement 3
Original: “Customers can cancel up to 24 hours before pickup.”
Rewrite (testable): “An order can be marked as Canceled only if cancellation_time is at least 24 hours before pickup_slot_start_time; the database must store both timestamps.”
Data implications: Order.canceled_at; PickupSlot.start_time; Order.status.
Vague requirement 4
Original: “We need to see what’s still open.”
Rewrite (testable): “For a given pickup date, the database must list pickup slots with remaining capacity, where remaining capacity = slot_capacity − count(orders assigned to the slot with status not in (Canceled)).”
Data implications: PickupSlot.capacity; Order.pickup_slot_id; Order.status.
Your turn (write your own rewrites)
- “Make it easy for staff to manage products.”
- “Track loyal customers.”
- “Orders should be handled quickly.”
For each, produce: (1) a testable statement, (2) the exact data facts needed, and (3) any ambiguity you must clarify with stakeholders.
Practical output: a requirements-to-facts worksheet (template)
Use this template to convert raw statements into database-relevant facts.
Stakeholder statement: ________________________________ Actors: ____________________ Event(s): ____________________ Objects: ____________________ Candidate data items (nouns/attributes): ____________________ Relationships (verbs): ____________________ Process requirement (Y/N): ____ If process, what data does it depend on?: ____________________ Questions to answer: - ____________________ - ____________________ Scope notes / assumptions: ____________________ Ambiguous terms to clarify: ____________________