Core Concepts of Dimensional Modeling: Facts, Dimensions, and Business Processes

Capítulo 2

Estimated reading time: 8 minutes

+ Exercise

Facts: Measurable Events You Want to Analyze

A fact is a measurable result of something that happened in the business. Facts usually come from events such as a purchase, a click, a shipment, or a support ticket being opened. In BI, facts are the numbers you add up, average, compare over time, or slice by different categories.

Facts are typically stored in a fact table, where each row represents an event (or a summarized set of events) at a defined level of detail. Common examples of facts include:

  • Sales: revenue, quantity sold, discount amount, tax amount, cost, margin
  • Marketing: impressions, clicks, spend, conversions, attributed revenue
  • Operations: units shipped, shipping cost, delivery time, defect count, downtime minutes

Plain-language test for a fact

  • Can you put it in a chart as a number?
  • Would someone ask “What is the total/average/min/max of it?”
  • Does it change when the business event happens?

Dimensions: Descriptive Context That Explains the Numbers

A dimension is descriptive information that provides context for facts. Dimensions answer questions like “Which product?”, “Which customer segment?”, “Which campaign?”, “Which warehouse?”, “Which day?” They are used to filter, group, and label measures in reports.

Dimensions are typically stored in dimension tables. Each dimension row represents a descriptive entity (a product, a customer, a campaign, a location, a date). Dimensions contain attributes that users recognize and want to slice by.

Examples of dimensions and attributes:

Continue in our app.
  • Listen to the audio with the screen off.
  • Earn a certificate upon completion.
  • Over 5000 courses for you to explore!
Or continue reading below...
Download App

Download the app

  • Product dimension: SKU, product name, brand, category, size, color
  • Customer dimension: customer ID, segment, region, acquisition channel, loyalty tier
  • Campaign dimension: campaign name, channel (search/social/email), objective, targeting type
  • Warehouse/Facility dimension: facility name, city, capacity band, operator
  • Date dimension: date, week, month, quarter, fiscal period, holiday flag

Plain-language test for a dimension

  • Is it a “by” word (sales by product, clicks by campaign)?
  • Would users want to filter on it?
  • Does it describe “who/what/where/when/how” around an event?

Modeling Around a Business Process

Dimensional modeling starts by choosing a business process: a repeatable set of activities that generates measurable events. Examples:

  • Orders (sales): customers place orders containing items
  • Ad clicks (marketing): users view ads and click through
  • Shipments (operations): orders are packed and shipped

Why start with a business process? Because analytics questions usually align to a process. When you pick a process, you can define:

  • What happened (facts/measures)
  • In what context (dimensions/attributes)
  • At what level of detail (the grain of the fact table, e.g., one row per order line, one row per click, one row per shipment)

How Analytics Questions Map to Measures and Dimensions

Most BI questions can be translated into: measure(s) + dimension(s) + filters + time. For example:

  • “What was revenue by product category and month in EMEA?”
  • “What is the conversion rate by campaign and device type last week?”
  • “What is the average delivery time by carrier and warehouse for express shipments?”

When you can consistently translate questions this way, you can design tables that make those questions easy to answer.

Structured Walkthrough: From Business Questions to Candidate Tables

Use the same repeatable steps for each business process: (1) identify questions, (2) list measures, (3) list descriptive attributes, (4) map to candidate fact and dimension tables.

Walkthrough A: Sales Process (Orders)

Step 1: Identify business questions

  • How much revenue did we generate by month and product category?
  • What is the average order value by customer segment?
  • Which regions have the highest discount rate?
  • How many units did we sell by channel (online vs retail)?

Step 2: List the measures (facts)

  • Order line revenue
  • Quantity sold
  • Discount amount
  • Tax amount
  • Cost amount (if available)
  • Order count (often derived as a count of distinct orders)

Step 3: List the descriptive attributes (dimensions)

  • Date/time: order date, month, quarter, fiscal period
  • Product: SKU, name, brand, category
  • Customer: segment, region, loyalty tier
  • Sales channel: online/retail/partner
  • Store (if retail): store name, city
  • Promotion: promo code, promo type
  • Geography: country, state, city (sometimes part of customer or store)

Step 4: Map into candidate fact and dimension tables

First, choose the grain. For orders, a common grain is one row per order line item (each product on an order). That supports product-level analysis and can still roll up to order-level metrics.

Candidate Fact Table: FactOrderLine (grain: one row per order_id + line_number) Columns (examples): - order_id (degenerate identifier) - order_line_number - order_date_key - product_key - customer_key - channel_key - store_key (nullable) - promotion_key (nullable) Measures: - quantity - gross_revenue - discount_amount - net_revenue - tax_amount - cost_amount
Candidate Dimension Tables: DimDate: date_key, calendar_date, week, month, quarter, year, fiscal_month, is_holiday DimProduct: product_key, sku, product_name, brand, category, subcategory, size, color DimCustomer: customer_key, customer_id, segment, region, loyalty_tier, acquisition_channel DimChannel: channel_key, channel_name DimStore: store_key, store_id, store_name, city, state, country DimPromotion: promotion_key, promo_code, promo_type, campaign_name

Notice how each business question becomes a combination of measures from the fact table and attributes from dimensions. Example mapping:

  • “Revenue by month and product category” → measure: net_revenue; dimensions: DimDate.month, DimProduct.category
  • “Average order value by customer segment” → measure: net_revenue aggregated to order_id then averaged; dimension: DimCustomer.segment

Walkthrough B: Marketing Process (Ad Clicks)

Step 1: Identify business questions

  • How many clicks and conversions did each campaign generate by day?
  • What is cost per click (CPC) by channel and device type?
  • Which audience segments have the best conversion rate?
  • How does performance differ by geography?

Step 2: List the measures (facts)

  • Impressions
  • Clicks
  • Spend
  • Conversions
  • Attributed revenue (if your attribution model provides it)

Step 3: List the descriptive attributes (dimensions)

  • Date/time: click date, hour (if needed)
  • Campaign: campaign name, objective, status
  • Channel: search/social/display/email
  • Creative: creative name, format, message variant
  • Device: device type, OS, browser
  • Audience: segment name, targeting type
  • Geography: country/region/city

Step 4: Map into candidate fact and dimension tables

Choose the grain based on how data arrives. Many marketing platforms provide data aggregated by day, campaign, ad group, creative, and device. A practical grain is one row per day + campaign + creative + device + geography (adjust to match your source).

Candidate Fact Table: FactAdPerformance (grain: day + campaign + creative + device + geo) Columns (examples): - date_key - campaign_key - creative_key - device_key - geo_key Measures: - impressions - clicks - spend - conversions - attributed_revenue
Candidate Dimension Tables: DimCampaign: campaign_key, campaign_id, campaign_name, channel, objective, status DimCreative: creative_key, creative_id, creative_name, format, variant DimDevice: device_key, device_type, os, browser DimGeo: geo_key, country, region, city DimDate: date_key, calendar_date, week, month, quarter, year

Example mapping:

  • “CPC by channel and device type” → measures: spend, clicks; dimensions: DimCampaign.channel, DimDevice.device_type
  • “Conversion rate by audience segment” → measures: conversions, clicks (or impressions); dimension: DimAudience.segment (add DimAudience if it’s a stable descriptive entity in your data)

Walkthrough C: Operations Process (Shipments)

Step 1: Identify business questions

  • What is the on-time delivery rate by carrier and warehouse?
  • What is average delivery time by shipping method?
  • How many shipments were delayed, and why?
  • What is shipping cost by region and product category?

Step 2: List the measures (facts)

  • Shipment count (rows)
  • Shipping cost
  • Delivery time (e.g., days from ship date to delivery date)
  • On-time flag (often used to compute on-time rate)
  • Units shipped (if available at shipment line level)

Step 3: List the descriptive attributes (dimensions)

  • Date/time: ship date, delivery date
  • Carrier: carrier name, service level
  • Warehouse: facility, city, region
  • Shipping method: ground/air/express
  • Destination geography: country/region/city
  • Delay reason: weather, capacity, address issue (if captured)
  • Product (optional): category, if shipment is at line level

Step 4: Map into candidate fact and dimension tables

Pick the grain based on what you need to analyze. If you need product-level shipping cost allocation, you may model shipment line. If not, one row per shipment is often sufficient.

Candidate Fact Table: FactShipment (grain: one row per shipment_id) Columns (examples): - shipment_id (degenerate identifier) - ship_date_key - delivery_date_key - carrier_key - warehouse_key - ship_method_key - destination_geo_key - delay_reason_key (nullable) Measures: - shipping_cost - delivery_days - on_time_flag
Candidate Dimension Tables: DimCarrier: carrier_key, carrier_name, service_level DimWarehouse: warehouse_key, warehouse_id, warehouse_name, city, region DimShipMethod: ship_method_key, method_name DimGeo: geo_key, country, region, city DimDelayReason: delay_reason_key, reason_category, reason_detail DimDate: date_key, calendar_date, week, month, quarter, year

Example mapping:

  • “On-time delivery rate by carrier and warehouse” → measure: on_time_flag aggregated as average (or sum(on_time)/count); dimensions: DimCarrier.carrier_name, DimWarehouse.warehouse_name
  • “Average delivery time by shipping method” → measure: delivery_days averaged; dimension: DimShipMethod.method_name

A Repeatable Template You Can Apply to Any Process

When you start a new subject area, use this checklist to avoid designing tables from columns alone:

1) Write the questions first

  • List 5–15 questions stakeholders ask repeatedly.
  • Rewrite each question in the form: measure(s) by dimension(s) over time.

2) Extract measures

  • Underline the numbers: totals, averages, rates, counts.
  • Decide which are stored measures (e.g., spend) vs derived measures (e.g., conversion rate = conversions/clicks).

3) Extract descriptive attributes

  • Underline the “by” fields: product, customer segment, channel, facility, carrier, geography, date.
  • Group attributes into entities that belong together (product attributes together, customer attributes together).

4) Propose candidate tables

  • Define the fact table grain in one sentence (e.g., “one row per order line”).
  • Put measures in the fact table.
  • Put descriptive attributes in dimension tables.
  • Connect the fact to dimensions using keys (e.g., product_key, date_key).

This workflow keeps the model aligned to how the business thinks and how analysts ask questions: numbers explained by context.

Now answer the exercise about the content:

In dimensional modeling, why is it recommended to start by choosing a business process (such as Orders, Ad Clicks, or Shipments) before designing fact and dimension tables?

You are right! Congratulations, now go to the next page

You missed! Try again.

Starting with a business process aligns the model to real analytics questions and clarifies the measures to store, the dimensions that describe them, and the fact table grain (e.g., one row per order line or per shipment).

Next chapter

Choosing the Grain: The Foundation of Reliable Metrics in Fact Tables

Arrow Right Icon
Free Ebook cover Data Modeling Fundamentals for BI: Star Schemas, Dimensions, and Facts
18%

Data Modeling Fundamentals for BI: Star Schemas, Dimensions, and Facts

New course

11 pages

Download the app to earn free Certification and listen to the courses in the background, even with the screen off.