Modeling Common Business Scenarios with Star Schemas: Sales, Marketing, and Operations

Capítulo 10

Estimated reading time: 9 minutes

+ Exercise

How to Apply Star Schemas to Common Business Scenarios

This chapter applies dimensional modeling through three end-to-end star schema examples: sales, marketing, and operations. Each example starts from business questions, states the grain explicitly, describes the schema, shows sample rows, and finishes with queries that aggregate correctly at the intended level.

Scenario 1: Sales Analytics (Order-Line Sales Fact)

Business questions to support

  • What is revenue, discount, and gross margin by product category and month?
  • Which channels drive the highest revenue per customer segment?
  • How do store performance and discounting vary by region and week?
  • What is average selling price (ASP) by product and channel?

Grain statement

One row per order line item (a specific product on a specific order), at the time the order line is booked (or shipped—choose one and keep it consistent). Each row represents the atomic event used for sales reporting.

Step-by-step: define the fact and dimensions

  • 1) Identify the business process: customer purchases (orders).
  • 2) Fix the grain: order-line grain (not order header; not daily summary).
  • 3) Choose dimensions that describe the line: Customer, Product, Date, Channel, Store.
  • 4) Choose measures captured at the same grain: quantity, revenue, discount amount, cost amount.
  • 5) Add degenerate identifiers (optional but common): OrderNumber, OrderLineNumber as text columns in the fact for drill-through.

Schema diagram description (star)

Central fact: Sales_Fact with foreign keys to Customer_Dim, Product_Dim, Date_Dim, Channel_Dim, and Store_Dim. Each dimension is a single table with descriptive attributes used for filtering, grouping, and hierarchies.

Customer_Dim   Product_Dim   Date_Dim   Channel_Dim   Store_Dim
      \            |           |           |            /
       \           |           |           |           /
                Sales_Fact (order-line grain)

Table sketch

Sales_Fact (example columns)

  • SalesFactKey (surrogate PK)
  • DateKey, CustomerKey, ProductKey, ChannelKey, StoreKey (FKs)
  • OrderNumber, OrderLineNumber (degenerate dimensions)
  • Quantity
  • RevenueAmount (extended price after line-level pricing rules, before discount if you track separately)
  • DiscountAmount
  • CostAmount

Dimension examples

Continue in our app.
  • Listen to the audio with the screen off.
  • Earn a certificate upon completion.
  • Over 5000 courses for you to explore!
Or continue reading below...
Download App

Download the app

  • Customer_Dim: CustomerName, Segment, Region, SignupDate, etc.
  • Product_Dim: SKU, ProductName, Brand, Category, Subcategory, etc.
  • Date_Dim: Date, Week, Month, Quarter, Year, FiscalMonth, etc.
  • Channel_Dim: ChannelName (Online, Retail, Partner), ChannelGroup, etc.
  • Store_Dim: StoreName, City, State, Region, StoreType, etc.

Sample rows

Sales_Fact

| SalesFactKey | DateKey  | CustomerKey | ProductKey | ChannelKey | StoreKey | OrderNumber | OrderLineNumber | Quantity | RevenueAmount | DiscountAmount | CostAmount |
|-------------:|---------:|------------:|-----------:|-----------:|---------:|------------|----------------:|---------:|--------------:|---------------:|-----------:|
| 100001       | 20250105 | 5012        | 20031      | 1          | 110      | SO-77821   | 1               | 2        | 80.00         | 10.00          | 44.00      |
| 100002       | 20250105 | 5012        | 20044      | 1          | 110      | SO-77821   | 2               | 1        | 25.00         | 0.00           | 14.00      |
| 100003       | 20250106 | 6120        | 20031      | 2          | 205      | SO-77855   | 1               | 3        | 120.00        | 15.00          | 66.00      |

Product_Dim

| ProductKey | SKU     | ProductName     | Brand  | Category   | Subcategory |
|----------:|---------|-----------------|--------|------------|------------|
| 20031     | SKU-31  | Trail Bottle 1L | Acme   | Outdoors   | Hydration  |
| 20044     | SKU-44  | Energy Bar Box  | Acme   | Nutrition  | Snacks     |

Example queries (correct aggregations)

1) Revenue, discount, cost, and gross margin by month and category

SELECT
  d.Year,
  d.Month,
  p.Category,
  SUM(f.RevenueAmount) AS Revenue,
  SUM(f.DiscountAmount) AS Discount,
  SUM(f.CostAmount) AS Cost,
  SUM(f.RevenueAmount - f.DiscountAmount - f.CostAmount) AS GrossMargin
FROM Sales_Fact f
JOIN Date_Dim d    ON f.DateKey = d.DateKey
JOIN Product_Dim p ON f.ProductKey = p.ProductKey
GROUP BY d.Year, d.Month, p.Category
ORDER BY d.Year, d.Month, p.Category;

2) Average selling price (ASP) by channel

SELECT
  c.ChannelName,
  SUM(f.RevenueAmount - f.DiscountAmount) / NULLIF(SUM(f.Quantity), 0) AS ASP
FROM Sales_Fact f
JOIN Channel_Dim c ON f.ChannelKey = c.ChannelKey
GROUP BY c.ChannelName
ORDER BY ASP DESC;

3) Store performance: avoid double counting by aggregating only fact measures

SELECT
  s.Region,
  s.StoreName,
  SUM(f.RevenueAmount - f.DiscountAmount) AS NetRevenue
FROM Sales_Fact f
JOIN Store_Dim s ON f.StoreKey = s.StoreKey
GROUP BY s.Region, s.StoreName
ORDER BY NetRevenue DESC;

Scenario 2: Marketing Analytics (Ad Performance Fact)

Business questions to support

  • How many impressions, clicks, and spend did each campaign generate by day?
  • Which creatives have the best click-through rate (CTR) for a given audience?
  • What is cost per acquisition (CPA) by campaign and week?
  • How does performance differ by audience segment across channels?

Grain statement

One row per campaign–creative–audience–date (and optionally per placement/publisher if needed). This is a common grain for ad platforms where metrics are reported as daily aggregates per entity combination.

Step-by-step: define the fact and dimensions

  • 1) Identify the business process: ad delivery and engagement.
  • 2) Fix the grain: daily performance at the intersection of Campaign, Creative, Audience, and Date.
  • 3) Choose dimensions: Campaign, Creative, Audience, Date (and optionally Channel/Platform if not embedded in Campaign).
  • 4) Choose base measures: impressions, clicks, spend, conversions (if available).
  • 5) Define derived metrics: CTR, CPC, CPA computed in queries/semantic layer (not stored as additive measures).

Schema diagram description (star)

Central fact: Ad_Performance_Fact with foreign keys to Campaign_Dim, Creative_Dim, Audience_Dim, and Date_Dim. Measures are daily totals for that combination.

Campaign_Dim   Creative_Dim   Audience_Dim   Date_Dim
      \            |              |            /
       \           |              |           /
              Ad_Performance_Fact (daily grain)

Table sketch

Ad_Performance_Fact (example columns)

  • DateKey, CampaignKey, CreativeKey, AudienceKey (FKs)
  • Impressions
  • Clicks
  • SpendAmount
  • Conversions (if tracked; define conversion type consistently)

Derived metrics (computed)

  • CTR = Clicks / Impressions
  • CPC = SpendAmount / Clicks
  • CPA = SpendAmount / Conversions

Sample rows

| DateKey  | CampaignKey | CreativeKey | AudienceKey | Impressions | Clicks | SpendAmount | Conversions |
|---------:|------------:|------------:|------------:|------------:|-------:|------------:|------------:|
| 20250105 | 9001        | 300         | 700         | 120000      | 2400   | 1800.00     | 120         |
| 20250105 | 9001        | 301         | 700         | 80000       | 1200   | 950.00      | 60          |
| 20250106 | 9002        | 305         | 710         | 50000       | 600    | 700.00      | 20          |

Example queries (correct aggregations)

1) Daily campaign rollup with CTR and CPC (ratio of sums, not average of ratios)

SELECT
  d.Date,
  c.CampaignName,
  SUM(f.Impressions) AS Impressions,
  SUM(f.Clicks) AS Clicks,
  SUM(f.SpendAmount) AS Spend,
  SUM(f.Clicks) / NULLIF(SUM(f.Impressions), 0) AS CTR,
  SUM(f.SpendAmount) / NULLIF(SUM(f.Clicks), 0) AS CPC
FROM Ad_Performance_Fact f
JOIN Date_Dim d      ON f.DateKey = d.DateKey
JOIN Campaign_Dim c  ON f.CampaignKey = c.CampaignKey
GROUP BY d.Date, c.CampaignName
ORDER BY d.Date, c.CampaignName;

2) Creative performance by audience segment (correctly aggregated)

SELECT
  a.AudienceSegment,
  cr.CreativeName,
  SUM(f.Impressions) AS Impressions,
  SUM(f.Clicks) AS Clicks,
  SUM(f.Clicks) / NULLIF(SUM(f.Impressions), 0) AS CTR
FROM Ad_Performance_Fact f
JOIN Audience_Dim a ON f.AudienceKey = a.AudienceKey
JOIN Creative_Dim cr ON f.CreativeKey = cr.CreativeKey
GROUP BY a.AudienceSegment, cr.CreativeName
HAVING SUM(f.Impressions) >= 10000
ORDER BY CTR DESC;

3) CPA by week and campaign (requires conversions)

SELECT
  d.Year,
  d.Week,
  c.CampaignName,
  SUM(f.SpendAmount) AS Spend,
  SUM(f.Conversions) AS Conversions,
  SUM(f.SpendAmount) / NULLIF(SUM(f.Conversions), 0) AS CPA
FROM Ad_Performance_Fact f
JOIN Date_Dim d     ON f.DateKey = d.DateKey
JOIN Campaign_Dim c ON f.CampaignKey = c.CampaignKey
GROUP BY d.Year, d.Week, c.CampaignName
ORDER BY d.Year, d.Week, c.CampaignName;

Scenario 3: Operations Analytics (Shipments or Inventory Snapshots)

Pick the right operational fact type

Operations commonly needs two different analytical views: event facts (shipments, receipts, picks) and snapshot facts (inventory on hand at a point in time). The modeling choice affects how measures aggregate over time.

  • Shipment_Fact: one row per shipment line event; measures like shipped quantity are additive across time.
  • Inventory_Snapshot_Fact: one row per product-location-day (or week/month) snapshot; measures like on-hand quantity are semi-additive (additive across product/location, not across time).

Business questions to support

  • Shipments: How many units shipped by warehouse and day? What is average delivery time by carrier?
  • Inventory: What was on-hand inventory by day and warehouse? What is average on-hand over a month? How many days of supply do we have?

Option A: Shipment_Fact (event-based)

Grain statement

One row per shipment line (a specific product shipped from a specific warehouse to a destination on a ship date). If partial shipments occur, each shipment line is its own row.

Schema diagram description (star)

Central fact: Shipment_Fact with foreign keys to Date_Dim (ship date), Product_Dim, Warehouse_Dim, Carrier_Dim, and optionally Customer_Dim (destination) or Destination_Dim.

Product_Dim  Warehouse_Dim  Carrier_Dim  Date_Dim  (Destination/Customer_Dim)
     \           |             |           |                 /
      \          |             |           |                /
                     Shipment_Fact (shipment-line grain)

Sample rows

| ShipDateKey | ProductKey | WarehouseKey | CarrierKey | DestinationKey | ShipmentNumber | ShipmentLine | ShippedQty | FreightAmount |
|------------:|-----------:|-------------:|-----------:|---------------:|----------------|-------------:|-----------:|--------------:|
| 20250105    | 20031      | 10           | 3          | 5012           | SH-90011       | 1            | 2          | 6.50          |
| 20250105    | 20044      | 10           | 3          | 5012           | SH-90011       | 2            | 1          | 2.75          |
| 20250106    | 20031      | 12           | 2          | 6120           | SH-90045       | 1            | 3          | 8.10          |

Example queries (correct aggregations)

1) Units shipped by day and warehouse

SELECT
  d.Date,
  w.WarehouseName,
  SUM(f.ShippedQty) AS UnitsShipped
FROM Shipment_Fact f
JOIN Date_Dim d       ON f.ShipDateKey = d.DateKey
JOIN Warehouse_Dim w  ON f.WarehouseKey = w.WarehouseKey
GROUP BY d.Date, w.WarehouseName
ORDER BY d.Date, w.WarehouseName;

2) Freight per unit by carrier (ratio of sums)

SELECT
  c.CarrierName,
  SUM(f.FreightAmount) AS Freight,
  SUM(f.ShippedQty) AS Units,
  SUM(f.FreightAmount) / NULLIF(SUM(f.ShippedQty), 0) AS FreightPerUnit
FROM Shipment_Fact f
JOIN Carrier_Dim c ON f.CarrierKey = c.CarrierKey
GROUP BY c.CarrierName
ORDER BY FreightPerUnit DESC;

Option B: Inventory_Snapshot_Fact (semi-additive over time)

Grain statement

One row per product–warehouse–date snapshot representing inventory position at end-of-day (or start-of-day; choose one). This is a periodic snapshot used to analyze stock levels and trends.

Schema diagram description (star)

Central fact: Inventory_Snapshot_Fact with foreign keys to Date_Dim, Product_Dim, and Warehouse_Dim. Measures include on-hand quantity and inventory value. These measures are additive across product and warehouse, but not across time.

Product_Dim   Warehouse_Dim   Date_Dim
     \            |           /
      \           |          /
        Inventory_Snapshot_Fact (daily snapshot grain)

Sample rows

| DateKey  | ProductKey | WarehouseKey | OnHandQty | OnHandValue |
|---------:|-----------:|-------------:|----------:|------------:|
| 20250105 | 20031      | 10           | 120       | 2640.00     |
| 20250106 | 20031      | 10           | 115       | 2530.00     |
| 20250105 | 20044      | 10           | 80        | 1120.00     |

Semi-additive measures: what to do in queries

  • Do not sum OnHandQty across dates to answer “how much inventory did we have in January?” That would multiply the same stock position by the number of days.
  • Use ending balance (last snapshot in period), average balance (average of daily snapshots), or min/max depending on the question.

Example queries (correct aggregations)

1) Ending on-hand by month (use last date in month per product/warehouse)

WITH MonthLastDate AS (
  SELECT
    d.Year,
    d.Month,
    MAX(d.DateKey) AS LastDateKey
  FROM Date_Dim d
  GROUP BY d.Year, d.Month
)
SELECT
  m.Year,
  m.Month,
  w.WarehouseName,
  p.Category,
  SUM(f.OnHandQty) AS EndingOnHandQty
FROM MonthLastDate m
JOIN Inventory_Snapshot_Fact f
  ON f.DateKey = m.LastDateKey
JOIN Warehouse_Dim w ON f.WarehouseKey = w.WarehouseKey
JOIN Product_Dim p   ON f.ProductKey = p.ProductKey
GROUP BY m.Year, m.Month, w.WarehouseName, p.Category
ORDER BY m.Year, m.Month, w.WarehouseName, p.Category;

2) Average daily on-hand over a month (average of snapshots)

SELECT
  d.Year,
  d.Month,
  w.WarehouseName,
  p.Category,
  AVG(f.OnHandQty) AS AvgDailyOnHandQty
FROM Inventory_Snapshot_Fact f
JOIN Date_Dim d      ON f.DateKey = d.DateKey
JOIN Warehouse_Dim w ON f.WarehouseKey = w.WarehouseKey
JOIN Product_Dim p   ON f.ProductKey = p.ProductKey
GROUP BY d.Year, d.Month, w.WarehouseName, p.Category
ORDER BY d.Year, d.Month, w.WarehouseName, p.Category;

3) Point-in-time inventory by day (additive across product/warehouse at that date)

SELECT
  d.Date,
  w.WarehouseName,
  SUM(f.OnHandQty) AS OnHandQty
FROM Inventory_Snapshot_Fact f
JOIN Date_Dim d      ON f.DateKey = d.DateKey
JOIN Warehouse_Dim w ON f.WarehouseKey = w.WarehouseKey
WHERE d.Date = DATE '2025-01-06'
GROUP BY d.Date, w.WarehouseName
ORDER BY w.WarehouseName;

Now answer the exercise about the content:

When modeling inventory with a daily Inventory_Snapshot_Fact, what is the correct way to answer “how much inventory did we have in January?” without overstating the result?

You are right! Congratulations, now go to the next page

You missed! Try again.

OnHandQty in a snapshot fact is semi-additive: it can be summed across products/warehouses but not across time. For a month, use the last snapshot (ending balance) or an average/min/max of daily snapshots, not a sum across dates.

Next chapter

Improving Performance and Trust: Modeling Practices That Keep Reporting Consistent

Arrow Right Icon
Free Ebook cover Data Modeling Fundamentals for BI: Star Schemas, Dimensions, and Facts
91%

Data Modeling Fundamentals for BI: Star Schemas, Dimensions, and Facts

New course

11 pages

Download the app to earn free Certification and listen to the courses in the background, even with the screen off.