What Makes a Dimension “Good” for BI
A dimension table provides business context for measures in fact tables. A good dimension makes it easy for users to filter, group, and drill down without ambiguity. In practice, strong dimensions share three traits: rich descriptive attributes, stable keys, and intuitive hierarchies.
Rich descriptive attributes (the “language” of analysis)
Dimensions should contain the descriptive fields people naturally use in questions and report filters: names, categories, segments, statuses, flags, and derived groupings. The goal is to reduce the need for ad-hoc logic in reports by storing commonly used descriptors once, consistently.
- Good: Product has brand, category, subcategory, size, color, lifecycle status, and launch date.
- Weak: Product has only product_id and product_name, forcing analysts to join to multiple lookup tables or re-create groupings in every report.
Stable keys (the “anchor” for consistent joins)
Dimensions should be joined to facts using stable surrogate keys (typically integer keys generated in the warehouse). Natural keys from source systems (like customer_number or SKU) can change, be reused, or differ across systems. Surrogate keys protect reporting consistency and support change tracking in attributes over time.
- Surrogate key: customer_key (warehouse-generated)
- Business/natural key: customer_id (from CRM), may change or collide across sources
Intuitive hierarchies (the “paths” for slicing and drill-down)
Hierarchies are ordered levels that match how the business thinks and how users drill from summary to detail. Common patterns include:
- Product: Category → Subcategory → SKU
- Geography: Region → Country → State/Province → City
- Organization: Division → Department → Team
Hierarchies should be explicit in the dimension (separate columns for each level) and named clearly so users can find them quickly in BI tools.
- Listen to the audio with the screen off.
- Earn a certificate upon completion.
- Over 5000 courses for you to explore!
Download the app
Organizing Attributes for Usability
Group attributes into logical sections
Even though a dimension is a single table, you can design it to feel organized by grouping columns conceptually. This improves discoverability in semantic layers and reduces confusion for report authors.
- Identity: business key, name, display label, codes
- Classification: segment, category/subcategory, type, tier
- Geography: region, country, state, city, postal code
- Status and lifecycle: active flag, start/end dates, churn reason
- Operational metadata (optional): source system, record status, last updated timestamp
Naming conventions that match business language
Use names that are readable and consistent across dimensions. Avoid cryptic abbreviations and overloaded terms.
- Prefer customer_segment over seg_cd.
- Prefer product_subcategory over subcategory (more explicit in wide models).
- Use consistent suffixes: _key for surrogate keys, _id for business keys, _name for labels, _flag for booleans.
Design for “Unknown”, “Other”, and missing values
Facts often arrive with incomplete dimension references (late-arriving data, missing codes, new members not yet loaded). To keep reporting stable and avoid dropping rows in joins, include standard placeholder members in each dimension.
- Unknown: the source did not provide the value (e.g., customer not captured).
- Not applicable: the attribute does not apply (e.g., campaign for organic traffic).
- Other: value exists but is grouped for reporting (e.g., long tail of small channels).
Common practice is to reserve surrogate keys for these rows (for example, 0 = Unknown, -1 = Not applicable), but the exact values depend on your warehouse standards.
Make hierarchies consistent and drill-friendly
To support drill-down, store each hierarchy level as its own attribute and ensure values are complete and aligned. For example, if a product has a subcategory, it must also have a category. If a city is present, state and country should also be present.
- Use a single, consistent hierarchy per analysis purpose (avoid mixing marketing and merchandising hierarchies in the same set of columns unless clearly separated).
- Keep level names consistent: region, country, state_province, city.
- Provide both codes and names when users need them: country_code and country_name.
Step-by-Step: Designing a Dimension Table
Step 1: List the questions the dimension must answer
Start from how the business slices metrics. Write down the filters and groupings users expect.
- Customer: “Revenue by customer segment and region”, “Churn by acquisition channel”
- Product: “Margin by category and brand”, “Units sold by size and color”
- Campaign: “Conversions by channel and objective”, “ROI by audience and creative”
- Date: “Sales by fiscal period”, “Week-over-week growth”
Step 2: Define the natural key and choose a surrogate key
Identify the best business identifier(s) from source systems and store them as attributes, but join facts using a surrogate key.
- Customer: natural key might be CRM customer_id; store it as customer_id, join on customer_key.
- Product: natural key might be SKU; store as sku, join on product_key.
Step 3: Draft attributes and hierarchies
Include attributes that drive filtering and grouping. Add hierarchy levels explicitly and ensure they are business-approved.
- Product hierarchy: product_category, product_subcategory, sku
- Geography hierarchy: region, country, state_province
Step 4: Standardize values and handle exceptions
Decide how to format names (case, punctuation), how to handle blanks, and how to map rare values to “Other” when needed for reporting.
- Normalize country names (e.g., “United States” vs “USA”).
- Define allowed values for status fields (e.g., Active/Inactive).
- Create mapping rules for channel rollups (e.g., “Paid Social” includes multiple platform codes).
Step 5: Validate with real report scenarios
Test the dimension against typical BI actions: filtering, grouping, drill-down, and cross-report consistency.
- Can a user filter by customer_segment and get expected totals?
- Does drilling Product Category → Subcategory → SKU produce a sensible path?
- Do two reports using “Region” produce the same grouping?
Example: Customer Dimension (Filtering and Consistent Segmentation)
A Customer dimension should support segmentation, geography slicing, and lifecycle analysis. It often becomes a shared definition across many reports, so naming and standardization matter.
dim_customer (example columns) customer_key (PK) customer_id (business key) customer_name customer_type (B2B/B2C) customer_segment (e.g., Enterprise, SMB, Consumer) acquisition_channel (e.g., Paid Search, Partner, Organic) region country state_province city postal_code industry account_owner is_active_flag created_date unknown_member_flagHow it supports BI:
- Filtering: customer_segment = “SMB”, is_active_flag = true
- Grouping: revenue by industry, churn by acquisition_channel
- Drill-down: region → country → state_province
- Consistency: “Enterprise” means the same segment across all dashboards because it is defined once in the dimension.
Practical notes for Customer attributes
- Prefer a single “customer_name” used in reports; keep alternate names in separate columns if needed (legal_name, trading_name).
- If multiple systems define segments differently, store both (e.g., segment_crm, segment_bi_standard) and make the standard one the default for reporting.
- Include “Unknown” customer row to prevent losing fact rows when customer_id is missing.
Example: Product Dimension (Category Rollups and SKU Detail)
Product analysis typically requires both rollups (category performance) and detailed SKU-level views. The dimension should make those rollups obvious and stable.
dim_product (example columns) product_key (PK) sku (business key) product_name brand_name product_category product_subcategory product_line size color material package_type lifecycle_status (Active/Discontinued) launch_date is_bundle_flagHow it supports BI:
- Filtering: brand_name = “Acme”, lifecycle_status = “Active”
- Grouping: margin by product_category, units by size
- Drill-down: product_category → product_subcategory → sku
Practical notes for Product hierarchies
- Keep hierarchy levels business-friendly and stable; avoid using internal merchandising codes as the primary reporting fields unless paired with readable names.
- If a product can belong to multiple categories, decide on a primary reporting hierarchy or model a separate bridge; do not silently duplicate products in the dimension.
- Use “Other” subcategory when long-tail subcategories clutter reports, but keep the original subcategory in a separate attribute if analysts still need it.
Example: Campaign Dimension (Marketing Context and Attribution-Friendly Slicing)
A Campaign dimension provides context for marketing performance analysis. It should align with how marketing teams plan and evaluate initiatives: channel, objective, audience, and creative.
dim_campaign (example columns) campaign_key (PK) campaign_id (business key) campaign_name channel (Email, Paid Social, Search, Display, Partner) channel_group (Paid, Owned, Earned) objective (Acquisition, Retention, Awareness) audience_segment geo_target_region start_date end_date creative_theme landing_page_type is_evergreen_flag status (Planned/Active/Ended)How it supports BI:
- Filtering: channel_group = “Paid”, objective = “Acquisition”
- Grouping: conversions by channel, ROI by audience_segment
- Drill-down: channel_group → channel → campaign_name
Practical notes for Campaign usability
- Store both granular and rolled-up channel fields (channel and channel_group) to avoid repeated mapping logic in reports.
- Define “Not applicable” campaign for events that are not campaign-driven (e.g., organic signups) so facts can still join cleanly.
- Use consistent naming rules for campaign_name (or store a cleaned display name) so report filters are readable.
Example: Date Dimension (Time Intelligence Without Repeated Logic)
A Date dimension is a usability accelerator: it centralizes calendar and fiscal logic so every report uses the same definitions for weeks, months, quarters, and holidays.
dim_date (example columns) date_key (PK, e.g., 20260116) date (YYYY-MM-DD) day_of_week_name day_of_week_number week_start_date week_of_year month_number month_name month_start_date quarter_number year_number fiscal_month_number fiscal_quarter_number fiscal_year_number is_weekend_flag is_holiday_flagHow it supports BI:
- Filtering: fiscal_year_number = 2026, is_weekend_flag = false
- Grouping: sales by month_name, pipeline by fiscal_quarter_number
- Drill-down: year → quarter → month → date
- Consistency: “Week” and “Fiscal Quarter” mean the same thing across all dashboards because they come from one shared table.
Practical notes for Date design
- Include both calendar and fiscal attributes if the business uses both; keep names explicit (fiscal_year_number vs year_number).
- Use a numeric date_key for efficient joins and a date column for readability.
- Precompute commonly needed flags (is_month_end_flag, is_current_month_flag) only if they are stable and well-defined; avoid volatile “current” flags if your refresh cadence makes them confusing.
Making Dimensions Work Together Across Reports
Use shared definitions and avoid “report-only” logic
Dimensions are most valuable when they enforce consistent definitions across the organization. If a segmentation rule or channel rollup is important, encode it as a dimension attribute rather than re-implementing it in multiple reports.
- Standardize segment labels (e.g., Enterprise/SMB) in dim_customer.
- Standardize channel_group in dim_campaign.
- Standardize region/country naming in all dimensions that carry geography.
Design for discoverability in BI tools
Users browse fields. Make the “right” fields easy to find and hard to misuse.
- Prefer clear, business-facing names for default fields (product_category, customer_segment).
- Keep technical fields (source_system_code, raw_status_code) but separate them by naming so they are not mistaken for reporting fields.
- When two attributes sound similar, rename to clarify intent (e.g., customer_region vs sales_region if they differ).