What a Slowly Changing Dimension (SCD) Solves
In analytics, dimension attributes are not always stable. A customer can move from “Prospect” to “Enterprise,” a sales rep can be reassigned to a new territory, and a product can be reclassified into a different category. If you simply update the dimension row, historical reports can silently change because old facts will now join to the new attribute values.
A Slowly Changing Dimension (SCD) is a set of patterns for managing attribute changes in a dimension so you can choose the reporting outcome you want: preserve history, ignore history, or keep limited history. The key decision is whether past facts should be reported using the attribute value that was true at the time of the event.
Common SCD Types and Their Reporting Outcomes
Type 1: Overwrite (No History)
Type 1 updates the dimension attribute in place. There is only one row per business entity, and the latest value is used for all reporting, including historical facts.
- When it’s useful: correcting data errors (typos, formatting), or attributes where history is not meaningful for analysis.
- Outcome: historical reports “shift” to the newest attribute value.
- Example: fixing a customer’s misspelled company name; you want all past reports to show the corrected name.
Type 2: Add History Rows (Full History)
Type 2 creates a new dimension row when tracked attributes change. Each version of the entity is stored with effective dates (and often a current flag). Facts link to the correct version so reports reflect the attribute values that were true at the time of the event.
- When it’s useful: customer segment changes, territory realignments, product category changes, any attribute where “as-was” reporting matters.
- Outcome: historical facts remain tied to historical attributes; new facts use the new attributes.
Type 0: Fixed Attribute (No Changes Allowed)
Type 0 treats an attribute as immutable after initial load. If the source changes later, you ignore it for that attribute.
- Listen to the audio with the screen off.
- Earn a certificate upon completion.
- Over 5000 courses for you to explore!
Download the app
- When it’s useful: attributes that should never change for reporting purposes (e.g., “original acquisition channel” if you define it as the first-touch channel).
- Outcome: reports always show the original value, even if the operational system changes it.
Type 3: Limited History (Previous Value)
Type 3 stores the current value and one (or a small number) of prior values in additional columns (e.g., CurrentTerritory, PreviousTerritory). It does not provide full history across many changes.
- When it’s useful: you only need “before vs after” analysis around the most recent change.
- Outcome: you can report current and immediate prior values, but not a full timeline.
Type 2 in Practice: A Step-by-Step Pattern
Type 2 is the most common approach when you need stable historical reporting. The pattern below assumes you track changes for a defined set of attributes (e.g., Segment, Territory, Category) and treat other attributes as Type 1 or non-tracked.
1) Decide Which Attributes Are Type 2 Tracked
Not every attribute needs history. Choose attributes that materially affect analysis. For example:
- Marketing: CustomerSegment, LifecycleStage, AccountTier
- Sales: SalesTerritory, Region, AssignedRep
- Product: ProductCategory, BrandFamily
Document the expected reporting behavior: “Orders should be attributed to the customer segment at order date” or “Revenue should follow territory at time of booking.”
2) Detect Changes Against the Current Dimension Version
During each load, compare the incoming source attributes to the current active dimension row for that business entity. Typical approaches:
- Direct column comparison: compare each tracked attribute.
- Hash comparison: compute a hash of tracked attributes in source and compare to stored hash for the current row.
Only create a new Type 2 row when a tracked attribute changes. Non-tracked attribute changes can be handled as Type 1 updates on the current row (if desired).
-- Pseudocode concept: identify changed entities (Type 2 attributes only)SELECT s.BusinessIdFROM Staging sJOIN DimEntity d ON d.BusinessId = s.BusinessId AND d.IsCurrent = 1WHERE (s.Segment <> d.Segment OR s.Territory <> d.Territory);3) Expire the Current Record
When a change is detected, “close” the current row by setting its expiry date and current flag.
- EffectiveDate: when this version became valid
- ExpiryDate: when this version stopped being valid
- IsCurrent: 1 for active row, 0 for expired rows
Use a consistent rule for dates. A common approach is:
- New version EffectiveDate = change date (or load date if that’s all you have)
- Old version ExpiryDate = new EffectiveDate minus 1 day (for date-grain validity) or equal to new EffectiveDate (for timestamp-grain validity)
-- Expire current row (date-grain example)UPDATE DimCustomerSET ExpiryDate = @ChangeDate - 1, IsCurrent = 0WHERE BusinessCustomerId = @BusinessCustomerId AND IsCurrent = 1;4) Insert the New Version Row
Insert a new row with the updated tracked attributes and mark it as current. Set its EffectiveDate and a high ExpiryDate (or NULL) to represent “open-ended.”
INSERT INTO DimCustomer (BusinessCustomerId, Segment, Territory, EffectiveDate, ExpiryDate, IsCurrent)VALUES (@BusinessCustomerId, @NewSegment, @NewTerritory, @ChangeDate, '9999-12-31', 1);5) Manage Effective/Expiry Dates Carefully
Date logic is where many SCD implementations break reporting. Decide and enforce:
- Date vs timestamp: if facts have timestamps, consider timestamp effective/expiry to avoid same-day ambiguity.
- Change date source: prefer a reliable “valid from” date from the source; otherwise use load date and accept that changes are recognized when loaded.
- No overlaps, no gaps (for the same entity): ensure each entity has exactly one current row and historical ranges do not overlap.
Validation checks you can run after loads:
- Exactly one IsCurrent=1 row per business entity
- No overlapping effective/expiry ranges per business entity
- ExpiryDate >= EffectiveDate for all rows
6) Link Facts to the Correct Historical Dimension Row
To preserve history, each fact row must join to the dimension version that was valid at the fact’s event date/time. There are two common strategies:
- Lookup at load time (recommended): when loading facts, find the correct dimension row and store its identifier in the fact. This makes reporting fast and consistent.
- As-of join at query time: join facts to dimensions using event date between EffectiveDate and ExpiryDate. This is flexible but can be slower and more complex.
Load-time lookup example (conceptual):
-- Find the correct dimension version for each fact by event dateSELECT f.StagingFactId, d.DimCustomerIdFROM StagingFact fJOIN DimCustomer d ON d.BusinessCustomerId = f.BusinessCustomerId AND f.EventDate BETWEEN d.EffectiveDate AND d.ExpiryDate;Once the correct dimension row is found, store that dimension row identifier in the fact record. This ensures that when the customer changes segment later, old facts still point to the older segment version.
Worked Example 1: Marketing Segment Changes
Scenario: Marketing assigns each customer a segment used for campaign performance reporting. A customer can move between segments over time (e.g., “SMB” to “Mid-Market”). You want historical campaign responses to remain attributed to the segment at the time of the response.
Dimension Versions
- Customer A is “SMB” from 2025-01-01 to 2025-03-14
- Customer A becomes “Mid-Market” starting 2025-03-15
DimCustomer (simplified)DimCustomerId | BusinessCustomerId | Segment | EffectiveDate | ExpiryDate | IsCurrent101 | A | SMB | 2025-01-01 | 2025-03-14 | 0102 | A | Mid-Market | 2025-03-15 | 9999-12-31 | 1Fact Linking Outcome
CampaignResponse facts on 2025-02-10 link to DimCustomerId 101 (SMB). Responses on 2025-04-01 link to DimCustomerId 102 (Mid-Market). Reports by segment over time remain stable and reflect the segment at response time.
Worked Example 2: Sales Territory Realignment
Scenario: Sales leadership realigns territories quarterly. You need bookings and pipeline to be attributed to the territory assignment at the time of booking, not the current territory.
What Changes and Why Type 2 Helps
If you overwrite territory (Type 1), all historical bookings appear under the new territory, which can distort trend analysis and compensation reporting. With Type 2, each territory assignment becomes a new dimension version.
Step-by-Step Realignment Flow
- Detect: Customer B territory changes from “West” to “Central” effective 2025-07-01.
- Expire: close the “West” row with ExpiryDate = 2025-06-30.
- Insert: add a new “Central” row effective 2025-07-01 and mark current.
- Fact load: bookings dated 2025-06-15 link to the “West” version; bookings dated 2025-07-10 link to the “Central” version.
DimAccount (simplified)DimAccountId | BusinessAccountId | Territory | EffectiveDate | ExpiryDate | IsCurrent201 | B | West | 2025-01-01 | 2025-06-30 | 0202 | B | Central | 2025-07-01 | 9999-12-31 | 1Operational Tips and Common Pitfalls
Choose the Right “Change Date”
If the source provides an “effective from” date for segment/territory, use it. If not, you are effectively modeling “when the warehouse learned about the change,” which may be acceptable but should be understood by report consumers.
Handle Multiple Changes in a Single Load Window
If an entity can change multiple times between loads, you may receive only the latest state. Without a change log, you cannot reconstruct intermediate history. If intermediate history matters, capture change events from the source or implement a snapshot/change-data feed.
Be Explicit About Which Attributes Are Type 1 vs Type 2
Mixing approaches is normal. For example, you might track Segment as Type 2 but treat EmailAddress as Type 1 (always show the latest). Make this explicit so users understand why some attributes are “as-was” and others are “as-is.”
Ensure Facts Use the Correct Version
The most common reporting break is when facts are linked to the current dimension row instead of the correct historical row. If you see historical metrics shifting after a dimension change, verify the fact load lookup logic and the effective/expiry ranges.