Power BI Essentials: Combining Data with Merge and Append

Capítulo 5

Estimated reading time: 8 minutes

+ Exercise

Why Combine Data in Power Query?

In Power BI, you often need a single, analysis-ready table even when the source data arrives in pieces: separate monthly exports, multiple regions, or a fact table that needs descriptive attributes from lookup tables. Power Query gives you two primary tools to build a unified dataset:

  • Append: stacks tables with the same structure (adds rows).
  • Merge: joins related tables using keys (adds columns).

A reliable workflow is: append similar tables first to create one consolidated fact table, then merge in related attributes (customers, products, calendar, etc.).

Append: Stack Similar Tables (Monthly/Regional Files)

When to Use Append

Use append when tables represent the same “kind of thing” (same grain), such as:

  • Sales transactions for January, February, March (same columns).
  • Tickets from multiple regions where each region exports the same schema.
  • Survey responses collected in batches with consistent fields.

Append does not match rows by keys; it simply places one table under another. That means column consistency is critical.

Step-by-Step: Append Monthly Sales Tables

Scenario: You have queries Sales_2025_01, Sales_2025_02, Sales_2025_03 with columns: OrderID, OrderDate, CustomerID, ProductID, Quantity, UnitPrice.

Continue in our app.
  • Listen to the audio with the screen off.
  • Earn a certificate upon completion.
  • Over 5000 courses for you to explore!
Or continue reading below...
Download App

Download the app

  • In Power Query, select one of the monthly queries (or go to the Home tab).
  • Choose Append Queries (or Append Queries as New to create a new consolidated query).
  • Select Three or more tables, add all monthly tables, and confirm.
  • Name the result something like Sales_AllMonths.

Common Append Issues (and How to Prevent Them)

  • Missing columns in some months: Power Query will create nulls for missing fields. Decide whether that is acceptable or whether you need to add the missing column to those months before appending.
  • Same column, different name (e.g., CustID vs CustomerID): standardize column names in each monthly query before appending.
  • Same column, different meaning: do not append if the grain differs (e.g., one file is order lines, another is order headers). Fix the source or model separately.
  • Unexpected extra columns: remove irrelevant columns prior to append so the unified table stays stable.

Practical Technique: Keep a Stable Column Set

If monthly files sometimes include extra columns, you can explicitly select only the columns you want in each monthly query before appending. This reduces schema drift and makes refreshes more predictable.

Merge: Join Related Tables (Add Attributes by Key)

When to Use Merge

Use merge when you want to enrich a table with columns from another table based on matching keys. Typical examples:

  • Add Customer Name and Segment to Sales using CustomerID.
  • Add Product Category to Sales using ProductID.
  • Add Exchange Rates to Sales using Date + Currency.

Merge is a Power Query step; it creates a nested table column that you then expand to bring in the needed fields.

Choosing Good Join Keys (and Making Them Match)

Selecting Keys

A key should uniquely identify the related record in the lookup table and be present in both tables. Examples:

  • Sales (many rows) to Customers (one row per CustomerID): join on CustomerID.
  • Sales to Products: join on ProductID.
  • Sales to Rates: join on OrderDate + CurrencyCode (composite key).

Reduce Errors: Ensure Key Cleanliness

Most merge problems come from keys that look similar but do not match exactly. Before merging, verify:

  • Whitespace: leading/trailing spaces cause mismatches. Trim keys in both tables.
  • Case: text keys may differ by case. Standardize (e.g., uppercase) if needed.
  • Hidden characters: non-printing characters can break joins. Clean text if you suspect this.
  • Consistent formatting: “00123” vs “123” will not match as text. Decide on a standard representation.

Only create helper keys when you cannot join reliably with existing fields.

Reduce Errors: Remove Duplicates Where Appropriate

For lookup tables (dimension-like tables), duplicates in the key column often indicate data quality issues and can cause ambiguous matches. If the business rule is “one row per key,” remove duplicates on the key column (or fix upstream) before merging.

For fact tables (transaction tables), duplicates are often valid (multiple sales lines for the same CustomerID). Do not remove duplicates there unless you are certain duplicates are erroneous.

Practical Merge Scenario 1: Sales + Customers (Single Key)

Goal

Enrich Sales_AllMonths with CustomerName and CustomerSegment from Customers.

Step-by-Step

  • Open Sales_AllMonths in Power Query.
  • Select Home > Merge Queries (or Merge as New if you want a separate query).
  • Choose the second table: Customers.
  • Select the join columns: click CustomerID in Sales_AllMonths, then click CustomerID in Customers.
  • Choose a join kind (start with Left Outer for enrichment).
  • Confirm to create a new column containing a nested table (often named Customers).
  • Click the expand icon on the nested table column, select CustomerName and CustomerSegment, and uncheck “Use original column name as prefix” if you want cleaner names.

Join Type Guidance (What to Choose and Why)

  • Left Outer: keep all sales rows; bring customer attributes where matches exist. Best default for enriching a fact table.
  • Inner: keep only sales rows that have a matching customer. Useful for diagnosing missing keys, but risky as a final step if you might drop valid sales.
  • Right Outer: less common in this direction; would keep all customers and only matching sales.
  • Full Outer: useful for reconciliation; shows non-matching rows from both sides.
  • Anti joins (Left Anti / Right Anti): excellent for finding mismatches (e.g., sales with unknown customers).

Validate the Merge Result

After expanding, validate that the merge behaved as expected:

  • Check for nulls in CustomerName/Segment. Nulls indicate unmatched CustomerID values.
  • Use a Left Anti join (Sales to Customers) to list sales rows with CustomerIDs not found in Customers.
  • Confirm row count of Sales_AllMonths did not change when using Left Outer. If it increased, you likely joined to a table with duplicate keys, causing row multiplication.

Practical Merge Scenario 2: Sales + Products (Duplicate-Key Risk)

Problem Pattern: Row Multiplication

If Products contains multiple rows per ProductID (e.g., multiple active records, multiple languages, multiple price lists), a merge can duplicate sales rows when expanded, inflating totals.

Step-by-Step Defensive Approach

  • In Products, check whether ProductID is unique. If the business rule expects uniqueness, remove duplicates on ProductID (or filter to the correct record set, such as Active = true).
  • Merge Sales_AllMonths to Products on ProductID using Left Outer.
  • Expand only the columns you need (e.g., ProductName, Category).
  • Immediately validate totals (see validation section below) to ensure no inflation occurred.

Practical Merge Scenario 3: Composite Keys (Date + Currency)

When Composite Keys Are Necessary

Some relationships require more than one column to uniquely identify a match. Example: Exchange rates might be unique by (RateDate, CurrencyCode). In that case, you should merge using both fields.

Step-by-Step: Merge on Multiple Columns

  • In Sales_AllMonths, ensure you have a date column aligned to the rate table’s date grain (e.g., OrderDate as a date, not datetime).
  • Merge Sales_AllMonths with Rates.
  • Select OrderDate and CurrencyCode in Sales_AllMonths, then select the corresponding columns in Rates in the same order.
  • Use Left Outer to keep all sales rows.
  • Expand the Rate column (e.g., FXRate) and compute converted amounts if needed.

Helper Keys: Create Only When Needed

If you cannot reliably select multiple columns (or you need a single join field for operational reasons), you can create a helper key by concatenating cleaned fields. Do this only when necessary, and ensure separators prevent collisions.

// Example helper key: Date + Currency (ensure both are standardized first)  JoinKey = Date.ToText([OrderDate], "yyyy-MM-dd") & "|" & Text.Upper(Text.Trim([CurrencyCode]))

Create the same JoinKey in the Rates table, then merge on JoinKey.

Expanding Merged Columns (Control What You Bring In)

After a merge, you get a nested table column. Expanding it incorrectly can clutter your model or introduce confusing duplicates.

  • Expand only what you need: bring in just the attributes required for reporting and calculations.
  • Avoid bringing in duplicate key columns: you usually do not need the lookup key again after the join.
  • Rename expanded columns if they are ambiguous (e.g., Name could be CustomerName vs ProductName).

Validating Combined Outputs Against Expected Totals

Validation After Append

Appending should increase row count and totals in a predictable way. Validate by comparing:

  • Row counts per month: does each month contribute the expected number of rows?
  • Sum of key measures (e.g., Quantity, SalesAmount): does Sales_AllMonths equal the sum of the individual monthly totals?

A practical check is to keep a small reconciliation table (even a simple reference note) with expected monthly totals and compare them after refresh.

Validation After Merge

Merging for enrichment should not change the number of rows in your fact table when using a Left Outer join to a unique-key lookup. Validate:

  • Row count unchanged (Sales rows before merge vs after merge).
  • Totals unchanged (e.g., total SalesAmount should match exactly before and after merge).
  • Match rate: count how many rows have nulls in the expanded lookup fields; investigate unexpected nulls.

Targeted Diagnostics with Anti Joins

Use anti joins to isolate problems quickly:

  • Sales to Customers (Left Anti): sales rows with CustomerID not found in Customers.
  • Customers to Sales (Left Anti): customers with no sales (useful for coverage checks, not necessarily an error).

Once you identify mismatches, fix the cause (key cleanliness, formatting, missing reference records, or incorrect join columns) and re-validate totals.

Now answer the exercise about the content:

You need to create one consolidated Sales table from separate monthly Sales queries that share the same columns, and then add Customer attributes like CustomerName. Which sequence best follows a reliable Power Query workflow?

You are right! Congratulations, now go to the next page

You missed! Try again.

Append is used to stack tables with the same structure (adds rows). After creating one consolidated Sales table, use Merge on a key like CustomerID to enrich it with customer attributes (adds columns).

Next chapter

Power BI Essentials: Designing a Maintainable Data Model

Arrow Right Icon
Free Ebook cover Power BI Essentials: Data Import, Cleaning, and Modeling with Confidence
56%

Power BI Essentials: Data Import, Cleaning, and Modeling with Confidence

New course

9 pages

Download the app to earn free Certification and listen to the courses in the background, even with the screen off.