Excel Power Query Playbook: Repeatable Data Prep Without VBA

Building a Library of Common Transformations and Reapplying Steps

Chapter 8

Estimated reading time: 13 minutes

Why a “transformation library” matters

In Power Query, most work is not “hard”; it is repetitive: trim and clean text, normalize headers, enforce data types, split composite fields, standardize codes, remove totals rows, and reshape the same patterns across many sources. A transformation library is a set of reusable queries and functions that encapsulate these patterns so you can apply them consistently, quickly, and with fewer mistakes.

Think of your library as a toolbox with two kinds of tools:

  • Template queries: ready-made queries you duplicate and point at a new source.
  • Functions: parameterized transformations you can call from other queries, especially when processing many tables (for example, files in a folder or tables from multiple systems).

The goal is not to create “one mega-query” that does everything. The goal is to create small, well-named building blocks that you can compose: one block for standardizing column names, one for removing non-data rows, one for enforcing types, one for adding derived columns, and so on.

What belongs in a common transformation library

Good library candidates share three traits: they are used often, they are stable (the logic doesn’t change every week), and they are easy to parameterize. Common categories include:

  • Column name normalization: consistent casing, replacing spaces/underscores, mapping synonyms (e.g., “Cust ID” → “CustomerID”).
  • Schema alignment: ensuring required columns exist, adding missing columns with nulls, ordering columns.
  • Row filtering patterns: removing blank rows, removing “Total” lines, keeping only valid records.
  • Standard calculations: deriving Year/Month keys, creating composite keys, adding flags.
  • Validation and diagnostics: returning counts, listing unexpected columns, checking duplicates.
  • Reusable joins: consistent lookups against reference tables (e.g., mapping region codes).

Some transformations are better kept local to a specific model because they depend heavily on business rules that change frequently. Your library should focus on transformations that are broadly applicable and unlikely to be controversial.

Design principles for reusable steps

1) Prefer functions for repeatable logic

When you find yourself copying the same 6–15 steps between queries, convert them into a function. Functions reduce drift: if you fix a bug in one place, every query that calls the function benefits.

2) Keep functions small and composable

A function that “cleans everything” becomes hard to reuse because it assumes too much. Instead, create functions like fxNormalizeHeaders, fxRemoveNonDataRows, fxEnforceSchema, and chain them in a wrapper query if needed.

3) Make inputs explicit

Functions should accept the minimum required inputs. Typically that’s a table plus optional settings. Use optional parameters for behavior that varies (for example, whether to drop unknown columns).

4) Return tables (or records) predictably

If a function returns a table, always return a table. If you need to return both data and diagnostics, return a record with fields like [Data=..., Issues=...]. Predictable outputs make downstream steps stable.

5) Name steps and queries for maintenance

Readable step names are not cosmetic; they are your documentation. Replace default step names like Changed Type with names like Types_Enforced or Headers_Normalized. In a library, naming is how you communicate intent.
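
For example, a cleanup query might read like this in the Applied Steps list (a minimal sketch, assuming a workbook table named RawData with OrderDate and Amount columns):

let
    Source = Excel.CurrentWorkbook(){[Name="RawData"]}[Content],
    Headers_Normalized = Table.TransformColumnNames(Source, Text.Trim),
    Types_Enforced = Table.TransformColumnTypes(Headers_Normalized, {{"OrderDate", type date}, {"Amount", type number}})
in
    Types_Enforced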

Creating a “Library” group in Power Query

In Excel’s Power Query Editor, you can organize queries into groups. Create a group named Library (or Functions) and move reusable items there. This is not just tidiness: it makes clear which queries are helpers meant to be reused, and, combined with connection-only loading, it keeps those helpers from landing on worksheets by accident.

  • Right-click in the Queries pane → New Group → name it Library.
  • Move helper queries and functions into that group.
  • For helper queries that should not load to a worksheet or the data model, set them to connection-only (for example, Close & Load To… → Only Create Connection, or clear Enable load if your version offers it).

Turning a query’s steps into a reusable function (step-by-step)

A practical way to build your library is to start with a working query, then convert it into a function once you trust the logic.

Example: Convert a “cleanup” query into a function

Assume you have a query that cleans a table: it trims text columns, standardizes column names, removes blank rows, and enforces types. You want to reuse it across multiple sources.

Step 1: Start from a query that already works

In Power Query Editor, pick a query that represents the transformation you want to reuse. Make sure it starts from a step that produces a table (not a file binary). Identify the step that you want to treat as the “input” to your function.

Step 2: Create a function from the query

One common approach is:

  • Duplicate the query (so you keep the original).
  • Rename the duplicate to something like fxCleanupStandard.
  • Open Advanced Editor.
  • Replace the source step with a parameter and wrap the query in a function signature.

Here is a simplified function pattern you can adapt. It accepts a table and returns a cleaned table:

let
    fxCleanupStandard = (InputTable as table) as table =>
        let
            Source = InputTable,
            ColumnNames = Table.ColumnNames(Source),
            NormalizedNames = List.Transform(ColumnNames, each Text.Replace(Text.Replace(Text.Trim(_), " ", ""), "_", "")),
            Renamed = Table.RenameColumns(Source, List.Zip({ColumnNames, NormalizedNames}), MissingField.Ignore),
            RemoveBlankRows = Table.SelectRows(Renamed, each List.NonNullCount(Record.FieldValues(_)) > 0),
            TrimText = Table.TransformColumns(RemoveBlankRows, List.Transform(Table.ColumnsOfType(RemoveBlankRows, {type text}), each {_, Text.Trim, type text})),
            Result = TrimText
        in
            Result
in
    fxCleanupStandard

This example intentionally keeps the logic generic. In your own library, you may prefer a more controlled rename strategy (for example, a mapping table) rather than removing spaces/underscores blindly.

Step 3: Test the function

Create a new query that references an existing table and calls the function:

let Source = Excel.CurrentWorkbook(){[Name="RawData"]}[Content], Clean = fxCleanupStandard(Source) in Clean

Validate that the output matches your original query’s output. If it doesn’t, compare step-by-step by temporarily returning intermediate steps from the function (for example, return Renamed instead of Result).
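
For example, a temporarily trimmed-down version of the function from Step 2 that ends at the Renamed step might look like this:

let
    fxCleanupStandard = (InputTable as table) as table =>
        let
            Source = InputTable,
            ColumnNames = Table.ColumnNames(Source),
            NormalizedNames = List.Transform(ColumnNames, each Text.Replace(Text.Replace(Text.Trim(_), " ", ""), "_", "")),
            Renamed = Table.RenameColumns(Source, List.Zip({ColumnNames, NormalizedNames}), MissingField.Ignore)
            // later steps removed while debugging
        in
            Renamed
in
    fxCleanupStandard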

Step 4: Make it robust with optional parameters

As soon as you reuse a function, you’ll discover variations. Instead of cloning the function, add optional parameters. Example: allow the caller to decide whether to remove blank rows.

let
    fxCleanupStandard = (InputTable as table, optional RemoveBlanks as nullable logical) as table =>
        let
            Source = InputTable,
            DoRemove = if RemoveBlanks = null then true else RemoveBlanks,
            RemoveBlankRows = if DoRemove then Table.SelectRows(Source, each List.NonNullCount(Record.FieldValues(_)) > 0) else Source,
            TrimText = Table.TransformColumns(RemoveBlankRows, List.Transform(Table.ColumnsOfType(RemoveBlankRows, {type text}), each {_, Text.Trim, type text}))
        in
            TrimText
in
    fxCleanupStandard

Optional parameters are a practical way to keep one function flexible without turning it into a complicated configuration system.

Building a schema enforcement function you can reuse

One of the most valuable library items is schema enforcement: ensuring that a table has the columns you expect, with the types you expect, and optionally dropping columns you don’t want.

Concept: define schema as a list of column/type pairs

In M, a common schema representation is a list of lists, where each inner list is {"ColumnName", Type}. This matches what Table.TransformColumnTypes expects.

Function: enforce required columns and types

This function adds missing columns (as nulls), enforces types for columns that exist, and optionally removes extra columns.

let
    fxEnforceSchema = (InputTable as table, Schema as list, optional DropExtras as nullable logical) as table =>
        let
            Source = InputTable,
            RequiredNames = List.Transform(Schema, each _{0}),
            ExistingNames = Table.ColumnNames(Source),
            Missing = List.Difference(RequiredNames, ExistingNames),
            AddedMissing = List.Accumulate(Missing, Source, (state, col) => Table.AddColumn(state, col, each null)),
            Typed = Table.TransformColumnTypes(AddedMissing, Schema, "en-US"),
            DoDrop = if DropExtras = null then false else DropExtras,
            Result = if DoDrop then Table.SelectColumns(Typed, RequiredNames, MissingField.UseNull) else Typed
        in
            Result
in
    fxEnforceSchema

How to use it

Create a schema definition query (disabled load) in your Library group:

let
    SalesSchema = {
        {"OrderID", Int64.Type},
        {"OrderDate", type date},
        {"CustomerID", type text},
        {"Amount", type number}
    }
in
    SalesSchema

Then apply it in a data query:

let Source = Excel.CurrentWorkbook(){[Name="SalesRaw"]}[Content], Clean = fxEnforceSchema(Source, SalesSchema, true) in Clean

This pattern makes your transformations resilient when columns appear in a different order, when a column is missing in a particular file, or when a new “Notes” column suddenly shows up and you want to ignore it.

Creating a “transformation pipeline” you can reapply

Once you have several small functions, you can create a pipeline function that applies them in a standard order. This gives you a single call for common use cases while still keeping the underlying components reusable.

Example: pipeline function that composes library functions

let
    fxStandardPipeline = (InputTable as table) as table =>
        let
            Step1 = fxCleanupStandard(InputTable, true),
            Step2 = fxEnforceSchema(Step1, SalesSchema, false)
        in
            Step2
in
    fxStandardPipeline

Use pipeline functions when you want a “default” transformation sequence for a domain (Sales, Inventory, Tickets) but still want the freedom to call individual functions when needed.

Reapplying steps across queries without copying everything

There are three practical ways to “reapply steps” in Power Query. Each has a different maintenance profile.

Approach A: Reference a query and continue transforming

If you have a base query that produces a clean table, you can create other queries that reference it and add additional steps. This is a good approach when multiple outputs share the same base cleaning.

  • Create a base query: Sales_Base (cleaning + schema enforcement).
  • Create a referencing query: Sales_ForPivot that references Sales_Base and adds only pivot-specific steps.
  • Create another referencing query: Sales_ForModel that references Sales_Base and adds model-specific columns.

Benefit: one place to fix base issues. Tradeoff: if you need the same base logic across workbooks, you still need a way to transport it (see later sections).
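
As a sketch of what a referencing query like Sales_ForModel can look like (assuming Sales_Base exposes an OrderDate column), it adds only its own purpose-specific steps:

let
    Source = Sales_Base,
    Keys_Added = Table.AddColumn(Source, "YearMonthKey", each Date.ToText([OrderDate], "yyyyMM"), type text)
in
    Keys_Added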

Approach B: Use functions and call them from each query

This is the most scalable approach when you have many similar sources. Each query stays short: source step + function call(s). Maintenance is centralized in the function.

Approach C: Copy/paste steps (only as a temporary bridge)

Sometimes you need a quick start. Copying steps is fine for prototyping, but it tends to create “logic forks.” If you copy steps more than once, treat it as a signal to convert to a function or a referenced base query.

Applying a library function to many tables (step-by-step)

A common scenario: you have multiple tables (or multiple queries) that all need the same transformation. Instead of repeating steps, you can store the tables in a list and transform them with a function.

Scenario: multiple named tables in a workbook

Suppose your workbook contains several Excel tables: Sales_Jan, Sales_Feb, Sales_Mar, each with the same general structure but inconsistent cleanliness.

Step 1: Get the list of tables

let
    Source = Excel.CurrentWorkbook(),
    OnlySales = Table.SelectRows(Source, each Text.StartsWith([Name], "Sales_"))
in
    OnlySales

Step 2: Apply your pipeline function to each table

Add a custom column that calls your function on the [Content] table:

let
    Source = Excel.CurrentWorkbook(),
    OnlySales = Table.SelectRows(Source, each Text.StartsWith([Name], "Sales_")),
    Transformed = Table.AddColumn(OnlySales, "Clean", each fxStandardPipeline([Content]))
in
    Transformed

Step 3: Combine the results

Now you have a column of tables. Combine them into one:

let
    Source = Excel.CurrentWorkbook(),
    OnlySales = Table.SelectRows(Source, each Text.StartsWith([Name], "Sales_")),
    Transformed = Table.AddColumn(OnlySales, "Clean", each fxStandardPipeline([Content])),
    Combined = Table.Combine(Transformed[Clean])
in
    Combined

This pattern is the heart of reapplying steps at scale: define the transformation once, then map it over a list of tables.

Storing transformation “recipes” as data (advanced but practical)

Sometimes you want a library that is configurable without editing M code. One approach is to store transformation settings in a table (for example, a mapping of old column names to new column names) and have your functions read that table.

Example: column rename mapping table

Create an Excel table named RenameMap with columns From and To. Then load it as a query and use it in a function:

let
    fxRenameByMap = (InputTable as table, MapTable as table) as table =>
        let
            Pairs = List.Transform(Table.ToRecords(MapTable), each {_[From], _[To]}),
            Renamed = Table.RenameColumns(InputTable, Pairs, MissingField.Ignore)
        in
            Renamed
in
    fxRenameByMap

Now your “library” can be updated by editing the mapping table rather than editing M. This is especially useful when business users maintain naming conventions.
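
A usage sketch, assuming RenameMap has been loaded as a connection-only query and reusing the SalesRaw table and SalesSchema from earlier examples:

let
    Source = Excel.CurrentWorkbook(){[Name="SalesRaw"]}[Content],
    Renamed = fxRenameByMap(Source, RenameMap),
    Clean = fxEnforceSchema(Renamed, SalesSchema, true)
in
    Clean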

Packaging and moving your library between workbooks

Once you invest in a library, you will want to reuse it across files. In Excel, there is no single “global Power Query library” built in, but you can still package and transport your work reliably.

Option 1: Copy queries via the Queries pane

You can copy a function query from one workbook and paste it into another workbook’s Power Query Editor. This is straightforward for small libraries. The key is to keep dependencies clear: if fxStandardPipeline depends on fxCleanupStandard and SalesSchema, copy all of them together.

Option 2: Use a “Library workbook” as a source

Create a dedicated workbook that contains only your library queries (functions, schemas, mapping tables). Then, in a new workbook, connect to that workbook and bring in the library items. The practical idea is to treat the library workbook as a maintained artifact.
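
For example, a staging query in the new workbook can pull a mapping table out of the library workbook (the file path below is hypothetical):

let
    // hypothetical path to the shared library workbook
    Source = Excel.Workbook(File.Contents("C:\PQ Library\LibraryWorkbook.xlsx"), null, true),
    RenameMap = Source{[Item="RenameMap", Kind="Table"]}[Data]
in
    RenameMap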

Important: when you import queries from another workbook, you typically bring in the resulting tables, not the query definitions. To reuse logic, you often still need to copy the M code (functions) or maintain a shared template workbook that you duplicate when starting new projects.

Option 3: Maintain a “starter template” workbook

For many teams, the most reliable approach is a template Excel file that already contains the Library group and a few example queries wired to nothing (or to sample data). When starting a new model, you copy the template and then connect to the real data sources.

Documenting your library inside Power Query

A library is only reusable if people can understand it. Power Query gives you a few lightweight ways to document without external tools:

  • Query descriptions: right-click a query → Properties → add a description explaining inputs/outputs and assumptions.
  • Step names: use consistent prefixes like Headers_, Rows_, Types_, Keys_.
  • Function parameter names: prefer InputTable, Schema, DropExtras over vague names.

If you want richer documentation, you can also create a “ReadMe” query that returns a small table describing each function and its purpose (disabled load). This keeps documentation inside the workbook where it’s most likely to be seen.
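
One possible shape for such a ReadMe query is a small hand-built table; the entries below are only a sketch:

let
    ReadMe = #table(
        {"Item", "Purpose"},
        {
            {"fxCleanupStandard", "Trim text, normalize headers, remove blank rows"},
            {"fxEnforceSchema", "Add missing columns, enforce types, optionally drop extras"},
            {"fxStandardPipeline", "Cleanup followed by schema enforcement for Sales tables"}
        }
    )
in
    ReadMe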

Debugging and hardening reusable transformations

Reusable steps must handle variation gracefully. When a function fails, it often fails everywhere it’s used. Build in safeguards:

Handle missing columns intentionally

Prefer MissingField.Ignore or MissingField.UseNull in rename/select operations when appropriate. Decide which columns are truly required and fail fast only for those.
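
A minimal sketch of that distinction (column names illustrative): the rename tolerates an absent legacy name, and the select adds an optional Notes column as null when it is missing:

let
    Source = Excel.CurrentWorkbook(){[Name="SalesRaw"]}[Content],
    Renamed = Table.RenameColumns(Source, {{"Cust ID", "CustomerID"}}, MissingField.Ignore),
    Selected = Table.SelectColumns(Renamed, {"OrderID", "CustomerID", "Amount", "Notes"}, MissingField.UseNull)
in
    Selected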

Return diagnostics when needed

For critical pipelines, consider returning a record with both data and issues. Example pattern:

let
    fxWithIssues = (InputTable as table) as record =>
        let
            Source = InputTable,
            Cols = Table.ColumnNames(Source),
            MissingCritical = List.Difference({"OrderID", "OrderDate"}, Cols),
            Issues = if List.Count(MissingCritical) > 0 then "Missing: " & Text.Combine(MissingCritical, ", ") else null,
            Data = Source,
            Result = [Data = Data, Issues = Issues]
        in
            Result
in
    fxWithIssues

Then downstream, you can expand [Data] for normal use and optionally surface [Issues] in a monitoring table.
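
A calling sketch, reusing the SalesRaw table from earlier examples:

let
    Source = Excel.CurrentWorkbook(){[Name="SalesRaw"]}[Content],
    Checked = fxWithIssues(Source),
    Data = Checked[Data],        // the table for normal use
    Issues = Checked[Issues]     // null, or a text description of what is missing
in
    Data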

Keep functions deterministic

A library function should avoid relying on the current workbook state in surprising ways. For example, a function that reads a named range implicitly can be hard to reuse. Prefer passing in the mapping table or schema explicitly.

Putting it together: a repeatable pattern for new projects

When you start a new Power Query project and want to maximize reuse, a practical workflow is:

  • Create or open your template workbook that already contains a Library group.
  • Connect to the new source(s) and create minimal staging queries that only reach the “table” stage.
  • Apply your library pipeline functions to produce standardized base tables.
  • Build downstream, purpose-specific queries by referencing the standardized base tables.
  • When you notice repeated downstream logic, promote it into a new function and add it to the library.

This approach keeps your transformations consistent across datasets and makes future maintenance much cheaper: most changes happen in a small number of library functions rather than across dozens of copied step sequences.

Now answer the exercise about the content:

Which approach best supports scaling the same data-cleaning logic across many similar sources while keeping maintenance centralized?

Reusable functions keep logic consistent and reduce drift because fixes happen once. Pipeline functions can compose small steps, and the same function can be applied to many tables by adding a custom column and combining results.

Next chapter

Refresh Strategies, Privacy Levels, and Performance-Safe Design
