Power BI Essentials: Data Types, Locale, and Quality Checks in Power Query

Chapter 4

Estimated reading time: 8 minutes

Why data types matter (and what breaks when they’re wrong)

In Power BI, relationships, aggregations, and DAX measures depend on columns having the correct data type. If a numeric column is stored as text, totals may concatenate or fail to sum. If dates are imported as text, time intelligence functions and date relationships won’t work reliably. If a key column’s type differs between tables (for example, one is Whole Number and the other is Text), relationships may not match and filters won’t propagate as expected.

Power Query is the best place to enforce types because it creates a repeatable transformation pipeline. The goal is to set types early, validate them, and fix issues before loading to the model.
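
For relationships specifically, the fix is to give the key the same type on both sides. A minimal sketch in M, assuming each query has a previous step named Source and a shared CustomerID key (names are illustrative):

// In the Customers query: keep the key as Text
= Table.TransformColumnTypes(Source, {{"CustomerID", type text}})

// In the Orders query: give the key the same type
= Table.TransformColumnTypes(Source, {{"CustomerID", type text}})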

Step-by-step workflow: inspect, type, validate, and fix

Step 1: Inspect columns before changing anything

Start by scanning your columns for “type smells”: numbers that look like text (left-aligned, leading apostrophes), dates in multiple formats, IDs with leading zeros, and columns that mix values like numbers plus words (for example, “10”, “N/A”, “unknown”).

  • Check a few rows at the top and bottom (errors often appear later in the file).
  • Look for columns that should be keys (CustomerID, ProductCode) and confirm whether they should be Text or Number.
  • Identify locale-sensitive columns: dates, decimals, currencies, and percentages.
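
If you also want a programmatic view of what Power Query currently thinks each column is, Table.Schema returns one row per column. A minimal sketch, assuming a previous step named Source:

// List current column names and types before changing anything
= Table.SelectColumns(Table.Schema(Source), {"Name", "TypeName"})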

Step 2: Set data types early (but in the right order)

Apply data types near the beginning of the query so downstream steps behave consistently (sorting, grouping, merging, and calculations). However, if you need to clean characters first (like removing currency symbols), do that cleaning immediately before setting the type.

Recommended order (see the sketch after this list):

  1. Rename columns (optional, but helps readability).
  2. Trim/Clean text where needed (especially for keys).
  3. Fix obvious formatting issues (currency symbols, thousand separators, stray spaces).
  4. Set data types.
  5. Validate with profiling.
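
A minimal sketch of this sequence in M, with hypothetical step and column names (cust_id, Amount):

// Hypothetical sketch of the recommended order
let
    Renamed = Table.RenameColumns(Source, {{"cust_id", "CustomerID"}}),
    Trimmed = Table.TransformColumns(Renamed, {{"CustomerID", Text.Trim, type text}}),
    Cleaned = Table.TransformColumns(Trimmed, {{"Amount", each Text.Replace(_, "$", ""), type text}}),
    Typed = Table.TransformColumnTypes(Cleaned, {{"CustomerID", type text}, {"Amount", type number}})
in
    Typed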

Common type choices:

  • Keys/IDs: usually Text (preserves leading zeros and avoids scientific notation).
  • Quantities: Whole Number if truly integer; otherwise Decimal Number.
  • Money: often Fixed Decimal Number to reduce floating rounding issues.
  • Dates: Date (or Date/Time if time is meaningful).
  • Flags: True/False when possible.

In the UI, use the data type icon in the column header. Power Query will add a Changed Type step. Keep it, but ensure it happens after any necessary cleaning.
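
A hedged sketch of how those common choices look in a single Changed Type step (column names are illustrative):

// Common type choices expressed in M
= Table.TransformColumnTypes(
    Source,
    {
        {"ProductCode", type text},   // key: Text preserves leading zeros
        {"Quantity", Int64.Type},     // Whole Number
        {"UnitPrice", Currency.Type}, // Fixed Decimal Number
        {"OrderDate", type date},
        {"IsActive", type logical}    // True/False
    }
)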

Step 3: Validate using Column Profiling

Use Power Query’s profiling tools to confirm the data matches expectations. Turn on profiling from the View tab:

  • Column quality: shows Valid / Error / Empty percentages.
  • Column distribution: shows distinct and unique counts plus a histogram for some types.
  • Column profile: shows statistics like min, max, average, and standard deviation (for numeric), and length stats (for text).

Important: by default, profiling evaluates only the top 1,000 rows. When feasible, switch the status-bar setting to "Column profiling based on entire data set" so you don't miss errors that appear later in the file.
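
If you want the same statistics as a reusable audit query rather than a pane, Table.Profile returns per-column Min, Max, Average, NullCount, and DistinctCount. A minimal sketch, assuming a query named FactSales:

// One row of profiling statistics per column
= Table.Profile(FactSales)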

Locale-specific handling: dates, decimals, and text-to-number conversions

Why locale breaks conversions

Locale affects how Power Query interprets separators and date ordering:

  • Decimals: 1,23 vs 1.23
  • Thousands: 1,234 vs 1.234
  • Dates: 01/02/2025 could be Jan 2 or Feb 1 depending on locale

If you apply a simple type change without specifying locale, Power Query uses the file/system locale, which can silently misinterpret values.

Best practice: use “Using Locale” for text-to-date and text-to-number

When a column is currently Text but represents dates or numbers, convert it using a locale-aware conversion.

  • Select the column.
  • Change Type > Using Locale…
  • Choose the target type (Date, Decimal Number, etc.).
  • Select the correct locale that matches the source formatting.

This is especially important for CSVs and exports from regional systems.

// Example M pattern: convert text to date using a specific locale (e.g., en-GB for dd/MM/yyyy)
= Table.TransformColumnTypes(
    Source,
    {{"OrderDate", type date}},
    "en-GB"
)

// Example M pattern: convert text to number using a specific locale (e.g., de-DE for comma decimals)
= Table.TransformColumnTypes(
    Source,
    {{"Amount", type number}},
    "de-DE"
)

Common pitfalls and reliable fixes

Pitfall 1: Mixed-type columns (numbers + text like “N/A”)

Symptoms:

  • Type conversion produces errors.
  • Column quality shows a non-zero Error rate.
  • Aggregations in the model behave unexpectedly.

Fix options (choose based on business meaning):

  • Replace non-numeric placeholders with null (recommended when “N/A” means missing).
  • Split into two columns: one numeric value and one status/reason (see the sketch below).
  • Filter out invalid rows if they should not be loaded.
// Replace common placeholders with null before typing
= Table.ReplaceValue(
    Source,
    "N/A",
    null,
    Replacer.ReplaceValue,
    {"Quantity"}
)
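
For the split option, a hedged sketch assuming a text column named Quantity; Number.FromText yields null for non-numeric text via try/otherwise:

// Hypothetical sketch: split into a numeric value plus a status column
let
    WithValue = Table.AddColumn(
        Source,
        "QuantityValue",
        each try Number.FromText([Quantity]) otherwise null,
        type nullable number
    ),
    WithStatus = Table.AddColumn(
        WithValue,
        "QuantityStatus",
        each if [QuantityValue] = null then [Quantity] else "OK",
        type text
    )
in
    WithStatus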

Pitfall 2: Inconsistent date formats in the same column

Example: some rows are 2025-01-16, others are 16/01/2025, others are 01/16/2025. A single locale conversion may parse some rows incorrectly or fail.

Fix approach:

  • If possible, standardize at the source export settings.
  • Otherwise, normalize in Power Query by detecting patterns and parsing accordingly.
// Practical approach: try multiple parses and coalesce (pattern depends on your data)
= Table.AddColumn(
    Source,
    "OrderDate_Parsed",
    each
        let
            t = [OrderDateText],
            d1 = try Date.FromText(t, "en-GB") otherwise null,
            d2 = try Date.FromText(t, "en-US") otherwise null,
            d3 = try Date.FromText(t) otherwise null // fallback: default culture (catches ISO yyyy-MM-dd)
        in
            if d1 <> null then d1 else if d2 <> null then d2 else d3,
    type nullable date
)

After creating a parsed date column, validate it (min/max, null rate), then remove the original text date column or keep it for auditing.

Pitfall 3: Numeric fields stored as text (and why “Changed Type” isn’t enough)

Text numbers often include formatting characters:

  • Currency symbols: $1,200.50
  • Spaces: 1 200,50
  • Parentheses for negatives: (1,200.50)
  • Percent signs: 12.5%

Fix approach:

  • Remove non-numeric characters (currency, percent, spaces).
  • Normalize negative formats (parentheses to minus).
  • Convert using locale.
// Example: clean currency text then convert using locale
let
    // Trim, turn "(x)" negatives into "-x", then strip "$" and thousand separators
    CleanText = (t as text) as text =>
        List.Accumulate(
            {{"(", "-"}, {")", ""}, {"$", ""}, {",", ""}},
            Text.Trim(t),
            (state, pair) => Text.Replace(state, pair{0}, pair{1})
        ),
    Cleaned = Table.TransformColumns(Source, {{"AmountText", CleanText, type text}}),
    Typed = Table.TransformColumnTypes(Cleaned, {{"AmountText", type number}}, "en-US")
in
    Typed

If your source uses comma decimals, remove thousand separators carefully and use the correct locale. Avoid blanket replacements that can swap meaning (for example, removing all periods when periods are decimals).
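
A hedged sketch for comma-decimal sources (e.g., "1.200,50"): strip only the period thousand separators, keep the comma as the decimal mark, and convert with a matching locale. AmountText is an illustrative column name:

// Clean a comma-decimal amount, then convert with a German locale
let
    Cleaned = Table.TransformColumns(
        Source,
        {{"AmountText", each Text.Replace(Text.Trim(_), ".", ""), type text}}
    ),
    Typed = Table.TransformColumnTypes(Cleaned, {{"AmountText", type number}}, "de-DE")
in
    Typed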

Pitfall 4: IDs accidentally typed as numbers

Symptoms:

  • Leading zeros disappear (e.g., 001234 becomes 1234).
  • Long IDs display in scientific notation.
  • Merges/relationships fail because one table kept the ID as text.

Fix:

  • Set ID columns to Text early.
  • If leading zeros are already lost, you may need to re-import or reconstruct using known width (only if the business rules guarantee it).
// Pad an ID to 6 characters (only if you are certain this is correct)
= Table.TransformColumns(
    Source,
    {{"CustomerID", each Text.PadStart(Text.From(_), 6, "0"), type text}}
)

Repeatable quality checks before loading to the model

1) Row count checks

Confirm the number of rows matches expectations (source report totals, file metadata, or prior refresh). Sudden drops often indicate filtering, parsing failures, or join issues.

  • Use profiling to spot unexpected Empty/Error spikes.
  • Optionally create a quick reference query that returns row counts per table for monitoring (sketched below).
// Row count as a single value (can be used in a separate audit query)
= Table.RowCount(YourTable)
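
For the per-table monitoring query suggested above, a hypothetical sketch that collects counts into one small table (query names are illustrative):

// Hypothetical audit query: row counts per table
= #table(
    type table [Table = text, Rows = Int64.Type],
    {
        {"DimCustomer", Table.RowCount(DimCustomer)},
        {"FactSales", Table.RowCount(FactSales)}
    }
)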

2) Distinct and unique counts for keys

For dimension tables, primary keys should typically be unique and non-null. Use Column distribution to check:

  • Distinct count: how many different values exist.
  • Unique count: how many values appear only once.

If Unique count is less than row count for a supposed key, you have duplicates that can break one-to-many relationships.
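
A quick M version of that check, assuming a dimension query named DimCustomer keyed on CustomerID; the two numbers should match for a true key:

// Compare distinct key count with row count
= [
    Rows = Table.RowCount(DimCustomer),
    DistinctKeys = List.Count(List.Distinct(Table.Column(DimCustomer, "CustomerID")))
]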

// Find duplicate keys
let
    Grouped = Table.Group(
        DimCustomer,
        {"CustomerID"},
        {{"Rows", each Table.RowCount(_), Int64.Type}}
    ),
    Duplicates = Table.SelectRows(Grouped, each [Rows] > 1)
in
    Duplicates

3) Null rate checks on critical columns

Track null/empty rates for columns that must be present (keys, dates, amounts). A rising null rate often indicates upstream changes or parsing problems.

// Null count for a column
= List.Count(List.Select(Table.Column(YourTable, "OrderDate"), each _ = null))
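
To track the rate rather than the raw count, divide by the row count. A minimal sketch using the same illustrative names:

// Null rate (fraction of rows) for a critical column
let
    Rows = Table.RowCount(YourTable),
    Nulls = List.Count(List.Select(Table.Column(YourTable, "OrderDate"), each _ = null))
in
    if Rows = 0 then null else Nulls / Rows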

4) Sanity checks on min/max (and other quick stats)

Before loading, validate that numeric and date ranges make sense:

  • Dates: min/max should fall within the expected business period.
  • Amounts: min/max should not include impossible values (e.g., extremely large numbers caused by separator misreads).
  • Quantities: check for negative values if they’re not allowed (see the sketch after the min/max examples).
// Min/Max checks for a numeric column
let
    Values = List.RemoveNulls(Table.Column(FactSales, "Amount")),
    MinVal = List.Min(Values),
    MaxVal = List.Max(Values)
in
    [Min = MinVal, Max = MaxVal]

// Min/Max checks for a date column
let
    Values = List.RemoveNulls(Table.Column(FactSales, "OrderDate")),
    MinDate = List.Min(Values),
    MaxDate = List.Max(Values)
in
    [MinDate = MinDate, MaxDate = MaxDate]
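
For the negative-quantity check in the list above, a sketch assuming a hypothetical Quantity column in FactSales:

// Count negative quantities (expect 0 if negatives are not allowed)
= List.Count(
    List.Select(Table.Column(FactSales, "Quantity"), each _ <> null and _ < 0)
)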

Putting it into a consistent checklist

Use this sequence on every query before loading:

  • Confirm expected columns exist and are named correctly.
  • Set/verify data types (especially keys, dates, numeric measures).
  • Check Column quality: errors = 0 for critical columns; empties within tolerance.
  • Check key uniqueness (dimension tables) and key completeness (no null keys).
  • Validate row counts against expectations.
  • Validate distinct counts for key columns and key combinations.
  • Validate null rates for critical fields.
  • Run min/max sanity checks for dates and numeric columns to catch locale and parsing issues.

Now answer the exercise about the content:

When converting a Text column that contains dates or numbers in a regional format, what is the most reliable way to avoid silent misinterpretation in Power Query?

Answer: convert with Change Type > Using Locale… Locale affects how decimals, thousands separators, and date order are parsed, and a locale-aware conversion applies the correct regional rules, preventing silent conversion errors.

Next chapter

Power BI Essentials: Combining Data with Merge and Append
