All courses > Technology and Programming > Databases ::

Handling Missing Matches: Defaults, COALESCE, and Filtering Without Accidental Inner Joins

Capítulo 10

Estimated reading time: 12 minutes

Why “missing matches” matter after you already chose the right join

Once you use an outer join (most often a LEFT JOIN), you have explicitly decided to keep rows even when the other side has no match. The next challenge is what to do with the columns that become NULL on the non-matching side. If you ignore those NULLs, you can end up with confusing reports (blank labels, missing totals) or, worse, accidentally undo the outer join by filtering in the wrong place.

This chapter focuses on three practical skills that come up constantly in real SQL work:

Providing defaults for missing matches (display defaults and calculation defaults).
Using COALESCE (and related expressions) correctly and intentionally.
Filtering outer-joined queries without turning them into accidental inner joins.

Recognizing “NULL because missing match” vs “NULL because the data is NULL”

In an outer join result, NULLs can appear for two different reasons:

NULL from the base table: the column is actually NULL in the stored row.
NULL from the join: the row on the other side didn’t exist, so every selected column from that table is NULL in the result.

These two cases can look identical in the output but often mean different things. For example, if you join customers to orders and see NULL in orders.order_id, that usually means “no order exists.” But if you see NULL in customers.email, that might mean “email not provided.”

When you apply defaults, be clear about which kind of NULL you are addressing. Defaulting an email to 'unknown' might be acceptable for display, but defaulting an order_id to 0 can be misleading because it suggests an order exists.

Continue in our app.

Listen to the audio with the screen off.
Earn a certificate upon completion.
Over 5000 courses for you to explore!

Or continue reading below...

Download the app

Defaults for display: making reports readable without changing meaning

For reporting, you often want a human-friendly placeholder when a match is missing. Typical examples:

Show 'No order' when a customer has none.
Show 'Unassigned' when an employee has no manager.
Show 0 for a missing aggregate like “total spend” (because “no rows” should behave like zero in many business contexts).

The key is to default in the SELECT list (or in a presentation layer) rather than altering join logic. This preserves the correct row set and makes the output clearer.

Example: customers with optional orders (display defaults)

SELECT  c.customer_id, c.name, COALESCE(o.status, 'No order') AS order_status, o.order_id FROM customers c LEFT JOIN orders o   ON o.customer_id = c.customer_id;

Here, customers without orders still appear. For those rows, o.status is NULL because there is no matching order; COALESCE replaces it with 'No order' for readability.

Be careful: defaulting o.order_id to 0 might be confusing. It’s usually better to leave identifiers as NULL and default only descriptive fields.

COALESCE: the workhorse for defaults

COALESCE(a, b, c, ...) returns the first non-NULL value from the list. It is standard SQL and widely supported. You can use it for:

Display defaults: show a placeholder when a column is NULL.
Fallback logic: choose a value from multiple sources (e.g., shipping address vs billing address).
Calculation defaults: treat NULL as 0 for arithmetic, when that matches the business meaning.

Step-by-step: using COALESCE for a fallback chain

Scenario: you want a contact email for each customer. Prefer customers.primary_email, but if it’s NULL, use customers.secondary_email, and if that’s also NULL, show 'no-email@invalid' for display.

SELECT  customer_id, COALESCE(primary_email, secondary_email, 'no-email@invalid') AS contact_email FROM customers;

Step-by-step reasoning:

If primary_email is not NULL, you get it.
Otherwise, if secondary_email is not NULL, you get it.
Otherwise, you get the placeholder.

This pattern is especially useful after joins when you want to prefer a value from one table but fall back to another.

COALESCE for numeric calculations: when NULL should behave like 0

Aggregates and arithmetic are common places where missing matches cause NULLs. For example, if you LEFT JOIN customers to payments and compute total paid, customers with no payments might produce NULL totals (depending on how you write the query). Often, you want 0 instead.

SELECT  c.customer_id, c.name, COALESCE(SUM(p.amount), 0) AS total_paid FROM customers c LEFT JOIN payments p   ON p.customer_id = c.customer_id GROUP BY c.customer_id, c.name;

Important nuance: SUM(p.amount) returns NULL when there are no matching payment rows (because there are no values to sum). Wrapping it in COALESCE(..., 0) makes “no payments” show as 0, which is usually the desired business interpretation.

Defaults in join outputs: choosing between COALESCE and CASE

COALESCE is ideal when “NULL means missing” and you simply want a fallback value. Sometimes you need more explicit logic, such as distinguishing between “missing match” and “matched row with NULL in a column.” In those cases, a CASE expression can be clearer.

Example: distinguish missing match from a real NULL value

Suppose orders.status can be NULL even for real orders (maybe status not set yet). You want:

'No order' when there is no order row.
'Pending' when an order exists but status is NULL.
Otherwise, the actual status.

SELECT  c.customer_id, c.name, CASE    WHEN o.order_id IS NULL THEN 'No order'    WHEN o.status IS NULL THEN 'Pending'    ELSE o.status  END AS order_status FROM customers c LEFT JOIN orders o   ON o.customer_id = c.customer_id;

This uses o.order_id IS NULL as the “missing match” indicator, because if there is no joined row, all o.* columns are NULL. Using a key column (like order_id) is typically the most reliable way to detect missing matches.

Filtering without accidental inner joins: the most common outer join bug

A frequent mistake is to write a LEFT JOIN and then add a WHERE filter on a column from the right table. This can eliminate the NULL-extended rows and effectively turn the query into an inner join, even though the SQL still says LEFT JOIN.

Conceptually, if you filter on o.status = 'PAID' in the WHERE clause, rows where o.status is NULL (i.e., customers with no order) will not satisfy the condition, so they are removed. That defeats the purpose of the outer join.

Example of the accidental inner join

SELECT  c.customer_id, c.name, o.order_id, o.status FROM customers c LEFT JOIN orders o   ON o.customer_id = c.customer_id WHERE o.status = 'PAID';

Despite the LEFT JOIN, this returns only customers who have a matching order with status PAID. Customers with no orders are removed because o.status is NULL and fails the WHERE predicate.

Correct patterns for filtering while preserving unmatched rows

There are several correct patterns. Which one you choose depends on the business question you’re answering.

Pattern 1: Put the filter in the ON clause (keep unmatched rows)

If you want to keep all customers, but only attach orders that meet a condition, move the condition into the join predicate.

SELECT  c.customer_id, c.name, o.order_id, o.status FROM customers c LEFT JOIN orders o   ON o.customer_id = c.customer_id  AND o.status = 'PAID';

Now the meaning is: “Return all customers; join only their PAID orders if they exist.” Customers without PAID orders still appear, with NULLs in o.*.

Step-by-step reasoning:

Start with all customers.
Try to match orders by customer_id.
Additionally require the order to be PAID to count as a match.
If no PAID order exists, the customer row remains with NULLs for order columns.

Pattern 2: Keep the WHERE filter but explicitly allow NULLs (keep unmatched rows)

Sometimes you want to filter on a right-table column but still keep unmatched rows. You can do that by allowing the NULL-extended rows through.

SELECT  c.customer_id, c.name, o.order_id, o.status FROM customers c LEFT JOIN orders o   ON o.customer_id = c.customer_id WHERE o.status = 'PAID' OR o.order_id IS NULL;

This keeps customers with PAID orders and customers with no orders. However, it can be less readable and easier to get wrong, especially when multiple conditions are involved. Prefer the ON-clause pattern when the intent is “filter which rows match,” not “filter which left rows appear.”

Pattern 3: Filter after aggregation (keep left rows, compute conditional metrics)

Often you don’t actually need to filter rows; you need conditional totals or counts. Instead of filtering the joined rows away, compute conditional aggregates.

Example: count paid orders per customer, but still show customers with zero paid orders.

SELECT  c.customer_id, c.name, COALESCE(SUM(CASE WHEN o.status = 'PAID' THEN 1 ELSE 0 END), 0) AS paid_order_count FROM customers c LEFT JOIN orders o   ON o.customer_id = c.customer_id GROUP BY c.customer_id, c.name;

This approach avoids accidental inner joins entirely because you are not filtering out rows; you are counting selectively.

Filtering for “missing matches” intentionally (anti-join style)

Another common task is to find rows that do not have a match (e.g., customers with no orders, products never sold, employees with no manager record). With a LEFT JOIN, you can filter for missing matches by checking a key column from the right table for NULL.

Example: customers with no orders

SELECT  c.customer_id, c.name FROM customers c LEFT JOIN orders o   ON o.customer_id = c.customer_id WHERE o.order_id IS NULL;

This is not an accidental inner join; it’s the opposite. You are intentionally selecting only the unmatched rows.

Step-by-step:

Start with all customers.
Attempt to match orders.
Keep only those where the order key is NULL, meaning no match was found.

Tip: use a non-nullable key column from the right table (like order_id) for the NULL check. Checking a nullable attribute (like o.status) can misclassify rows where an order exists but the attribute is NULL.

Defaults and filtering together: common real-world scenarios

Scenario A: show all customers, label segment, and filter to a region

Imagine customers optionally match a segments table (some customers not segmented yet). You want to show all customers in region 'West' and display 'Unsegmented' when there is no segment match.

SELECT  c.customer_id, c.name, c.region, COALESCE(s.segment_name, 'Unsegmented') AS segment_name FROM customers c LEFT JOIN segments s   ON s.segment_id = c.segment_id WHERE c.region = 'West';

Filtering on the left table (c.region) is safe: it does not remove unmatched segment rows because the customer row is still present.

Scenario B: show all customers, attach only “active” subscriptions, and compute a boolean flag

Suppose customers may have subscriptions, and you only care about active ones. You want all customers, plus a flag indicating whether they currently have an active subscription.

SELECT  c.customer_id, c.name, CASE WHEN s.subscription_id IS NULL THEN 0 ELSE 1 END AS has_active_subscription FROM customers c LEFT JOIN subscriptions s   ON s.customer_id = c.customer_id  AND s.status = 'ACTIVE';

Here the filter belongs in the ON clause because it determines what counts as a match. The CASE then turns “missing match” into a 0/1 indicator.

Scenario C: show all products, but only include products whose latest price is above a threshold (without losing products with no price)

Sometimes the business rule is subtle: you might want to keep products even if they have no price record yet, but if they do have a price, you only want those above a threshold. That’s a classic case where a naive WHERE filter breaks the outer join.

SELECT  p.product_id, p.name, pr.price FROM products p LEFT JOIN product_prices pr   ON pr.product_id = p.product_id  AND pr.is_current = 1 WHERE pr.price >= 100 OR pr.product_id IS NULL;

This keeps products with current price ≥ 100 and products with no current price row. If you instead put WHERE pr.price >= 100 alone, you would drop products with no price row.

Depending on your SQL dialect and data model, you might prefer moving the threshold into the ON clause and then filtering on the left side only, but note the difference in meaning: putting pr.price >= 100 in ON means “match only expensive prices,” which will show cheap-priced products as if they had no price (NULL). That may or may not be acceptable. The OR pattern above distinguishes “cheap price exists” from “no price exists” by keeping only expensive or missing.

COALESCE in predicates: powerful but easy to misuse

You can use COALESCE inside a WHERE clause, but be careful: it can change the meaning of NULLs and can hide the difference between “missing match” and “real NULL.”

Example: treating missing status as a specific value

SELECT  c.customer_id, c.name, o.status FROM customers c LEFT JOIN orders o   ON o.customer_id = c.customer_id WHERE COALESCE(o.status, 'NO_ORDER') IN ('PAID', 'NO_ORDER');

This keeps customers with PAID orders and customers with no orders. It reads nicely, but it also treats orders with NULL status as 'NO_ORDER', which may be wrong. If NULL status means “unknown yet,” you just mixed it with “no order exists.” In that case, prefer explicit checks like o.order_id IS NULL for missing matches.

Practical checklist: avoid surprises with missing matches

Default in SELECT, not in join logic, when your goal is readability (labels like 'Unassigned', 'No order').
Use COALESCE for simple fallback; use CASE when you must distinguish different kinds of NULL.
When filtering on right-table columns in an outer join, decide whether the filter defines what counts as a match (put it in ON) or whether it defines which left rows should be included (put it in WHERE, but then handle NULL-extended rows intentionally).
To find missing matches, filter with WHERE right_table.key IS NULL using a non-nullable key column.
For metrics, prefer conditional aggregation over filtering away rows, so you can keep the full left-side population and compute zeros correctly.

Now answer the exercise about the content:

In a LEFT JOIN between customers and orders, you want to keep all customers but only attach orders with status PAID. Which approach best preserves customers who have no PAID orders?

You are right! Congratulations, now go to the next page

You missed! Try again.

Placing the filter in the ON clause keeps all customers and only matches PAID orders. A WHERE filter on the right table removes NULL-extended rows and can turn the result into an accidental inner join.

Next chapter

Avoiding Duplicate Explosions: One-to-Many Traps, Pre-Aggregation, and De-Duping

83%

SQL Joins Demystified: Inner, Left, Right, Full, and Self Joins

New course

12 pages