All courses > Technology and Programming > Databases ::

Full Join: Combining Matches with Both Sides’ Unmatched Rows

Capítulo 5

Estimated reading time: 12 minutes

What a FULL JOIN is and when to use it

A FULL JOIN (often written as FULL OUTER JOIN) returns all matched rows between two tables plus the unmatched rows from both sides. Where a match does not exist, the columns from the missing side are returned as NULL.

Use a FULL JOIN when you need a complete reconciliation view between two datasets: you want to see what lines up, what exists only in the first table, and what exists only in the second table—all in one result set.

Data reconciliation: compare your billing system vs. payment processor records.
Migration validation: compare old system vs. new system exports.
Inventory alignment: compare warehouse stock vs. storefront catalog.
Audit and exception reporting: find “missing on either side” cases without running multiple queries.

Core syntax and what to expect in the output

The basic shape is:

SELECT  columns_you_need
FROM    table_a a
FULL OUTER JOIN table_b b
        ON a.key = b.key;

In the result:

If a row in a matches a row in b, you get one combined row (or multiple rows if there are multiple matches; more on that later).
If a row exists only in a, the b.* columns are NULL.
If a row exists only in b, the a.* columns are NULL.

A practical implication: you will often need to use COALESCE to pick a “best available” identifier when one side is missing.

Continue in our app.

Listen to the audio with the screen off.
Earn a certificate upon completion.
Over 5000 courses for you to explore!

Or continue reading below...

Download the app

Example schema used in this chapter

Imagine you are reconciling orders recorded in your application with invoices recorded in your accounting system.

-- app_orders: what the app believes was ordered
app_orders(order_id, customer_id, order_total, order_date)

-- invoices: what accounting believes was invoiced
invoices(invoice_id, order_id, invoiced_total, invoice_date)

Not every order has an invoice yet, and sometimes invoices exist without a corresponding order record (for example, manual invoices or imported legacy items).

Practical step-by-step: building a reconciliation report with FULL JOIN

Step 1: Start with the simplest FULL JOIN

Begin by joining on the shared business key (order_id).

SELECT
  a.order_id AS app_order_id,
  b.order_id AS invoice_order_id,
  a.customer_id,
  a.order_total,
  b.invoice_id,
  b.invoiced_total
FROM app_orders a
FULL OUTER JOIN invoices b
  ON a.order_id = b.order_id;

This result will include:

Rows where a.order_id = b.order_id (matched)
Rows where a.order_id exists but b.order_id is NULL (order not invoiced)
Rows where b.order_id exists but a.order_id is NULL (invoice without order)

Step 2: Create a unified identifier with COALESCE

When you display or group results, you typically want a single column that contains the order identifier whether it came from the app or the invoice side.

SELECT
  COALESCE(a.order_id, b.order_id) AS order_id,
  a.customer_id,
  a.order_total,
  b.invoice_id,
  b.invoiced_total
FROM app_orders a
FULL OUTER JOIN invoices b
  ON a.order_id = b.order_id;

COALESCE returns the first non-NULL value, so it will pick a.order_id when present, otherwise b.order_id.

Step 3: Add a status label to classify each row

A reconciliation report is much easier to read if each row is categorized.

SELECT
  COALESCE(a.order_id, b.order_id) AS order_id,
  a.customer_id,
  a.order_total,
  b.invoice_id,
  b.invoiced_total,
  CASE
    WHEN a.order_id IS NOT NULL AND b.order_id IS NOT NULL THEN 'matched'
    WHEN a.order_id IS NOT NULL AND b.order_id IS NULL THEN 'order_only'
    WHEN a.order_id IS NULL AND b.order_id IS NOT NULL THEN 'invoice_only'
  END AS reconciliation_status
FROM app_orders a
FULL OUTER JOIN invoices b
  ON a.order_id = b.order_id;

This pattern is extremely common: FULL JOIN + CASE classification.

Step 4: Compute differences and flag discrepancies

Once matched rows are present, you can compare amounts. Be careful: if one side is missing, arithmetic with NULL yields NULL, so you may want COALESCE for numeric comparisons.

SELECT
  COALESCE(a.order_id, b.order_id) AS order_id,
  a.order_total,
  b.invoiced_total,
  (COALESCE(b.invoiced_total, 0) - COALESCE(a.order_total, 0)) AS amount_delta,
  CASE
    WHEN a.order_id IS NOT NULL AND b.order_id IS NOT NULL
         AND a.order_total = b.invoiced_total THEN 'ok'
    WHEN a.order_id IS NOT NULL AND b.order_id IS NOT NULL
         AND a.order_total <> b.invoiced_total THEN 'amount_mismatch'
    WHEN a.order_id IS NOT NULL AND b.order_id IS NULL THEN 'missing_invoice'
    WHEN a.order_id IS NULL AND b.order_id IS NOT NULL THEN 'missing_order'
  END AS issue
FROM app_orders a
FULL OUTER JOIN invoices b
  ON a.order_id = b.order_id;

Note that COALESCE(..., 0) is useful for computing a delta, but you should still interpret it carefully: a missing invoice is not the same as an invoice of 0. That is why the separate issue label is important.

Step 5: Filter for exceptions without losing the FULL JOIN meaning

A common mistake is to add a WHERE clause that unintentionally removes the unmatched rows you wanted to keep. For example, if you write WHERE b.invoice_date >= '2026-01-01', then rows where b is NULL (order-only rows) will be filtered out.

Instead, apply filters in a way that preserves the intended rows. You have a few options:

Filter each side in a subquery/CTE before the FULL JOIN.
Use WHERE conditions that explicitly allow NULLs.

Filtering each side before joining is often the clearest:

WITH a AS (
  SELECT *
  FROM app_orders
  WHERE order_date >= DATE '2026-01-01'
),
b AS (
  SELECT *
  FROM invoices
  WHERE invoice_date >= DATE '2026-01-01'
)
SELECT
  COALESCE(a.order_id, b.order_id) AS order_id,
  a.order_total,
  b.invoiced_total,
  CASE
    WHEN a.order_id IS NOT NULL AND b.order_id IS NOT NULL THEN 'matched'
    WHEN a.order_id IS NOT NULL AND b.order_id IS NULL THEN 'order_only'
    ELSE 'invoice_only'
  END AS reconciliation_status
FROM a
FULL OUTER JOIN b
  ON a.order_id = b.order_id;

This keeps the FULL JOIN semantics intact while limiting each dataset to the time window you care about.

Understanding row multiplication: FULL JOIN does not “deduplicate”

A FULL JOIN can produce more rows than either input table if the join key is not unique on one or both sides. This is not a special property of FULL JOIN; it is a property of joins in general. But it matters a lot in reconciliation work because it can create confusing results that look like duplicates.

Example: multiple invoices per order

If invoices contains multiple rows with the same order_id (partial invoices, adjustments, etc.), then one app order row will match multiple invoice rows, producing multiple joined rows.

SELECT
  COALESCE(a.order_id, b.order_id) AS order_id,
  a.order_total,
  b.invoice_id,
  b.invoiced_total
FROM app_orders a
FULL OUTER JOIN invoices b
  ON a.order_id = b.order_id;

If you intended to compare totals at the order level, you should aggregate invoices first, then FULL JOIN the aggregated result to orders.

WITH invoice_totals AS (
  SELECT
    order_id,
    SUM(invoiced_total) AS invoiced_total_sum,
    COUNT(*) AS invoice_count
  FROM invoices
  GROUP BY order_id
)
SELECT
  COALESCE(a.order_id, it.order_id) AS order_id,
  a.order_total,
  it.invoiced_total_sum,
  it.invoice_count,
  CASE
    WHEN a.order_id IS NULL THEN 'invoice_only'
    WHEN it.order_id IS NULL THEN 'order_only'
    WHEN a.order_total = it.invoiced_total_sum THEN 'matched'
    ELSE 'amount_mismatch'
  END AS status
FROM app_orders a
FULL OUTER JOIN invoice_totals it
  ON a.order_id = it.order_id;

This step-by-step approach—aggregate to the grain you want, then FULL JOIN—is a reliable way to avoid accidental row multiplication.

FULL JOIN and NULLs: matching behavior and pitfalls

NULL does not equal NULL in join conditions

In SQL, NULL = NULL is not true; it is unknown. That means if both tables have NULL in the join key, they will not match each other with a normal equality condition.

If your business logic treats NULL keys as “the same bucket,” you must handle that explicitly. Some databases support IS NOT DISTINCT FROM (PostgreSQL, for example), which treats NULLs as equal for comparison purposes.

SELECT
  COALESCE(a.order_id, b.order_id) AS order_id,
  a.customer_id,
  b.invoice_id
FROM app_orders a
FULL OUTER JOIN invoices b
  ON a.order_id IS NOT DISTINCT FROM b.order_id;

If your database does not support that operator, you can emulate it with a compound condition:

SELECT
  COALESCE(a.order_id, b.order_id) AS order_id,
  a.customer_id,
  b.invoice_id
FROM app_orders a
FULL OUTER JOIN invoices b
  ON (a.order_id = b.order_id)
  OR (a.order_id IS NULL AND b.order_id IS NULL);

Be cautious: matching NULL keys can create large “everything matches everything” effects if many rows have NULL keys. In reconciliation, it is often better to treat NULL keys as inherently unmatched and investigate why the key is missing.

Filtering after a FULL JOIN: avoid turning it into a partial join

Another common pitfall is filtering on columns from one side in the WHERE clause, which removes rows where that side is NULL. If you want to keep unmatched rows, either:

Filter in the join condition (ON) when it is logically part of matching, or
Filter each table before joining (CTEs/subqueries), or
Write WHERE conditions that explicitly include NULL cases.

For example, to keep order-only rows while filtering invoices by date, you could write:

SELECT
  COALESCE(a.order_id, b.order_id) AS order_id,
  a.order_total,
  b.invoiced_total,
  b.invoice_date
FROM app_orders a
FULL OUTER JOIN invoices b
  ON a.order_id = b.order_id
WHERE b.invoice_date >= DATE '2026-01-01'
   OR b.invoice_date IS NULL;

This preserves rows where b is missing, but it still excludes invoice-only rows with older invoice dates. If you need symmetrical filtering, filter both sides before the join.

Emulating FULL JOIN in databases that lack it

Some SQL dialects historically did not support FULL OUTER JOIN (for example, older MySQL versions). In those environments, you can emulate a FULL JOIN using a UNION of a left join and a right join (or two left joins).

Conceptually, you want:

All rows from A with matches from B (and A-only rows)
Plus all rows from B that did not match A (the B-only rows)

One common pattern is:

-- Emulated FULL OUTER JOIN
SELECT
  a.order_id,
  a.customer_id,
  a.order_total,
  b.invoice_id,
  b.invoiced_total
FROM app_orders a
LEFT JOIN invoices b
  ON a.order_id = b.order_id

UNION ALL

SELECT
  b.order_id,
  a.customer_id,
  a.order_total,
  b.invoice_id,
  b.invoiced_total
FROM invoices b
LEFT JOIN app_orders a
  ON a.order_id = b.order_id
WHERE a.order_id IS NULL;

Notes for this approach:

UNION ALL is used to avoid unnecessary deduplication work; the WHERE a.order_id IS NULL ensures you only add the B-only rows in the second half.
Make sure the selected columns align by position and type across both SELECT statements.
If your join key can be NULL and you are using the NULL-matching workaround, you must apply it consistently in both halves.

FULL JOIN for “master list” building and gap filling

FULL JOIN is also useful when you want to build a “master list” of entities that may exist in either system. For example, suppose you have:

crm_customers(customer_id, email, status)
newsletter_subscribers(email, subscribed_at)

Some customers are not subscribed; some subscribers are not customers (leads). You can FULL JOIN on email to get a combined view.

SELECT
  COALESCE(c.email, n.email) AS email,
  c.customer_id,
  c.status,
  n.subscribed_at,
  CASE
    WHEN c.email IS NOT NULL AND n.email IS NOT NULL THEN 'customer_and_subscriber'
    WHEN c.email IS NOT NULL AND n.email IS NULL THEN 'customer_only'
    ELSE 'subscriber_only'
  END AS segment
FROM crm_customers c
FULL OUTER JOIN newsletter_subscribers n
  ON c.email = n.email;

This pattern is effective for building outreach lists, identifying leads to import into CRM, or finding customers missing from a marketing channel.

Performance and indexing considerations specific to FULL JOIN usage

FULL JOIN can be more expensive than joins that preserve only one side because the database must account for unmatched rows from both tables. Practical tips:

Index the join keys on both tables (or ensure they are part of a primary/unique key where appropriate). This helps the optimizer find matches efficiently.
Reduce input size before joining using CTEs/subqueries (date windows, status filters, relevant columns). Joining fewer rows is often the biggest win.
Aggregate before joining if you only need results at a higher grain (for example, totals per order). This avoids row multiplication and reduces join work.
Select only needed columns in intermediate steps; wide rows increase memory and I/O.

If you are using FULL JOIN primarily to find mismatches, consider whether you can compute match keys first (for example, distinct keys from each side) and then join those smaller sets, pulling details only for the exceptions.

Checklist: validating that your FULL JOIN result matches your intent

Do you have a single, correct join key (or composite key) that represents the entity you are reconciling?
Is the key unique on each side at the grain you want? If not, did you aggregate first?
Are you unintentionally filtering out unmatched rows with a WHERE clause?
Did you label matched vs. unmatched rows using a CASE expression?
Did you use COALESCE to create a unified identifier for grouping and display?
Have you considered how NULL keys should behave (unmatched vs. matched)?

Now answer the exercise about the content:

In a reconciliation report built with a FULL OUTER JOIN between app_orders and invoices, what is a common reason to use COALESCE(a.order_id, b.order_id) as order_id?

You are right! Congratulations, now go to the next page

You missed! Try again.

FULL OUTER JOIN can produce rows where either a.order_id or b.order_id is NULL. COALESCE returns the first non-NULL value, giving one unified order_id for grouping and display.