Free Ebook cover SQL Joins Demystified: Inner, Left, Right, Full, and Self Joins

SQL Joins Demystified: Inner, Left, Right, Full, and Self Joins

New course

12 pages

Inner Join: Keeping Only Confirmed Matches

Capítulo 2

Estimated reading time: 12 minutes

+ Exercise

What an INNER JOIN Does (and What It Refuses to Do)

An INNER JOIN returns only the rows where the join condition finds a match on both sides. If a row from either table cannot satisfy the join condition, it is excluded from the result set. This makes INNER JOIN the “confirmed matches only” join: it is strict, selective, and ideal when you want results that are valid only when related data exists in both tables.

In practical terms, INNER JOIN is the right tool when your question implicitly says “only where we can prove the relationship.” Examples include: orders that have a customer, payments that belong to an invoice, enrollments that correspond to an existing student and course, or shipments that correspond to an existing order.

Because INNER JOIN filters out non-matching rows, it is also a common source of “missing data surprises.” If your result set seems smaller than expected, INNER JOIN is often the reason: it is not losing rows randomly; it is enforcing the requirement that the relationship must be present.

Core Syntax and Reading the Query

The canonical form looks like this:

SELECT  columns
FROM    table_a AS a
INNER JOIN table_b AS b
        ON a.key = b.key;

Many SQL dialects allow you to omit the word INNER and write just JOIN. Functionally, JOIN is typically interpreted as INNER JOIN:

Continue in our app.

You can listen to the audiobook with the screen off, receive a free certificate for this course, and also have access to 5,000 other free online courses.

Or continue reading below...
Download App

Download the app

SELECT  columns
FROM    table_a a
JOIN    table_b b
        ON a.key = b.key;

To read an INNER JOIN in plain language: “Start with rows from table_a, match them to rows in table_b where the ON condition is true, and keep only the rows where a match exists.”

A Practical Dataset: Customers and Orders

Consider two tables:

  • customers(customer_id, full_name, status)
  • orders(order_id, customer_id, order_date, total_amount)

Suppose your business question is: “Show me orders with the customer’s name.” This requires a confirmed relationship between an order and a customer. If an order references a customer_id that doesn’t exist (bad data, deleted customer, import issue), you typically do not want that order in a report that includes customer details—because you cannot reliably display the customer’s name.

An INNER JOIN expresses that requirement directly:

SELECT  o.order_id,
        o.order_date,
        o.total_amount,
        c.full_name
FROM    orders o
INNER JOIN customers c
        ON c.customer_id = o.customer_id
ORDER BY o.order_date;

This returns only orders that have a matching customer row. Any order with a missing or invalid customer_id is excluded.

Step-by-step: Building the Query Safely

When you write joins in real work, it helps to build them incrementally so you can validate assumptions.

Step 1: Start with the base table. If your report is “orders with customer name,” start from orders:

SELECT  o.order_id, o.customer_id, o.order_date, o.total_amount
FROM    orders o
ORDER BY o.order_date;

Check row counts and scan for suspicious values (like NULL customer_id or unexpected ranges).

Step 2: Add the join with only the key columns. This helps you confirm the join condition is correct:

SELECT  o.order_id,
        o.customer_id,
        c.customer_id AS matched_customer_id
FROM    orders o
INNER JOIN customers c
        ON c.customer_id = o.customer_id;

If matched_customer_id always equals o.customer_id, your join key is aligned. If you see fewer rows than expected, that’s a signal that some orders do not have a matching customer.

Step 3: Add the descriptive columns you actually need.

SELECT  o.order_id,
        o.order_date,
        o.total_amount,
        c.full_name
FROM    orders o
INNER JOIN customers c
        ON c.customer_id = o.customer_id;

Step 4: Add filters carefully. Put join logic in ON and row filters in WHERE unless you have a specific reason to do otherwise. For INNER JOIN, ON vs WHERE often yields the same result, but keeping them conceptually separated prevents confusion when you later switch to an outer join.

SELECT  o.order_id,
        o.order_date,
        o.total_amount,
        c.full_name
FROM    orders o
INNER JOIN customers c
        ON c.customer_id = o.customer_id
WHERE   o.order_date >= DATE '2026-01-01'
ORDER BY o.order_date;

INNER JOIN Is a Filter: Understanding Row Loss

INNER JOIN does not “start with all rows and fill blanks.” It is closer to an intersection: only rows that satisfy the join condition survive. This is powerful, but it can also hide data quality issues.

To quantify what you are losing, you can compare counts:

-- Total orders
SELECT COUNT(*) AS total_orders
FROM orders;

-- Orders that have a matching customer
SELECT COUNT(*) AS matched_orders
FROM orders o
INNER JOIN customers c
        ON c.customer_id = o.customer_id;

If matched_orders is smaller, you have orders that do not match any customer. That might be acceptable (depending on your business rules) or it might indicate a data integrity problem you should fix.

Joining on Multiple Columns (Composite Keys)

Sometimes a relationship is identified by more than one column. For example, consider a multi-tenant system where IDs are only unique within a company:

  • customers(company_id, customer_id, full_name)
  • orders(company_id, order_id, customer_id, total_amount)

Here, joining only on customer_id could incorrectly match customers across companies. The INNER JOIN must confirm both parts of the relationship:

SELECT  o.order_id,
        o.total_amount,
        c.full_name
FROM    orders o
INNER JOIN customers c
        ON  c.company_id = o.company_id
        AND c.customer_id = o.customer_id;

This ensures matches are “confirmed” in the correct tenant scope.

INNER JOIN and Duplicates: When Matches Multiply

INNER JOIN can increase the number of rows if the join condition matches multiple rows on either side. This is not a bug; it is the logical result of having multiple matches.

Example: A customer can have many orders. If you join customers to orders, each customer row will appear once per order:

SELECT  c.customer_id,
        c.full_name,
        o.order_id,
        o.total_amount
FROM    customers c
INNER JOIN orders o
        ON o.customer_id = c.customer_id
ORDER BY c.customer_id, o.order_id;

If you expected one row per customer, you must either aggregate or choose a single order per customer (for example, the latest order). Aggregation example:

SELECT  c.customer_id,
        c.full_name,
        COUNT(o.order_id) AS order_count,
        SUM(o.total_amount) AS lifetime_value
FROM    customers c
INNER JOIN orders o
        ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.full_name
ORDER BY lifetime_value DESC;

This still uses INNER JOIN, but now the output is one row per customer with confirmed orders. Customers with zero orders are excluded because the join requires a match.

INNER JOIN with Additional Conditions in the ON Clause

The ON clause can include more than just key equality. You can add conditions that define what counts as a “confirmed match.” For example, suppose you only want to match customers who are active:

SELECT  o.order_id,
        o.order_date,
        c.full_name
FROM    orders o
INNER JOIN customers c
        ON  c.customer_id = o.customer_id
        AND c.status = 'ACTIVE';

This query excludes orders whose customers are not active, because those rows fail the join condition. Notice the subtle difference between putting c.status = 'ACTIVE' in ON versus WHERE:

  • With INNER JOIN, both approaches usually return the same rows.
  • However, placing match-defining rules in ON can make your intent clearer: “only join to active customers.”

That clarity becomes especially important if you later change the join type (for example, to LEFT JOIN) and want to preserve unmatched rows.

Three-Table INNER JOIN: Confirmed Matches Across a Chain

INNER JOIN becomes even more selective as you chain joins. Each additional INNER JOIN requires another confirmed match, further narrowing the result set.

Consider:

  • orders(order_id, customer_id)
  • order_items(order_id, product_id, quantity, unit_price)
  • products(product_id, product_name, category)

Question: “List line items with product names for confirmed orders.”

SELECT  o.order_id,
        p.product_name,
        oi.quantity,
        oi.unit_price,
        (oi.quantity * oi.unit_price) AS line_total
FROM    orders o
INNER JOIN order_items oi
        ON oi.order_id = o.order_id
INNER JOIN products p
        ON p.product_id = oi.product_id
ORDER BY o.order_id;

What gets excluded?

  • Orders with no items (no match in order_items).
  • Order items referencing missing products (no match in products).

This is often exactly what you want for a “confirmed catalog” report, but it can hide orphaned order items or incomplete orders. If you are auditing data quality, you might intentionally use INNER JOIN to isolate only valid chains, then separately search for broken links.

Step-by-step: Debugging a Multi-Join Query

When a three-table INNER JOIN returns fewer rows than expected, isolate where rows are being filtered out:

Step 1: Join the first pair and count.

SELECT COUNT(*) AS order_item_rows
FROM orders o
INNER JOIN order_items oi
        ON oi.order_id = o.order_id;

Step 2: Join in the third table and count again.

SELECT COUNT(*) AS order_item_with_product_rows
FROM orders o
INNER JOIN order_items oi
        ON oi.order_id = o.order_id
INNER JOIN products p
        ON p.product_id = oi.product_id;

If the second count drops, the missing matches are between order_items and products (likely invalid product_id values).

INNER JOIN vs Filtering with EXISTS

Sometimes you want to keep rows from one table only if a match exists in another table, but you do not need any columns from the other table. INNER JOIN can do that, but it may create duplicates if multiple matches exist.

Example: “Find customers who have at least one order.” If you write:

SELECT c.customer_id, c.full_name
FROM customers c
INNER JOIN orders o
        ON o.customer_id = c.customer_id;

You will get one row per order, not one row per customer. You could add DISTINCT:

SELECT DISTINCT c.customer_id, c.full_name
FROM customers c
INNER JOIN orders o
        ON o.customer_id = c.customer_id;

Or you can use EXISTS, which expresses “confirmed match exists” without multiplying rows:

SELECT c.customer_id, c.full_name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);

This chapter focuses on INNER JOIN, but it’s useful to recognize that INNER JOIN is not the only way to enforce “must have a match.” If you need columns from both tables, INNER JOIN is natural. If you only need to test existence, EXISTS can be simpler and avoids accidental duplication.

INNER JOIN and NULLs in Join Keys

Join conditions using equality (=) do not match NULL values. If orders.customer_id is NULL, then c.customer_id = o.customer_id is not true for any customer row, so that order is excluded by an INNER JOIN.

This behavior is usually desirable: a NULL key means “unknown or not assigned,” which cannot be a confirmed match. If you find many rows being excluded due to NULL keys, treat it as a data issue to resolve (or reconsider whether INNER JOIN is the correct join type for your reporting need).

Common INNER JOIN Mistakes (and How to Avoid Them)

1) Joining on the wrong columns

Accidentally joining orders.order_id to customers.customer_id might still produce rows if numeric ranges overlap, but the results will be nonsense. Defensive habits help:

  • Use clear table aliases (o, c) and qualify columns.
  • Join on columns with matching meaning, not just matching data types.
  • Validate with a small sample: select both join keys side by side.
SELECT o.order_id, o.customer_id, c.customer_id
FROM orders o
INNER JOIN customers c
        ON c.customer_id = o.customer_id
FETCH FIRST 20 ROWS ONLY;

2) Accidental many-to-many joins

If you join two tables where neither side is unique on the join key, you can create a large multiplication of rows. Example: joining order_items to discounts on product_id when there are multiple discounts per product and multiple items per product. The result may explode.

Mitigations include:

  • Join to a deduplicated subquery (for example, the current discount only).
  • Add additional join conditions (date ranges, status flags) to make the match unique.
  • Aggregate one side before joining.

3) Putting relationship logic in WHERE without realizing future impact

With INNER JOIN, moving conditions between ON and WHERE often doesn’t change results, so it’s easy to develop sloppy habits. A good practice is:

  • ON: how rows match (keys and match-defining constraints).
  • WHERE: which matched rows you want to keep (filters for the final output).

This keeps queries readable and reduces surprises when you later refactor to other join types.

Practical Exercises (Write and Verify)

Exercise 1: Orders with confirmed customers

Write a query that returns order_id, order_date, total_amount, and full_name for orders placed in the last 30 days, but only when the customer exists and is active.

Checklist:

  • Start from orders.
  • INNER JOIN to customers on customer_id.
  • Require customers.status = 'ACTIVE' as part of the match or as a filter.
  • Filter by date in WHERE.

Exercise 2: Product sales lines with confirmed products

Write a query that lists each order line with product name and computed line total. Ensure that only lines with a valid product appear.

Checklist:

  • INNER JOIN order_items to products by product_id.
  • Compute quantity * unit_price as line_total.
  • Sort by order_id.

Exercise 3: Detecting how many rows INNER JOIN removes

Write two count queries: total rows in order_items, and rows in order_items that have a matching product in products. Compare the counts to estimate how many orphaned items exist.

SELECT COUNT(*) AS total_items
FROM order_items;

SELECT COUNT(*) AS matched_items
FROM order_items oi
INNER JOIN products p
        ON p.product_id = oi.product_id;

If matched_items is smaller, you have item rows that cannot be confirmed against the product catalog.

Now answer the exercise about the content:

Which result best describes what happens when you INNER JOIN orders to customers on customer_id?

You are right! Congratulations, now go to the next page

You missed! Try again.

An INNER JOIN keeps only confirmed matches where the ON condition succeeds on both sides. Rows without a match (including NULL join keys) are filtered out.

Next chapter

Left Join: Preserving the Primary Table While Filling Missing Matches

Arrow Right Icon
Download the app to earn free Certification and listen to the courses in the background, even with the screen off.