Free Ebook cover SQL Fundamentals for Absolute Beginners (Any Database)

SQL Fundamentals for Absolute Beginners (Any Database)

New course

8 pages

Combining Tables with Simple Joins

Capítulo 6

Estimated reading time: 13 minutes

+ Exercise

Why Joins Matter

In a relational database, information is often split across multiple tables so each table can focus on one “type” of thing: customers, orders, products, employees, and so on. A join is how you recombine those related pieces at query time to answer real questions like “Which customer placed this order?” or “What products are included in each order?”

A join works by matching rows from one table to rows in another table using a relationship between columns. Most commonly, you match a foreign key in one table to a primary key in another table. The result is a single result set that contains columns from both tables, aligned row-by-row based on the match condition.

A Simple Data Model to Practice With

To make joins concrete, we’ll use a small, typical sales schema. You do not need to create these tables now; focus on how the joins work. Imagine the following tables and columns exist:

  • customers: customer_id, first_name, last_name, email
  • orders: order_id, customer_id, order_date, status
  • order_items: order_id, product_id, quantity, unit_price
  • products: product_id, product_name, category

Relationships (conceptually): orders.customer_id refers to customers.customer_id; order_items.order_id refers to orders.order_id; order_items.product_id refers to products.product_id.

Core Join Vocabulary

Left table and right table

In a join, you typically start with one table in the FROM clause (often called the “left” table) and then add another table with JOIN (the “right” table). The terms “left” and “right” matter most when you use outer joins (like LEFT JOIN), but they’re useful labels in general.

Continue in our app.

You can listen to the audiobook with the screen off, receive a free certificate for this course, and also have access to 5,000 other free online courses.

Or continue reading below...
Download App

Download the app

Join condition

The join condition is the rule that says which rows match. It’s written after ON, usually as an equality between key columns, for example orders.customer_id = customers.customer_id.

Result set

The output of a join is not a new stored table (unless you explicitly create one). It’s a query result that can be used directly, filtered further, aggregated, or joined again.

INNER JOIN: Keep Only Matching Rows

INNER JOIN returns only rows where the join condition finds a match in both tables. If an order has a customer_id that does not match any customer row (which ideally shouldn’t happen if constraints are enforced), that order would not appear in the result.

Example: Show each order with the customer name

SELECT o.order_id, o.order_date, c.first_name, c.last_name
FROM orders AS o
INNER JOIN customers AS c
  ON o.customer_id = c.customer_id;

What happens step-by-step:

  • The database reads rows from orders (aliased as o).
  • For each order row, it looks for a row in customers (aliased as c) where o.customer_id = c.customer_id.
  • If a match is found, it outputs a combined row containing the selected columns from both tables.
  • If no match is found, that order row is excluded (because it’s an inner join).

Notice the use of table aliases (o, c). Aliases make queries shorter and clearer, especially when multiple tables have similarly named columns.

Common pitfall: ambiguous column names

If both tables have a column with the same name (for example, customer_id exists in both), you must qualify it with the table name or alias in the SELECT list and conditions. This avoids ambiguity and makes your intent explicit.

LEFT JOIN: Keep All Rows From the Left Table

LEFT JOIN (also called LEFT OUTER JOIN) returns all rows from the left table, and matching rows from the right table. If there is no match, the right table’s columns appear as NULL in the result.

Example: List all customers and any orders they have

SELECT c.customer_id, c.first_name, c.last_name, o.order_id, o.order_date
FROM customers AS c
LEFT JOIN orders AS o
  ON o.customer_id = c.customer_id;

Step-by-step:

  • Start with every row in customers.
  • Try to find matching rows in orders based on the ON condition.
  • If matches exist, output one row per match (a customer with multiple orders will appear multiple times).
  • If no match exists, still output the customer row, but with NULL for o.order_id and o.order_date.

This is the join you use when you want to include “things with zero related things,” such as customers with no orders, products never ordered, or employees with no assigned tasks.

Important: filtering a LEFT JOIN without accidentally turning it into an INNER JOIN

A frequent beginner mistake is to use a WHERE condition on a right-table column after a LEFT JOIN, which can remove the NULL rows and effectively behave like an inner join.

For example, if you do this:

SELECT c.customer_id, c.first_name, o.order_id, o.status
FROM customers AS c
LEFT JOIN orders AS o
  ON o.customer_id = c.customer_id
WHERE o.status = 'SHIPPED';

Customers with no orders have o.status as NULL, so they fail the WHERE condition and disappear. If your goal is “all customers, and show shipped orders when they exist,” move the filter into the ON clause:

SELECT c.customer_id, c.first_name, o.order_id, o.status
FROM customers AS c
LEFT JOIN orders AS o
  ON o.customer_id = c.customer_id
 AND o.status = 'SHIPPED';

Now the join tries to match only shipped orders, but customers still remain even if they have none.

Joining More Than Two Tables

Real queries often require chaining joins. You can join multiple tables by adding more JOIN ... ON ... clauses. Each join uses the result of the previous step as its “left side.”

Example: Show order lines with product names

Order items typically store product_id and pricing details, but not the product name. To show readable output, join order_items to products.

SELECT oi.order_id, p.product_name, oi.quantity, oi.unit_price
FROM order_items AS oi
INNER JOIN products AS p
  ON oi.product_id = p.product_id;

This returns one row per order line, with the product name attached.

Example: Show customer name, order id, and product name

This requires three joins: customers → orders → order_items → products (four tables total). The key is to join one relationship at a time.

SELECT c.first_name, c.last_name, o.order_id, p.product_name, oi.quantity
FROM customers AS c
INNER JOIN orders AS o
  ON o.customer_id = c.customer_id
INNER JOIN order_items AS oi
  ON oi.order_id = o.order_id
INNER JOIN products AS p
  ON p.product_id = oi.product_id;

Step-by-step thinking:

  • Start with customers and orders: this creates customer-order pairs.
  • Join order_items to attach each order’s line items.
  • Join products to translate product_id into product_name (and other product attributes).

Because order_items is a “many” side (an order can have many items), the result will have multiple rows per order—one per item.

Understanding Row Multiplication (Why Joins Can Create Many Rows)

Joins can increase the number of rows because of one-to-many relationships. If one customer has 3 orders, an inner join between customers and orders will produce 3 rows for that customer (one per order). If each of those orders has 4 items, joining in order_items can produce 12 rows for that customer across those orders.

This is not an error; it’s how relational data expands when you ask for detailed, line-level information. The key is to be aware of the “grain” (level of detail) of your query: are you returning one row per customer, per order, or per order item?

Choosing the Correct Join Condition

The join condition is the most important part of a join. A correct condition matches the intended relationship; an incorrect condition can silently produce wrong results.

Use the relationship columns

Typically, you join foreign key to primary key:

ON o.customer_id = c.customer_id

Avoid joining on names, emails, or other descriptive fields unless you have a strong reason and you know they are unique and stable. Descriptive fields can have duplicates, formatting differences, or changes over time.

Join on all necessary columns for composite relationships

Sometimes a relationship requires more than one column to uniquely identify the match (for example, if a table uses a composite key). In that case, your ON clause must include all parts of the relationship. Conceptually:

... ON t1.key_part1 = t2.key_part1
AND t1.key_part2 = t2.key_part2

If you omit a key part, you may match too many rows and inflate results.

Cross Joins and the Danger of Missing Join Conditions

If you join tables without a proper ON condition, you can accidentally create a Cartesian product: every row from the first table paired with every row from the second table. This can produce an enormous number of rows and misleading results.

Some databases allow an explicit CROSS JOIN to intentionally create all combinations (useful for generating calendars, test data, or combinations of options). But for typical business queries, a Cartesian product is usually a mistake.

Example: Explicit CROSS JOIN (all combinations)

SELECT c.customer_id, p.product_id
FROM customers AS c
CROSS JOIN products AS p;

This returns every customer-product pair. It is rarely what you want for sales reporting, but it illustrates what happens when there is no matching condition.

Self Joins: Joining a Table to Itself

A self join is when you join a table to itself using different aliases. This is useful when a table contains a relationship to another row in the same table, such as an employee table with a manager_id pointing to another employee.

Imagine a table:

  • employees: employee_id, first_name, last_name, manager_id

Example: Show employees and their managers

SELECT e.first_name AS employee_first_name,
       e.last_name  AS employee_last_name,
       m.first_name AS manager_first_name,
       m.last_name  AS manager_last_name
FROM employees AS e
LEFT JOIN employees AS m
  ON e.manager_id = m.employee_id;

Why LEFT JOIN? Top-level managers may have no manager_id (or it may be NULL). A left join keeps those employees in the output and shows NULL manager fields.

Practical Step-by-Step: Building a Join Query Safely

When you’re new to joins, it’s easy to get lost. A reliable approach is to build the query in small, testable steps.

Step 1: Start with the “main” table

Decide what each row in your final output represents. Suppose you want one row per order item (line). The main table is order_items.

SELECT oi.order_id, oi.product_id, oi.quantity, oi.unit_price
FROM order_items AS oi;

Step 2: Join the next table to add meaning

Add product names by joining to products:

SELECT oi.order_id, p.product_name, oi.quantity, oi.unit_price
FROM order_items AS oi
INNER JOIN products AS p
  ON oi.product_id = p.product_id;

Check: Does the row count seem reasonable? Do product names look correct? If you see duplicates that don’t make sense, re-check the join condition.

Step 3: Join upward to the parent entity

Now attach order information:

SELECT o.order_id, o.order_date, p.product_name, oi.quantity, oi.unit_price
FROM order_items AS oi
INNER JOIN orders AS o
  ON oi.order_id = o.order_id
INNER JOIN products AS p
  ON oi.product_id = p.product_id;

Notice we joined orders using oi.order_id = o.order_id. This is the natural relationship between order_items and orders.

Step 4: Join to customers for the final context

SELECT c.first_name, c.last_name,
       o.order_id, o.order_date,
       p.product_name, oi.quantity, oi.unit_price
FROM order_items AS oi
INNER JOIN orders AS o
  ON oi.order_id = o.order_id
INNER JOIN customers AS c
  ON o.customer_id = c.customer_id
INNER JOIN products AS p
  ON oi.product_id = p.product_id;

At each step, you’re adding columns and relationships while keeping control of the query’s “grain.”

Using Joins to Find Missing Related Data

Outer joins are especially useful for finding “missing” relationships, such as customers with no orders or products that have never been ordered.

Example: Products never ordered

Start with products (we want all products), then left join order_items. Products with no matching order_items will have NULLs on the order_items side.

SELECT p.product_id, p.product_name
FROM products AS p
LEFT JOIN order_items AS oi
  ON oi.product_id = p.product_id
WHERE oi.product_id IS NULL;

This pattern is common: LEFT JOIN plus a WHERE check for NULL on a right-table key column to identify rows with no matches.

Join Readability: Aliases and Column Naming

As joins grow, readability becomes critical.

  • Use short, meaningful aliases: customers c, orders o, order_items oi, products p.
  • Qualify columns: write o.order_date instead of order_date to avoid ambiguity and to make the source clear.
  • Rename output columns when needed: if you select two columns with the same name (like id), use AS to give them distinct names in the result.

Example: Clear naming in output

SELECT o.order_id,
       o.order_date,
       c.customer_id AS customer_id,
       c.email       AS customer_email
FROM orders AS o
INNER JOIN customers AS c
  ON o.customer_id = c.customer_id;

Where to Put Conditions: ON vs WHERE

With inner joins, conditions in ON and WHERE often produce the same final rows (because non-matching rows are removed anyway). With outer joins, the placement can change results.

  • Use ON for conditions that define which related rows should match (especially important with LEFT JOIN).
  • Use WHERE for conditions that should filter the final result set after the join is formed.

Example idea: “All customers, and only their 2026 orders if they have any.” Put the date condition in ON to keep customers with no 2026 orders.

SELECT c.customer_id, c.first_name, o.order_id, o.order_date
FROM customers AS c
LEFT JOIN orders AS o
  ON o.customer_id = c.customer_id
 AND o.order_date >= DATE '2026-01-01'
 AND o.order_date <  DATE '2027-01-01';

Note: date literal syntax varies by database. If your database doesn’t support DATE 'YYYY-MM-DD', you may need a different literal or a conversion function.

Quick Reference: Simple Join Types

  • INNER JOIN: returns only matching rows from both tables.
  • LEFT JOIN: returns all rows from the left table, plus matching rows from the right table; non-matches show NULLs on the right.
  • CROSS JOIN: returns all combinations of rows (Cartesian product).
  • Self join: joins a table to itself using aliases.

Now answer the exercise about the content:

You want to list all customers, including those with no orders, and show only orders with status SHIPPED when they exist. What is the correct way to apply the status filter?

You are right! Congratulations, now go to the next page

You missed! Try again.

With a LEFT JOIN, filtering a right-table column in WHERE removes NULL matches and acts like an inner join. Put o.status = SHIPPED in the ON clause to keep all customers while matching only shipped orders.

Next chapter

Summarizing Data with Aggregations and Grouping

Arrow Right Icon
Download the app to earn free Certification and listen to the courses in the background, even with the screen off.