What a Self Join Is (and Why You’d Use It)
A self join is a join where the left and right sides come from the same physical table, but you treat them as two different logical tables by giving each side a different alias. This is useful whenever a row needs to be related to other rows in the same table: employees to managers, products to “similar products,” events to prior events, customers to referrals, categories to parent categories, and so on.
The key idea is that the table contains a relationship that points back into itself (directly or indirectly). In practice, that relationship is usually represented by a foreign key column that references the table’s own primary key (for example, employees.manager_id referencing employees.employee_id), or by a column that can be matched to another row’s column (for example, two rows share the same department_id but have different employee_id values).
Because both sides are the same table, clarity depends on two things: (1) aliases that make each role obvious, and (2) a join condition that precisely describes how one row relates to another. Without those, self joins can quickly become confusing or produce incorrect results (like pairing a row with itself or creating duplicate pairs).
Core Pattern: Aliases Create Two Roles
When you self join, you must alias the table at least twice. Each alias represents a role in the relationship. A common pattern is:
SELECT ...columns from both roles... FROM table_name AS role1 JOIN table_name AS role2 ON ...relationship between role1 and role2...Even though role1 and role2 refer to the same underlying table, SQL treats them as separate sources for the purpose of the query. This lets you select columns from each role independently and compare them.
Continue in our app.
You can listen to the audiobook with the screen off, receive a free certificate for this course, and also have access to 5,000 other free online courses.
Or continue reading below...Download the app
Example dataset: employees
Assume a table like:
employees( employee_id INT PRIMARY KEY, full_name VARCHAR(200), title VARCHAR(100), manager_id INT NULL, department_id INT, hire_date DATE)Here, manager_id stores the employee_id of the manager. That is a classic self-referencing relationship.
Step-by-Step: Employee to Manager (Most Common Self Join)
Step 1: Decide the two roles and name them clearly
We want each employee and their manager. The two roles are:
- e = employee (the person we’re listing)
- m = manager (the person referenced by
e.manager_id)
Step 2: Write the join condition that connects the roles
The relationship is: the employee’s manager_id equals the manager’s employee_id.
e.manager_id = m.employee_idStep 3: Choose join type based on whether every employee has a manager
Top-level employees (like a CEO) may have manager_id as NULL. If you want to keep them in the result, use a left join from employees to managers. If you only want employees who have a manager, use an inner join. In many org charts, keeping top-level employees is useful, so we’ll use a left join.
Step 4: Select columns with explicit aliasing
Use column aliases to make output readable and avoid ambiguity.
SELECT e.employee_id, e.full_name AS employee_name, e.title AS employee_title, m.full_name AS manager_name, m.title AS manager_title FROM employees AS e LEFT JOIN employees AS m ON e.manager_id = m.employee_id ORDER BY e.employee_id;This query reads naturally: “from employees as e, left join employees as m on e.manager_id = m.employee_id.” The aliases make it clear which columns belong to which role.
Preventing Common Self Join Mistakes
Mistake 1: Ambiguous column references
Because both sides share the same column names, you must qualify columns with the alias. This will fail or behave unexpectedly:
SELECT employee_id, full_name FROM employees e JOIN employees m ON manager_id = employee_id;It’s ambiguous: which employee_id is on the right side of the join condition? Always qualify:
... ON e.manager_id = m.employee_idMistake 2: Joining a row to itself unintentionally
Some self joins are based on “same group” matching (same department, same category, same date). In those cases, if you join on equality of the group key, you will match each row to itself unless you explicitly exclude it.
Example: pairing employees in the same department.
SELECT e1.full_name AS employee_1, e2.full_name AS employee_2, e1.department_id FROM employees AS e1 JOIN employees AS e2 ON e1.department_id = e2.department_id;This includes pairs like (Alice, Alice). To exclude self-pairs, add:
AND e1.employee_id <> e2.employee_idMistake 3: Duplicate pairs (A,B) and (B,A)
When you create pairs from the same table, you often want each pair only once. If you join employees to employees in the same department, you’ll get both (Alice, Bob) and (Bob, Alice). To keep only one ordering, enforce a consistent rule such as “left id is less than right id.”
SELECT e1.full_name AS employee_1, e2.full_name AS employee_2, e1.department_id FROM employees AS e1 JOIN employees AS e2 ON e1.department_id = e2.department_id AND e1.employee_id < e2.employee_id ORDER BY e1.department_id, e1.full_name, e2.full_name;This does two things at once: it removes self-pairs and removes mirrored duplicates.
Self Join for Hierarchies: Parent/Child Categories
Self joins are a natural fit for hierarchical data stored in a single table. Consider:
categories( category_id INT PRIMARY KEY, category_name VARCHAR(200), parent_id INT NULL)Each category may have a parent category. To list categories with their parent names:
SELECT c.category_id, c.category_name, p.category_name AS parent_category_name FROM categories AS c LEFT JOIN categories AS p ON c.parent_id = p.category_id ORDER BY c.category_name;Again, the aliases communicate roles: c is the child category, p is the parent category.
Adding a second level (grandparent) with multiple self joins
You can chain self joins to walk up the hierarchy a fixed number of levels. For example, category → parent → grandparent:
SELECT c.category_name AS category, p.category_name AS parent, gp.category_name AS grandparent FROM categories AS c LEFT JOIN categories AS p ON c.parent_id = p.category_id LEFT JOIN categories AS gp ON p.parent_id = gp.category_id ORDER BY c.category_name;This is still a self join conceptually, but you’re using the same table three times with three aliases. This approach is practical when you only need a small, known depth. For unknown depth, you typically need a recursive query (often a recursive CTE), which is a different technique than a basic join.
Self Join for “Previous Row” Comparisons (Same Table, Different Time)
Another common use is comparing a row to another row in the same table that represents a different moment in time: previous order, previous status change, previous price, previous login, etc. This can be done with a self join when you can define “previous” using a key or date.
Example dataset: product_prices
product_prices( product_id INT, valid_from DATE, price DECIMAL(10,2), PRIMARY KEY(product_id, valid_from))We want to compare each price record to the immediately prior price record for the same product. One approach is to self join each row to the maximum earlier valid_from date for that product.
Step-by-step:
- Alias
ppas the “current” price row. - Alias
prevas the “previous” price row. - Join on same product and ensure
prev.valid_fromis earlier thanpp.valid_from. - Pick only the closest earlier row using a correlated subquery in the join condition.
SELECT pp.product_id, pp.valid_from AS current_valid_from, pp.price AS current_price, prev.valid_from AS previous_valid_from, prev.price AS previous_price FROM product_prices AS pp LEFT JOIN product_prices AS prev ON prev.product_id = pp.product_id AND prev.valid_from = ( SELECT MAX(p2.valid_from) FROM product_prices AS p2 WHERE p2.product_id = pp.product_id AND p2.valid_from < pp.valid_from ) ORDER BY pp.product_id, pp.valid_from;This is a self join because product_prices appears twice, and the join condition relates each row to another row in the same table. The correlated subquery ensures you get the immediately prior record, not all earlier records.
Note: Many SQL dialects also support window functions like LAG for this pattern, but the self join approach is widely applicable and helps reinforce how to relate rows within a single table using explicit conditions.
Self Join for Finding “Peers” (Same Attribute, Different Entity)
Sometimes you want to relate entities that share an attribute. For example, customers in the same city, students in the same class, or employees with the same title. This is a self join where the join condition matches on the shared attribute and excludes the same entity.
Example: employees with the same title in the same department
We’ll pair employees who share both department_id and title. To avoid duplicates and self-pairs, we’ll use < on the primary key.
SELECT e1.department_id, e1.title, e1.full_name AS employee_1, e2.full_name AS employee_2 FROM employees AS e1 JOIN employees AS e2 ON e1.department_id = e2.department_id AND e1.title = e2.title AND e1.employee_id < e2.employee_id ORDER BY e1.department_id, e1.title, e1.full_name, e2.full_name;This produces unique peer pairs. If you instead want a list of each employee and the count of peers, you might self join and aggregate, but be careful: pairing creates multiple rows per employee. For counts, you can join and count distinct peers.
SELECT e1.employee_id, e1.full_name, COUNT(e2.employee_id) AS peer_count FROM employees AS e1 LEFT JOIN employees AS e2 ON e1.department_id = e2.department_id AND e1.title = e2.title AND e1.employee_id <> e2.employee_id GROUP BY e1.employee_id, e1.full_name ORDER BY peer_count DESC, e1.full_name;Here we keep all employees (left join) and count how many other employees match the peer criteria.
Self Join with Non-Equality Conditions (Range and Overlap Matching)
Self joins are not limited to equality. You can relate rows using inequality conditions, which is useful for detecting overlaps, gaps, or sequences.
Example dataset: room_bookings
room_bookings( booking_id INT PRIMARY KEY, room_id INT, start_ts TIMESTAMP, end_ts TIMESTAMP)To find overlapping bookings in the same room, self join bookings to bookings where time ranges overlap. Use an ordering rule to avoid duplicates.
SELECT b1.room_id, b1.booking_id AS booking_1, b2.booking_id AS booking_2, b1.start_ts AS booking_1_start, b1.end_ts AS booking_1_end, b2.start_ts AS booking_2_start, b2.end_ts AS booking_2_end FROM room_bookings AS b1 JOIN room_bookings AS b2 ON b1.room_id = b2.room_id AND b1.booking_id < b2.booking_id AND b1.start_ts < b2.end_ts AND b2.start_ts < b1.end_ts ORDER BY b1.room_id, b1.booking_id, b2.booking_id;The overlap logic (b1.start_ts < b2.end_ts and b2.start_ts < b1.end_ts) is a clear condition that relates two rows. The b1.booking_id < b2.booking_id rule prevents mirrored duplicates and self-matching.
Making Self Joins Readable: Alias and Column Naming Practices
Because self joins inherently reuse the same column names, readability is not optional. A few practical habits make a big difference:
- Use role-based aliases:
e/m(employee/manager),c/p(child/parent),pp/prev(current/previous),b1/b2(booking pair). - Alias output columns so the result set is self-explanatory:
employee_name,manager_name,previous_price. - Keep join conditions grouped by meaning: first match the entity grouping (same product, same room), then add the relationship rule (previous date, overlap), then add the de-duplication rule (
<). - Be explicit about nullability: if the relationship column can be
NULL(likemanager_id), choose a join type that matches your intent and expect nulls on the “other side” columns.
Performance and Correctness Considerations Specific to Self Joins
Indexes that matter
Self joins can be efficient if the relationship columns are indexed. Common helpful indexes include:
employees(employee_id)(usually already the primary key) andemployees(manager_id)for employee-manager lookups.categories(category_id)andcategories(parent_id)for hierarchy lookups.- Composite keys like
product_prices(product_id, valid_from)to support “previous row” matching. room_bookings(room_id, start_ts, end_ts)(or at leastroom_idplus one time column) for overlap detection.
Cardinality can grow quickly
Some self joins create many combinations. Pairing peers in a department can produce roughly N×(N-1)/2 rows per group. Overlap detection can also be heavy if many bookings exist per room. When you see unexpectedly large result sets, check whether your join condition is too broad (for example, matching only on department_id when you meant department_id and title), and ensure you’ve included de-duplication rules where appropriate.
Validate with small slices
When building a self join, it’s practical to test on a single entity or group first (one department, one product, one room) by adding a WHERE filter. This helps confirm that the join condition produces the intended pairs before running it across the whole table.
SELECT ... FROM room_bookings b1 JOIN room_bookings b2 ON ... WHERE b1.room_id = 101;