Chapter 3: “Joins and Subqueries”
Not all of the data that we have is stored in one table, it is typically normalized into a lot of different tables in order to increase query performance. This makes joining tables a very powerful tool to answer meaningful questions. SQL gives us two powerful tools to join tables: Joins which combine rows from two or more tables based on related columns. The other way is through Subqueries which nests one query inside of another query.
We will first begin by talking about JOINS which are the most common for beginners and provide a simple idea of how to join tables. Let’s say we have two tables: students(student_id, name, major_id) and majors(major_id, major_name). To get each student’s name along with their major name, we need to join these tables:
INNER JOIN
SELECT s.name, m.major_name FROM students s JOIN majors m ON s.major_id = m.major_id;
Returns only rows where there is a match in both tables.
LEFT JOIN (a.k.a. LEFT OUTER JOIN)
SELECT s.name, m.major_name FROM students s LEFT JOIN majors m ON s.major_id = m.major_id;
Returns all students, even if they don’t have a matching major.
RIGHT JOIN / FULL OUTER JOIN / Self JOIN
These are less commonly used but worth knowing:
- RIGHT JOIN: All records from the right table, and matched ones from the left.
- FULL OUTER JOIN: All records from both tables; unmatched ones will show NULL.
- SELF JOIN: When a table relates to itself, use a self join:
SELECT e1.name AS employee, e2.name AS manager FROM employees e1 JOIN employees e2 ON e1.manager_id = e2.employee_id;
Subqueries
A Subquery is a query nested within another query. You can use them in: SELECT, FROM, and WHERE
Subquery in SELECT:
SELECT name, (SELECT AVG(grade) FROM grades g WHERE g.student_id = s.student_id) AS avg_grade FROM students s;
Each row gets a calculated column using a related subquery.
Subquery in WHERE:
SELECT name FROM students WHERE student_id IN (SELECT student_id FROM grades WHERE grade > 90);
This filters students who have at least one high grade.
Subquery in FROM (Derived Table):
You can also place a subquery in the FROM clause to use it like a temporary table:
SELECT major, avg_grade FROM ( SELECT major, AVG(grade) AS avg_grade FROM students s JOIN grades g ON s.student_id = g.student_id GROUP BY major ) AS major_grades WHERE avg_grade > 85;
This query calculates the average grade per major and filters to only those with averages above 85.
Correlated Subqueries
A correlated subquery depends on the outer query for each row:
SELECT name FROM students s WHERE EXISTS ( SELECT 1 FROM grades g WHERE g.student_id = s.student_id AND g.grade > 90 );
Choosing Between Joins and Subqueries:
- Use joins when you want to bring together columns from different tables in the same row.
- Use subqueries when you want to use the results of one query as a condition or derived value for another.
- Both can be used for similar problems, but joins are often more efficient and readable.
Now that you can write multi-table queries and use nested logic, you’re ready for advanced tools that make your queries more modular and reusable: views, CTEs, and materialized views.