Chapter 6: “Query Performance”

Now that we know how to write queries of increasing complexity, we can tackle the subject of what makes a query efficient. Writing correct SQL is only half the battle; the real challenge is writing efficient SQL. Slow queries can bottleneck applications, frustrate users, and burn cloud resources. This chapter teaches you how to think like a database engine and write performant queries that scale.

Understanding the Query Lifecycle

When you run an SQL query, the database:

  • Parses your SQL
  • Creates a query plan (a roadmap for how to retrieve the data)
  • Executes the plan

Knowing how the engine decides on a query plan helps you understand why some queries are fast and others aren't.

To see what the database is actually doing step by step, prefix a query with EXPLAIN ANALYZE:

EXPLAIN ANALYZE SELECT * FROM students WHERE gpa > 3.5;
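On PostgreSQL, the output of EXPLAIN ANALYZE looks roughly like this (the numbers below are illustrative, not from a real table):

Seq Scan on students  (cost=0.00..35.50 rows=340 width=72) (actual time=0.015..0.210 rows=312 loops=1)
  Filter: (gpa > 3.5)
  Rows Removed by Filter: 1688
Planning Time: 0.080 ms
Execution Time: 0.260 ms

The cost pair is the planner's estimate; the actual time and row counts are measured during execution, which is why EXPLAIN ANALYZE really runs the query rather than just planning it.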

Indexes: Speeding Up Lookups

An index is like the index at the back of a book: it helps the database find rows faster, especially in large tables.

Creating an Index:

CREATE INDEX idx_gpa ON students(gpa);

This will speed up queries that filter or sort by gpa.

When Indexes Help

  • WHERE clause filters
  • JOIN conditions
  • ORDER BY and GROUP BY (sometimes)
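For example, a single composite index can serve a query that both filters on one column and sorts by another. This sketch assumes the students table used throughout this book; the index name idx_major_gpa is just a convention:

CREATE INDEX idx_major_gpa ON students(major_id, gpa);

SELECT name, gpa
FROM students
WHERE major_id = 3
ORDER BY gpa DESC;

Because the index is ordered by major_id first and gpa second, the database can jump to the matching major and read rows already sorted by GPA, often skipping a separate sort step.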

When Indexes Hurt

  • INSERT, UPDATE, and DELETE become slower
  • Indexes take up space and need maintenance

Reading EXPLAIN Plans

EXPLAIN shows how the query will execute. Look for:

  • Seq Scan: A full table scan (slow for large tables)
  • Index Scan: Using an index (faster)
  • Nested Loop vs Hash Join vs Merge Join: Different join strategies, each with pros and cons

Compare the planner's cost estimates (and, with EXPLAIN ANALYZE, the actual row counts and timings) to identify where a query spends its time.

Query Rewriting Strategies

  • Avoid SELECT *; fetch only the columns you need
  • Replace subqueries with joins if they’re slow
  • Move complex logic to CTEs or views for readability and reuse
  • Use LIMIT to reduce rows for dashboards
  • Use materialized views to cache expensive joins and aggregates
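To illustrate the subquery-to-join rewrite using the students and majors tables (the department column on majors is hypothetical):

-- Subquery version
SELECT name
FROM students
WHERE major_id IN (SELECT major_id FROM majors WHERE department = 'Science');

-- Equivalent join version
SELECT s.name
FROM students s
JOIN majors m ON s.major_id = m.major_id
WHERE m.department = 'Science';

The two are equivalent as long as major_id is unique in majors; otherwise the join can return duplicate rows where the IN version would not.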

Denormalization: A Tradeoff for Speed

Denormalizing means intentionally duplicating data to reduce joins and improve speed. It's common in analytics systems.

Example: Instead of this:

SELECT s.name, m.major_name
FROM students s JOIN majors m ON s.major_id = m.major_id;

Store major_name directly in students if the data doesn't change often.
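In PostgreSQL, that denormalization could be sketched like this (assuming the students and majors tables above):

ALTER TABLE students ADD COLUMN major_name TEXT;

UPDATE students s
SET major_name = m.major_name
FROM majors m
WHERE s.major_id = m.major_id;

The copied column must be refreshed whenever a major is renamed, which is exactly the maintenance cost denormalization trades for faster reads.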

Partitioning Large Tables

Partitioning splits a large table into smaller pieces for performance and maintenance.

  • Range partitioning: by date, ID ranges, etc.
  • List partitioning: by category or region

PostgreSQL supports native partitioning:

CREATE TABLE logs_2025 PARTITION OF logs
FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
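This assumes logs was created as a partitioned parent table; a minimal sketch (column names are illustrative):

CREATE TABLE logs (
    log_id BIGINT,
    logged_at DATE,
    message TEXT
) PARTITION BY RANGE (logged_at);

Rows inserted into logs are routed automatically to the partition whose range contains their logged_at value.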

Performance is key to working with real-world data at scale. In the next chapter, we’ll look at data modeling and preparation, the design choices that set you up for long-term success with scalable databases.