Chapter 6: Query Performance
Now that we know how to write queries of increasing complexity, we can tackle what makes a query efficient. Writing correct SQL is only half the battle; the real challenge is writing efficient SQL. Slow queries can bottleneck applications, frustrate users, and burn cloud resources. This chapter teaches you to think like a database engine and write performant queries that scale.
Understanding the Query Lifecycle
When you run an SQL query, the database:
- Parses your SQL
- Creates a query plan (a roadmap for how to retrieve the data)
- Executes the plan
Knowing how the engine decides on a query plan helps you understand why some queries are fast and others aren't.
Use EXPLAIN ANALYZE to see what the database is actually doing, step by step:
EXPLAIN ANALYZE SELECT * FROM students WHERE gpa > 3.5;
Indexes: Speeding Up Lookups
An index works like the index at the back of a book: it helps the database find matching rows quickly without scanning every page, which matters most in large tables.
Creating an Index:
CREATE INDEX idx_gpa ON students(gpa);
This index speeds up queries that filter or sort by GPA.
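Indexes can also span multiple columns. A minimal sketch, assuming queries often filter by major first and GPA second (the idx_major_gpa name is illustrative); column order matters here, because a multicolumn B-tree index serves queries on its leftmost columns:
CREATE INDEX idx_major_gpa ON students(major_id, gpa);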
When Indexes Help
- WHERE clause filters
- JOIN conditions
- ORDER BY and GROUP BY (when the index order matches the query; see the sketch below)
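For example, assuming the idx_gpa index from above, the planner can use it both to filter rows and to return them already sorted:
SELECT name, gpa FROM students WHERE gpa > 3.5 ORDER BY gpa DESC;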
When Indexes Hurt
- INSERT, UPDATE, and DELETE become slower
- Indexes take up space and need maintenance (the query below shows how to find unused ones)
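Because of that write overhead, it pays to drop indexes that are never used. In PostgreSQL, the pg_stat_user_indexes statistics view tracks how often each index has been scanned; a minimal sketch:
SELECT indexrelname, idx_scan FROM pg_stat_user_indexes WHERE idx_scan = 0;
An index reporting zero scans since the last statistics reset is a candidate for DROP INDEX.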
Reading EXPLAIN Plans
EXPLAIN shows how the query will execute. Look for:
- Seq Scan: A full table scan (slow for large tables)
- Index Scan: Using an index (faster)
- Nested Loop vs Hash Join vs Merge Join: Different join strategies, each with pros and cons
Compare the cost estimates and row counts across plan nodes to find where the time is going; the most expensive nodes are where tuning pays off.
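For example, the same filter can produce very different plans with and without an index. A sketch of PostgreSQL output, with illustrative numbers:
EXPLAIN SELECT * FROM students WHERE gpa > 3.5;
-- Without idx_gpa, a sequential scan over the whole table:
--   Seq Scan on students  (cost=0.00..220.00 rows=1200 width=68)
--     Filter: (gpa > 3.5)
-- With idx_gpa in place, the planner may choose an index scan instead:
--   Index Scan using idx_gpa on students  (cost=0.29..96.50 rows=1200 width=68)
--     Index Cond: (gpa > 3.5)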
Query Rewriting Strategies
- Avoid SELECT *; fetch only the columns you need
- Replace slow subqueries with joins (see the first sketch below)
- Move complex logic to CTEs or views for readability and reuse
- Use LIMIT to cap the rows returned for dashboards and previews
- Use materialized views to cache expensive joins and aggregates (second sketch below)
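As a first sketch, an IN subquery rewritten as a join (the department column on majors is hypothetical); the two forms return the same rows when major_id is unique in majors, but the planner can sometimes execute the join more efficiently:
SELECT name FROM students WHERE major_id IN (SELECT major_id FROM majors WHERE department = 'Science');
-- Equivalent join form:
SELECT s.name FROM students s JOIN majors m ON s.major_id = m.major_id WHERE m.department = 'Science';
And a second sketch, caching an expensive aggregate in a materialized view (the view name is illustrative):
CREATE MATERIALIZED VIEW avg_gpa_by_major AS SELECT major_id, AVG(gpa) AS avg_gpa FROM students GROUP BY major_id;
-- Refresh whenever the underlying data changes:
REFRESH MATERIALIZED VIEW avg_gpa_by_major;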
Denormalization: A Tradeoff for Speed
Denormalizing means intentionally duplicating data to reduce joins and improve speed. It's common in analytics systems.
For example, instead of running this join on every read:
SELECT s.name, m.major_name FROM students s JOIN majors m ON s.major_id = m.major_id;
Store major_name directly in the students table if majors rarely change.
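One way to carry out that denormalization in PostgreSQL; a sketch (the TEXT column type is illustrative), accepting that a renamed major now requires a backfill:
ALTER TABLE students ADD COLUMN major_name TEXT;
-- Backfill the duplicated values from the majors table:
UPDATE students s SET major_name = m.major_name FROM majors m WHERE s.major_id = m.major_id;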
Partitioning Large Tables
Partitioning splits a large table into smaller pieces for performance and maintenance.
- Range partitioning: by date, ID ranges, etc.
- List partitioning: by category or region (a sketch follows the range example below)
PostgreSQL supports native partitioning.
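This requires declaring the parent table with a partition strategy first; a minimal sketch with illustrative columns:
CREATE TABLE logs (
    log_id BIGINT,
    logged_at DATE NOT NULL,
    message TEXT
) PARTITION BY RANGE (logged_at);
Each partition then holds one slice of the data, for example a single year: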
CREATE TABLE logs_2025 PARTITION OF logs FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
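List partitioning looks much the same; a hypothetical sketch with an events table partitioned by region:
-- Assumes a parent created with: CREATE TABLE events (...) PARTITION BY LIST (region);
CREATE TABLE events_eu PARTITION OF events FOR VALUES IN ('eu-west', 'eu-central');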
Performance is key to working with real-world data. In the next chapter, we’ll look at data modeling and preparation, the design choices that set you up for long-term success with scalable databases.