Chapter 6: “Query Performance”

Now that we know how to write queries of increasing complexity, we can tackle the subject of what makes a query efficient. Writing correct SQL is only half the battle; the real challenge is writing efficient SQL. Slow queries can bottleneck applications, frustrate users, and burn cloud resources. This chapter teaches you how to think like a database engine and write performant queries that scale.

Understanding the Query Lifecycle

When you run an SQL query, the database:

  • Parses your SQL
  • Creates a query plan (a roadmap for how to retrieve the data)
  • Executes the plan

Knowing how the engine decides on a query plan helps you understand why some queries are fast and others aren't.

To see what the database is actually doing step by step, prefix a query with EXPLAIN ANALYZE:

EXPLAIN ANALYZE SELECT * FROM students WHERE gpa > 3.5;
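On PostgreSQL, the output of EXPLAIN ANALYZE looks roughly like this (the numbers below are illustrative, not from a real table):

Seq Scan on students  (cost=0.00..35.50 rows=340 width=72) (actual time=0.015..0.210 rows=312 loops=1)
  Filter: (gpa > 3.5)
  Rows Removed by Filter: 1688
Planning Time: 0.080 ms
Execution Time: 0.260 ms

The cost pair is the planner's estimate; the actual time and row counts are measured during execution, which is why EXPLAIN ANALYZE really runs the query rather than just planning it.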

Indexes: Speeding Up Lookups

An index is like the index at the back of a book: it helps the database find rows faster, especially in large tables.

Creating an Index:

CREATE INDEX idx_gpa ON students(gpa);

This will speed up queries that filter or sort by gpa.

When Indexes Help

  • WHERE clause filters
  • JOIN conditions
  • ORDER BY and GROUP BY (sometimes)
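For example, a single composite index can serve a query that both filters on one column and sorts by another. This sketch assumes the students table used throughout this book; the index name idx_major_gpa is just a convention:

CREATE INDEX idx_major_gpa ON students(major_id, gpa);

SELECT name, gpa
FROM students
WHERE major_id = 3
ORDER BY gpa DESC;

Because the index is ordered by major_id first and gpa second, the database can jump to the matching major and read rows already sorted by GPA, often skipping a separate sort step.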

When Indexes Hurt

  • INSERT, UPDATE, and DELETE become slower
  • Indexes take up space and need maintenance

Reading EXPLAIN Plans

EXPLAIN shows how the query will execute. Look for:

  • Seq Scan: A full table scan (slow for large tables)
  • Index Scan: Using an index (faster)
  • Nested Loop vs Hash Join vs Merge Join: Different join strategies, each with pros and cons

Compare the planner's cost estimates (and, with EXPLAIN ANALYZE, the actual row counts and timings) to identify where a query spends its time.

Query Rewriting Strategies

  • Avoid SELECT *; fetch only the columns you need
  • Replace subqueries with joins if they’re slow
  • Move complex logic to CTEs or views for readability and reuse
  • Use LIMIT to reduce rows for dashboards
  • Use materialized views to cache expensive joins and aggregates
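To illustrate the subquery-to-join rewrite using the students and majors tables (the department column on majors is hypothetical):

-- Subquery version
SELECT name
FROM students
WHERE major_id IN (SELECT major_id FROM majors WHERE department = 'Science');

-- Equivalent join version
SELECT s.name
FROM students s
JOIN majors m ON s.major_id = m.major_id
WHERE m.department = 'Science';

The two are equivalent as long as major_id is unique in majors; otherwise the join can return duplicate rows where the IN version would not.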

Denormalization: A Tradeoff for Speed

Denormalizing means intentionally duplicating data to reduce joins and improve speed. It's common in analytics systems.

Example: Instead of this:

SELECT s.name, m.major_name
FROM students s JOIN majors m ON s.major_id = m.major_id;

Store major_name directly in students if the data doesn't change often.
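In PostgreSQL, that denormalization could be sketched like this (assuming the students and majors tables above):

ALTER TABLE students ADD COLUMN major_name TEXT;

UPDATE students s
SET major_name = m.major_name
FROM majors m
WHERE s.major_id = m.major_id;

The copied column must be refreshed whenever a major is renamed, which is exactly the maintenance cost denormalization trades for faster reads.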

Partitioning Large Tables

Partitioning splits a large table into smaller pieces for performance and maintenance.

  • Range partitioning: by date, ID ranges, etc.
  • List partitioning: by category or region

PostgreSQL supports native partitioning:

CREATE TABLE logs_2025 PARTITION OF logs
FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
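This assumes logs was created as a partitioned parent table; a minimal sketch (column names are illustrative):

CREATE TABLE logs (
    log_id BIGINT,
    logged_at DATE,
    message TEXT
) PARTITION BY RANGE (logged_at);

Rows inserted into logs are routed automatically to the partition whose range contains their logged_at value.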

Performance is key to working with real-world data at scale. In the next chapter, we’ll look at data modeling and preparation, the design choices that set you up for long-term success with scalable databases.