Name: Sql Optimization Patterns
Author: Wshobson
Install
Terminal · npx
$npx skills add https://github.com/wshobson/agents --skill sql-optimization-patterns
Works with Paperclip
How Sql Optimization Patterns fits into a Paperclip company.

Sql Optimization Patterns drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.
SaaS FactoryPaired
Pre-configured AI company — 18 agents, 18 skills, one-time purchase.
$27$59
Explore pack
Source file
SKILL.md499 linesmarkdown
Expand
1---2name: sql-optimization-patterns3description: Master SQL query optimization, indexing strategies, and EXPLAIN analysis to dramatically improve database performance and eliminate slow queries. Use when debugging slow queries, designing database schemas, or optimizing application performance.4---5 6# SQL Optimization Patterns7 8Transform slow database queries into lightning-fast operations through systematic optimization, proper indexing, and query plan analysis.9 10## When to Use This Skill11 12- Debugging slow-running queries13- Designing performant database schemas14- Optimizing application response times15- Reducing database load and costs16- Improving scalability for growing datasets17- Analyzing EXPLAIN query plans18- Implementing efficient indexes19- Resolving N+1 query problems20 21## Core Concepts22 23### 1. Query Execution Plans (EXPLAIN)24 25Understanding EXPLAIN output is fundamental to optimization.26 27**PostgreSQL EXPLAIN:**28 29```sql30-- Basic explain31EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';32 33-- With actual execution stats34EXPLAIN ANALYZE35SELECT * FROM users WHERE email = 'user@example.com';36 37-- Verbose output with more details38EXPLAIN (ANALYZE, BUFFERS, VERBOSE)39SELECT u.*, o.order_total40FROM users u41JOIN orders o ON u.id = o.user_id42WHERE u.created_at > NOW() - INTERVAL '30 days';43```44 45**Key Metrics to Watch:**46 47- **Seq Scan**: Full table scan (usually slow for large tables)48- **Index Scan**: Using index (good)49- **Index Only Scan**: Using index without touching table (best)50- **Nested Loop**: Join method (okay for small datasets)51- **Hash Join**: Join method (good for larger datasets)52- **Merge Join**: Join method (good for sorted data)53- **Cost**: Estimated query cost (lower is better)54- **Rows**: Estimated rows returned55- **Actual Time**: Real execution time56 57### 2. Index Strategies58 59Indexes are the most powerful optimization tool.60 61**Index Types:**62 63- **B-Tree**: Default, good for equality and range queries64- **Hash**: Only for equality (=) comparisons65- **GIN**: Full-text search, array queries, JSONB66- **GiST**: Geometric data, full-text search67- **BRIN**: Block Range INdex for very large tables with correlation68 69```sql70-- Standard B-Tree index71CREATE INDEX idx_users_email ON users(email);72 73-- Composite index (order matters!)74CREATE INDEX idx_orders_user_status ON orders(user_id, status);75 76-- Partial index (index subset of rows)77CREATE INDEX idx_active_users ON users(email)78WHERE status = 'active';79 80-- Expression index81CREATE INDEX idx_users_lower_email ON users(LOWER(email));82 83-- Covering index (include additional columns)84CREATE INDEX idx_users_email_covering ON users(email)85INCLUDE (name, created_at);86 87-- Full-text search index88CREATE INDEX idx_posts_search ON posts89USING GIN(to_tsvector('english', title || ' ' || body));90 91-- JSONB index92CREATE INDEX idx_metadata ON events USING GIN(metadata);93```94 95### 3. Query Optimization Patterns96 97**Avoid SELECT \*:**98 99```sql100-- Bad: Fetches unnecessary columns101SELECT * FROM users WHERE id = 123;102 103-- Good: Fetch only what you need104SELECT id, email, name FROM users WHERE id = 123;105```106 107**Use WHERE Clause Efficiently:**108 109```sql110-- Bad: Function prevents index usage111SELECT * FROM users WHERE LOWER(email) = 'user@example.com';112 113-- Good: Create functional index or use exact match114CREATE INDEX idx_users_email_lower ON users(LOWER(email));115-- Then:116SELECT * FROM users WHERE LOWER(email) = 'user@example.com';117 118-- Or store normalized data119SELECT * FROM users WHERE email = 'user@example.com';120```121 122**Optimize JOINs:**123 124```sql125-- Bad: Cartesian product then filter126SELECT u.name, o.total127FROM users u, orders o128WHERE u.id = o.user_id AND u.created_at > '2024-01-01';129 130-- Good: Filter before join131SELECT u.name, o.total132FROM users u133JOIN orders o ON u.id = o.user_id134WHERE u.created_at > '2024-01-01';135 136-- Better: Filter both tables137SELECT u.name, o.total138FROM (SELECT * FROM users WHERE created_at > '2024-01-01') u139JOIN orders o ON u.id = o.user_id;140```141 142## Optimization Patterns143 144### Pattern 1: Eliminate N+1 Queries145 146**Problem: N+1 Query Anti-Pattern**147 148```python149# Bad: Executes N+1 queries150users = db.query("SELECT * FROM users LIMIT 10")151for user in users:152    orders = db.query("SELECT * FROM orders WHERE user_id = ?", user.id)153    # Process orders154```155 156**Solution: Use JOINs or Batch Loading**157 158```sql159-- Solution 1: JOIN160SELECT161    u.id, u.name,162    o.id as order_id, o.total163FROM users u164LEFT JOIN orders o ON u.id = o.user_id165WHERE u.id IN (1, 2, 3, 4, 5);166 167-- Solution 2: Batch query168SELECT * FROM orders169WHERE user_id IN (1, 2, 3, 4, 5);170```171 172```python173# Good: Single query with JOIN or batch load174# Using JOIN175results = db.query("""176    SELECT u.id, u.name, o.id as order_id, o.total177    FROM users u178    LEFT JOIN orders o ON u.id = o.user_id179    WHERE u.id IN (1, 2, 3, 4, 5)180""")181 182# Or batch load183users = db.query("SELECT * FROM users LIMIT 10")184user_ids = [u.id for u in users]185orders = db.query(186    "SELECT * FROM orders WHERE user_id IN (?)",187    user_ids188)189# Group orders by user_id190orders_by_user = {}191for order in orders:192    orders_by_user.setdefault(order.user_id, []).append(order)193```194 195### Pattern 2: Optimize Pagination196 197**Bad: OFFSET on Large Tables**198 199```sql200-- Slow for large offsets201SELECT * FROM users202ORDER BY created_at DESC203LIMIT 20 OFFSET 100000;  -- Very slow!204```205 206**Good: Cursor-Based Pagination**207 208```sql209-- Much faster: Use cursor (last seen ID)210SELECT * FROM users211WHERE created_at < '2024-01-15 10:30:00'  -- Last cursor212ORDER BY created_at DESC213LIMIT 20;214 215-- With composite sorting216SELECT * FROM users217WHERE (created_at, id) < ('2024-01-15 10:30:00', 12345)218ORDER BY created_at DESC, id DESC219LIMIT 20;220 221-- Requires index222CREATE INDEX idx_users_cursor ON users(created_at DESC, id DESC);223```224 225### Pattern 3: Aggregate Efficiently226 227**Optimize COUNT Queries:**228 229```sql230-- Bad: Counts all rows231SELECT COUNT(*) FROM orders;  -- Slow on large tables232 233-- Good: Use estimates for approximate counts234SELECT reltuples::bigint AS estimate235FROM pg_class236WHERE relname = 'orders';237 238-- Good: Filter before counting239SELECT COUNT(*) FROM orders240WHERE created_at > NOW() - INTERVAL '7 days';241 242-- Better: Use index-only scan243CREATE INDEX idx_orders_created ON orders(created_at);244SELECT COUNT(*) FROM orders245WHERE created_at > NOW() - INTERVAL '7 days';246```247 248**Optimize GROUP BY:**249 250```sql251-- Bad: Group by then filter252SELECT user_id, COUNT(*) as order_count253FROM orders254GROUP BY user_id255HAVING COUNT(*) > 10;256 257-- Better: Filter first, then group (if possible)258SELECT user_id, COUNT(*) as order_count259FROM orders260WHERE status = 'completed'261GROUP BY user_id262HAVING COUNT(*) > 10;263 264-- Best: Use covering index265CREATE INDEX idx_orders_user_status ON orders(user_id, status);266```267 268### Pattern 4: Subquery Optimization269 270**Transform Correlated Subqueries:**271 272```sql273-- Bad: Correlated subquery (runs for each row)274SELECT u.name, u.email,275    (SELECT COUNT(*) FROM orders o WHERE o.user_id = u.id) as order_count276FROM users u;277 278-- Good: JOIN with aggregation279SELECT u.name, u.email, COUNT(o.id) as order_count280FROM users u281LEFT JOIN orders o ON o.user_id = u.id282GROUP BY u.id, u.name, u.email;283 284-- Better: Use window functions285SELECT DISTINCT ON (u.id)286    u.name, u.email,287    COUNT(o.id) OVER (PARTITION BY u.id) as order_count288FROM users u289LEFT JOIN orders o ON o.user_id = u.id;290```291 292**Use CTEs for Clarity:**293 294```sql295-- Using Common Table Expressions296WITH recent_users AS (297    SELECT id, name, email298    FROM users299    WHERE created_at > NOW() - INTERVAL '30 days'300),301user_order_counts AS (302    SELECT user_id, COUNT(*) as order_count303    FROM orders304    WHERE created_at > NOW() - INTERVAL '30 days'305    GROUP BY user_id306)307SELECT ru.name, ru.email, COALESCE(uoc.order_count, 0) as orders308FROM recent_users ru309LEFT JOIN user_order_counts uoc ON ru.id = uoc.user_id;310```311 312### Pattern 5: Batch Operations313 314**Batch INSERT:**315 316```sql317-- Bad: Multiple individual inserts318INSERT INTO users (name, email) VALUES ('Alice', 'alice@example.com');319INSERT INTO users (name, email) VALUES ('Bob', 'bob@example.com');320INSERT INTO users (name, email) VALUES ('Carol', 'carol@example.com');321 322-- Good: Batch insert323INSERT INTO users (name, email) VALUES324    ('Alice', 'alice@example.com'),325    ('Bob', 'bob@example.com'),326    ('Carol', 'carol@example.com');327 328-- Better: Use COPY for bulk inserts (PostgreSQL)329COPY users (name, email) FROM '/tmp/users.csv' CSV HEADER;330```331 332**Batch UPDATE:**333 334```sql335-- Bad: Update in loop336UPDATE users SET status = 'active' WHERE id = 1;337UPDATE users SET status = 'active' WHERE id = 2;338-- ... repeat for many IDs339 340-- Good: Single UPDATE with IN clause341UPDATE users342SET status = 'active'343WHERE id IN (1, 2, 3, 4, 5, ...);344 345-- Better: Use temporary table for large batches346CREATE TEMP TABLE temp_user_updates (id INT, new_status VARCHAR);347INSERT INTO temp_user_updates VALUES (1, 'active'), (2, 'active'), ...;348 349UPDATE users u350SET status = t.new_status351FROM temp_user_updates t352WHERE u.id = t.id;353```354 355## Advanced Techniques356 357### Materialized Views358 359Pre-compute expensive queries.360 361```sql362-- Create materialized view363CREATE MATERIALIZED VIEW user_order_summary AS364SELECT365    u.id,366    u.name,367    COUNT(o.id) as total_orders,368    SUM(o.total) as total_spent,369    MAX(o.created_at) as last_order_date370FROM users u371LEFT JOIN orders o ON u.id = o.user_id372GROUP BY u.id, u.name;373 374-- Add index to materialized view375CREATE INDEX idx_user_summary_spent ON user_order_summary(total_spent DESC);376 377-- Refresh materialized view378REFRESH MATERIALIZED VIEW user_order_summary;379 380-- Concurrent refresh (PostgreSQL)381REFRESH MATERIALIZED VIEW CONCURRENTLY user_order_summary;382 383-- Query materialized view (very fast)384SELECT * FROM user_order_summary385WHERE total_spent > 1000386ORDER BY total_spent DESC;387```388 389### Partitioning390 391Split large tables for better performance.392 393```sql394-- Range partitioning by date (PostgreSQL)395CREATE TABLE orders (396    id SERIAL,397    user_id INT,398    total DECIMAL,399    created_at TIMESTAMP400) PARTITION BY RANGE (created_at);401 402-- Create partitions403CREATE TABLE orders_2024_q1 PARTITION OF orders404    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');405 406CREATE TABLE orders_2024_q2 PARTITION OF orders407    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');408 409-- Queries automatically use appropriate partition410SELECT * FROM orders411WHERE created_at BETWEEN '2024-02-01' AND '2024-02-28';412-- Only scans orders_2024_q1 partition413```414 415### Query Hints and Optimization416 417```sql418-- Force index usage (MySQL)419SELECT * FROM users420USE INDEX (idx_users_email)421WHERE email = 'user@example.com';422 423-- Parallel query (PostgreSQL)424SET max_parallel_workers_per_gather = 4;425SELECT * FROM large_table WHERE condition;426 427-- Join hints (PostgreSQL)428SET enable_nestloop = OFF;  -- Force hash or merge join429```430 431## Best Practices432 4331. **Index Selectively**: Too many indexes slow down writes4342. **Monitor Query Performance**: Use slow query logs4353. **Keep Statistics Updated**: Run ANALYZE regularly4364. **Use Appropriate Data Types**: Smaller types = better performance4375. **Normalize Thoughtfully**: Balance normalization vs performance4386. **Cache Frequently Accessed Data**: Use application-level caching4397. **Connection Pooling**: Reuse database connections4408. **Regular Maintenance**: VACUUM, ANALYZE, rebuild indexes441 442```sql443-- Update statistics444ANALYZE users;445ANALYZE VERBOSE orders;446 447-- Vacuum (PostgreSQL)448VACUUM ANALYZE users;449VACUUM FULL users;  -- Reclaim space (locks table)450 451-- Reindex452REINDEX INDEX idx_users_email;453REINDEX TABLE users;454```455 456## Common Pitfalls457 458- **Over-Indexing**: Each index slows down INSERT/UPDATE/DELETE459- **Unused Indexes**: Waste space and slow writes460- **Missing Indexes**: Slow queries, full table scans461- **Implicit Type Conversion**: Prevents index usage462- **OR Conditions**: Can't use indexes efficiently463- **LIKE with Leading Wildcard**: `LIKE '%abc'` can't use index464- **Function in WHERE**: Prevents index usage unless functional index exists465 466## Monitoring Queries467 468```sql469-- Find slow queries (PostgreSQL)470SELECT query, calls, total_time, mean_time471FROM pg_stat_statements472ORDER BY mean_time DESC473LIMIT 10;474 475-- Find missing indexes (PostgreSQL)476SELECT477    schemaname,478    tablename,479    seq_scan,480    seq_tup_read,481    idx_scan,482    seq_tup_read / seq_scan AS avg_seq_tup_read483FROM pg_stat_user_tables484WHERE seq_scan > 0485ORDER BY seq_tup_read DESC486LIMIT 10;487 488-- Find unused indexes (PostgreSQL)489SELECT490    schemaname,491    tablename,492    indexname,493    idx_scan,494    idx_tup_read,495    idx_tup_fetch496FROM pg_stat_user_indexes497WHERE idx_scan = 0498ORDER BY pg_relation_size(indexrelid) DESC;499```
Related skills
Accessibility Compliance

This walks you through implementing proper WCAG 2.2 compliance with real code patterns for screen readers, keyboard navigation, and mobile accessibility. It cov
Airflow Dag Patterns

If you're building data pipelines with Airflow, this skill gives you production-ready DAG patterns that actually work in the real world. It covers TaskFlow API
Angular Migration

Migrating from AngularJS to Angular is notoriously painful, and this skill tackles the practical stuff that makes or breaks these projects. It covers hybrid app