---
id: db-query-optimization
title: Query Optimization - Index / Rewrite / Split
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [database, query, optimization, vibe-coding]
tech_stack: { language: "SQL / Postgres", applicable_to: ["Backend"] }
applied_in: []
aliases: [query optimization, SARGable, covering index, CTE, materialized view, denormalization]
---

# Query Optimization

> An index is the usual answer, but **query rewrite, denormalization, materialized views, and partitioning** are tools too. Key topics: SARGable predicates, covering indexes, CTEs.

## 📖 Core Concepts

- SARGable: a predicate the planner can satisfy with an index.
- Covering index: an index that contains every column the query needs.
- Denormalization: duplicating some data to make reads cheaper.
- Materialized view: a precomputed, stored query result.

## 💻 Code Patterns

### SARGable rewrite

```sql
-- ❌ Non-SARGable: the indexed column is wrapped in a function or cast
WHERE EXTRACT(YEAR FROM created_at) = 2026
WHERE LOWER(email) = 'a@b.com'
WHERE id::text = '42'

-- ✅ SARGable: compare the bare column
WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01'
WHERE id = 42
-- email = 'a@b.com' (if the index is case-insensitive)

-- Or create a functional index
CREATE INDEX users_email_lower ON users ((LOWER(email)));
WHERE LOWER(email) = 'a@b.com'  -- now SARGable
```

### Covering index

```sql
-- Frequent query
SELECT id, status FROM orders WHERE user_id = $1;

-- ✅ Covering index: no heap access (Index Only Scan)
CREATE INDEX orders_user_covering ON orders (user_id) INCLUDE (id, status);
```

→ Postgres supports `INCLUDE` since version 11. An Index Only Scan still visits the heap for pages not marked all-visible, so it pays off most on well-vacuumed tables.

### Composite index: leftmost prefix

```sql
CREATE INDEX o_idx ON orders (user_id, status, created_at);

-- ✅ Uses the index
WHERE user_id = $1
WHERE user_id = $1 AND status = 'paid'
WHERE user_id = $1 AND status = 'paid' AND created_at > $2

-- ❌ Leading column missing: cannot use the index efficiently
WHERE status = 'paid'      -- needs a new index
WHERE created_at > $2
```

### Selectivity (cardinality) first

```sql
-- email (high cardinality, 1M unique) > status (3 unique)
CREATE INDEX users_email_status ON users (email, status);  -- email first
```

→ As a rule of thumb, put the most selective equality column first.

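The selectivity assumption above can be checked instead of guessed. The queries below are a minimal sketch, assuming the `users` table with `email` and `status` columns from the example. The numbers they return depend entirely on your data.

```sql
-- Selectivity = distinct values / total rows; closer to 1 means more selective.
-- Assumes the users(email, status) table from the example above.
SELECT
  count(DISTINCT email)::numeric  / NULLIF(count(*), 0) AS email_selectivity,
  count(DISTINCT status)::numeric / NULLIF(count(*), 0) AS status_selectivity
FROM users;

-- Cheaper on large tables: read the planner's own estimates (filled in by ANALYZE).
-- n_distinct > 0 is an estimated count of distinct values;
-- n_distinct < 0 is the negated ratio of distinct values to rows.
SELECT attname, n_distinct
FROM pg_stats
WHERE tablename = 'users'
  AND attname IN ('email', 'status');
```

The first query scans the whole table, so it fits ad hoc checks. The `pg_stats` lookup is only as fresh as the last `ANALYZE`.
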
### Partial index (conditional)

```sql
-- Only active users are queried often
CREATE INDEX users_active ON users (email) WHERE deleted_at IS NULL;

-- The query must repeat the index predicate for the index to apply
SELECT * FROM users WHERE email = $1 AND deleted_at IS NULL;
-- → smaller index, faster lookups
```

### Expression index

```sql
CREATE INDEX events_lower_event ON events (LOWER(event_type));
```

### Materialized view (queried often, refreshed occasionally)

```sql
CREATE MATERIALIZED VIEW user_stats AS
SELECT user_id, count(*) AS orders, sum(total) AS spent
FROM orders
GROUP BY user_id;

-- CONCURRENTLY requires a unique index on the view
CREATE UNIQUE INDEX user_stats_pk ON user_stats (user_id);

-- Refresh
REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats;
```

→ Refresh from cron every few minutes or hours.

### Denormalization

```sql
-- ❌ Every read joins
SELECT o.*, u.email FROM orders o JOIN users u ON o.user_id = u.id;

-- ✅ Copy email into orders (when it is immutable, or staleness is acceptable)
ALTER TABLE orders ADD COLUMN user_email TEXT;
-- Populate it on INSERT
```

→ Higher write cost, but large read savings.

### CTE (WITH)

```sql
WITH recent_orders AS (
  SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '7 days'
)
SELECT user_id, count(*) FROM recent_orders GROUP BY user_id;
```

⚠️ Postgres 12+ inlines a CTE like this one (non-recursive, side-effect-free, referenced once). Older versions treat every CTE as an optimization barrier; in 12+, `AS MATERIALIZED` restores that behavior.

### LATERAL join (a different subquery per row)

```sql
SELECT u.*, last_order.total
FROM users u
LEFT JOIN LATERAL (
  SELECT * FROM orders o
  WHERE o.user_id = u.id
  ORDER BY o.created_at DESC
  LIMIT 1
) last_order ON true;
```

→ Returns each user's latest order. Often more efficient than separate correlated subqueries, particularly with an index on `orders (user_id, created_at DESC)`.

### EXISTS vs IN

```sql
-- ✅ EXISTS: short-circuits on the first match
SELECT * FROM users WHERE EXISTS (SELECT 1 FROM orders o WHERE o.user_id = users.id);

-- ⚠️ IN: a large list is hashed
SELECT * FROM users WHERE id IN (SELECT user_id FROM orders);
```

→ The planner usually produces the same plan for both, but EXISTS is NULL-safe. The difference matters most for `NOT IN` versus `NOT EXISTS`.

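The NULL-safety point above is easiest to see in the negated forms. The following is a self-contained sketch that uses inline values rather than real tables; the alias `t(x)` is made up for illustration.

```sql
-- NOT IN: 3 = NULL is unknown, so the whole predicate is NULL, never TRUE.
-- This returns zero rows even though 3 is not in the list.
SELECT 1 AS found WHERE 3 NOT IN (1, 2, NULL);

-- NOT EXISTS: the NULL row simply fails to match, so the predicate is TRUE.
-- This returns one row.
SELECT 1 AS found
WHERE NOT EXISTS (
  SELECT 1
  FROM (VALUES (1), (2), (NULL)) AS t(x)
  WHERE t.x = 3
);
```

The same thing happens with `WHERE id NOT IN (SELECT user_id FROM orders)` if any `orders.user_id` is NULL: the query returns no users at all. Where the subquery column is nullable, `NOT EXISTS` is the safer form.
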
### Pagination: keyset > offset

```sql
-- ❌ Large offset: reads and discards 100000 rows
SELECT * FROM orders ORDER BY id DESC OFFSET 100000 LIMIT 20;

-- ✅ Keyset: $1 is the last id from the previous page
SELECT * FROM orders WHERE id < $1 ORDER BY id DESC LIMIT 20;
```

### Batch (many rows in one query)

```sql
-- ❌ N+1: the app runs this once per id in a loop
SELECT * FROM users WHERE id = $1;

-- ✅ Batch
SELECT * FROM users WHERE id = ANY($1::uuid[]);
```

```ts
const users = await db.query('SELECT * FROM users WHERE id = ANY($1::uuid[])', [ids]);
```

### Reading EXPLAIN

```sql
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;
-- Is "actual time" consistently low?
-- Is "Buffers: shared read=" large? (disk I/O)
-- Is "Rows Removed by Filter" large? (an index is needed)
```

### Statistics + ANALYZE

```sql
ANALYZE orders;  -- update planner statistics
-- autovacuum usually handles this; run it explicitly after large changes
```

### Extended statistics

```sql
-- The two columns are correlated
CREATE STATISTICS s_user_status ON user_id, status FROM orders;
ANALYZE orders;
```

→ More accurate row estimates.

### Index hint (Postgres pg_hint_plan extension)

```sql
/*+ IndexScan(orders orders_user_idx) */
SELECT * FROM orders WHERE user_id = $1;
```

→ Last resort. Usually ANALYZE or a better index is the fix.

### N+1 in app

```ts
// ❌ One query per user
for (const user of users) {
  user.orders = await db.orders.findByUser(user.id);
}

// ✅ DataLoader / Prisma include / SQL JOIN
// groupBy is assumed to be a lodash-style helper
const userIds = users.map(u => u.id);
const orders = await db.orders.findMany({ where: { userId: { in: userIds } } });
const byUser = groupBy(orders, 'userId');
users.forEach(u => { u.orders = byUser[u.id] ?? []; });
```

## 🤔 Decision Criteria

| Pattern | Use |
|---|---|
| Same query read often | Index |
| Many reads, few writes | Materialized view |
| Reads far outnumber writes | Denormalize / CDC |
| A subset of rows is queried often | Partial index |
| Large GROUP BY | Aggregating MV |
| Top N per group | Window function / LATERAL |

## ❌ Anti-patterns

- **Non-SARGable predicate**: the index cannot be used.
- **`SELECT *` on wide rows**: heavy I/O.
- **N+1 queries**: queries in an app loop. Use a JOIN or a batch query.
- **Indexing every column**: write cost goes up.
- **Never refreshing a materialized view**: stale data.
- **Relying on CTE inlining on old PG (< 12)**: optimization barrier.
- **OFFSET for deep pages**: reads and discards every skipped row.

## 🤖 LLM Usage Hints

- Run EXPLAIN ANALYZE first, then act.
- Indexes: combine composite, covering, and partial.
- Expensive read query: consider an MV or denormalization.

## 🔗 Related Documents

- [[DB_Postgres_EXPLAIN]]
- [[DB_Index_Strategy]]
- [[DB_N_Plus_One]]