---
id: db-vacuum-bloat-deep
title: Postgres Vacuum / Bloat - deep
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [database, postgres, vibe-coding]
tech_stack: { language: "SQL", applicable_to: ["Database"] }
applied_in: []
aliases: [VACUUM, autovacuum, bloat, dead tuple, freeze, wraparound, MVCC, pg_repack]
---

# Postgres Vacuum / Bloat

> Under Postgres MVCC, every update and delete leaves dead rows behind. **Autovacuum cleans them up. Bloat = disk blow-up + slow queries.** Tuning is essential.

## 📖 Key concepts

- MVCC: every update writes a new row version and leaves the old one invisible.
- Dead tuple: a row version no transaction can see any more, still occupying disk space.
- Vacuum: marks dead-tuple space as reusable.
- Bloat: space held by dead tuples and unused free space, so a table or index is far larger than its live data needs.

## 💻 Code patterns

### Measuring bloat

```sql
SELECT
  schemaname, relname,
  n_live_tup, n_dead_tup,
  round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS pct_dead
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
```

### Manual VACUUM

```sql
VACUUM users;          -- clean up, make space reusable
VACUUM ANALYZE users;  -- + refresh planner statistics
VACUUM FULL users;     -- rewrite the table (exclusive lock!)
VACUUM VERBOSE users;  -- print a detailed report
```

→ `VACUUM FULL` takes an `ACCESS EXCLUSIVE` lock: dangerous in production.

### How autovacuum works

```
Triggers once dead tuples exceed
autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × row count
(default 50 + 0.2 × row count).

postgresql.conf:
autovacuum = on
autovacuum_naptime = 1min
autovacuum_vacuum_scale_factor = 0.2    # 20%
autovacuum_vacuum_threshold = 50        # 50 rows
autovacuum_analyze_scale_factor = 0.1   # 10%
```

→ On a big table this fires too late.

### Per-table tuning

```sql
ALTER TABLE big_table SET (
  autovacuum_vacuum_scale_factor = 0.05,  -- 5%
  autovacuum_vacuum_threshold = 1000,
  autovacuum_analyze_scale_factor = 0.02
);
```

→ The big table gets vacuumed more often.

### Heavy-update table

```sql
-- Every row is updated over and over
ALTER TABLE sessions SET (
  autovacuum_vacuum_scale_factor = 0.01,  -- 1%
  autovacuum_vacuum_cost_delay = 0        -- no throttling: fastest, most I/O
);
```

### Dead tuple → bloat

```
1M-row table.
Each row updated 5 times with no vacuum in between = 5M dead tuples.
Disk = up to 6x.
Index scans read far more pages, so they slow down too.

→ Vacuum makes dead-tuple space reusable.
But the file usually does not shrink (only empty pages at the end of the file are returned to the file system).
```

### VACUUM FULL (rebuild)

```sql
VACUUM FULL users;
-- → Writes a new file and deletes the old one. Disk usage shrinks.
-- → ACCESS EXCLUSIVE lock (not acceptable on a live production table).
```

→ Maintenance window only.

### pg_repack (live VACUUM FULL)

```bash
pg_repack -d mydb -t users
# → Rebuilds the table while reads and writes continue.
# → Takes a brief exclusive lock only at setup and at the final swap.
```

→ Production-friendly. A third-party extension that serves as the online alternative to `VACUUM FULL`.

### Main root causes of bloat

```
1. Autovacuum cannot keep up (too slow / too busy).
2. Long-running transactions (vacuum cannot remove dead tuples they may still see).
3. Replica holding back cleanup (long replica queries with hot_standby_feedback on).
4. Large deletes with no vacuum afterwards.
```

### Long-transaction trap

```sql
-- While a transaction holds an old snapshot or XID,
-- vacuum must keep every row version it might still see.
BEGIN;
SELECT * FROM users LIMIT 1;
-- ... 1 hour later ...
COMMIT;

-- For that hour, dead tuples are not cleaned up.
-- Bloat accumulates.
```

```sql
-- Find long transactions
SELECT pid, now() - xact_start AS duration, query
FROM pg_stat_activity
WHERE state != 'idle' AND xact_start IS NOT NULL
ORDER BY duration DESC;
```

→ An application bug is a common cause.

### idle in transaction

```sql
SELECT pid, now() - state_change AS duration, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - state_change > INTERVAL '1 minute';
```

```sql
-- Auto kill
ALTER SYSTEM SET idle_in_transaction_session_timeout = '60s';
SELECT pg_reload_conf();  -- apply without a restart
```

### Wraparound (XID exhaustion)

```
XID = 4 bytes (2^32 ≈ 4.3B values), compared circularly:
only about 2^31 ≈ 2.1B transactions count as "in the past".
A row left unfrozen for ~2B transactions is at risk of wraparound.
If vacuum does not freeze in time, Postgres stops assigning new XIDs (writes are refused).
```

```sql
-- XID age of each table
SELECT relname, age(relfrozenxid)
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 10;

-- 200M+ = warning.
-- 1B+ = critical.
```

### Vacuum freeze

```sql
VACUUM FREEZE users;
-- → Every row is frozen (safe from wraparound).
```

```
autovacuum_freeze_max_age = 200000000   # default
# Beyond this age, autovacuum forces a freeze vacuum (even if autovacuum is off).
```

### Autovacuum tuning

```
postgresql.conf:

# Big tables
autovacuum_max_workers = 5            # default 3
autovacuum_naptime = 30s              # default 1min

# Cost-based
autovacuum_vacuum_cost_delay = 1ms    # default 2ms since PostgreSQL 12 (20ms before)
autovacuum_vacuum_cost_limit = 1000   # default -1 = vacuum_cost_limit (200)

# → Faster vacuum.
```

### maintenance_work_mem

```
Memory used by vacuum.
- Default: 64MB.
- Big tables: 256MB - 1GB.

postgresql.conf:
maintenance_work_mem = 256MB
```

→ Vacuum runs faster (fewer passes over the indexes). Autovacuum workers use `autovacuum_work_mem` when it is set; its default of -1 falls back to this value.

### Effects of bloat

```
- Wasted disk space.
- Slower sequential scans (pages full of dead tuples still get read).
- Slower index scans (indexes bloat too).
- More WAL (vacuum itself writes WAL).
- Replica lag.
```

### Index bloat

```sql
-- pgstattuple
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT * FROM pgstattuple('users_pkey');
-- tuple_count, dead_tuple_count, free_space
```

```sql
-- Rebuild the index (Postgres 12+)
REINDEX INDEX CONCURRENTLY users_pkey;
```

→ `CONCURRENTLY` = reads and writes are not blocked during the rebuild (it is slower and still takes a weaker lock).

### pg_stat_user_indexes

```sql
SELECT
  schemaname, relname, indexrelname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS size,
  idx_scan
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;
```

→ Big index + never used (`idx_scan = 0`) = candidate for dropping, unless it backs a constraint.

### TOAST bloat

```
Large columns (text, jsonb, bytea) are stored in a separate TOAST table.
Every update that changes such a value writes new TOAST rows.
→ The TOAST table needs vacuum too.
```

### HOT update (heap-only tuple)

```
An update qualifies when:
- no indexed column changes
- the same page has free space
→ HOT update = no new index entries, and the old version can be pruned without a full vacuum.
```

```sql
-- HOT update ratio
SELECT relname,
       n_tup_upd, n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 2) AS hot_pct
FROM pg_stat_user_tables;
```

→ Low HOT % = bloat risk.

### fillfactor

```sql
ALTER TABLE users SET (fillfactor = 90);
-- Fill each page only to 90%; the remaining 10% is kept for future updates.
-- → HOT-update friendly.
-- Applies to newly written pages; existing pages change only on a table rewrite.
```

### hot_standby_feedback on a replica

```
Long query on a replica → primary vacuum must keep the rows that query still needs.

postgresql.conf (replica):
hot_standby_feedback = on
```

→ Bloat risk on the primary. Off is the default.

```
max_standby_streaming_delay = 30s
# After 30s of conflict, the replica query is cancelled.
```

### pg_visibility (advanced)

```sql
CREATE EXTENSION IF NOT EXISTS pg_visibility;
SELECT * FROM pg_visibility('users');
-- Per-page all_visible / all_frozen state.
```

### Monitoring

```
- pg_stat_user_tables.n_dead_tup
- pg_stat_user_tables.last_autovacuum
- Bloat % (custom query)
- Long transactions
- XID age

→ Datadog / pganalyze / pgwatch.
```

### Weekly vacuum

```sh
# Cron: Sundays at 03:00
0 3 * * 0 psql -d mydb -c 'VACUUM ANALYZE'
```

→ Manual run on the weekend. (Autovacuum is usually enough.)

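
Before scheduling a manual run, it helps to see which tables autovacuum is actually falling behind on. The query below is a minimal sketch that joins the monitoring signals listed above (`n_dead_tup`, `last_autovacuum`) with the trigger formula `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × row count`. It assumes the global settings only: per-table overrides set with `ALTER TABLE ... SET (...)` are ignored, and the column aliases `global_trigger_point` and `over_trigger` are illustrative names, not part of Postgres.

```sql
-- Sketch: dead tuples per table versus the global autovacuum trigger point.
-- Assumption: only global settings are used; per-table storage parameters are ignored.
SELECT
  s.schemaname,
  s.relname,
  s.n_dead_tup,
  round(
    current_setting('autovacuum_vacuum_threshold')::numeric
    + current_setting('autovacuum_vacuum_scale_factor')::numeric
      * GREATEST(c.reltuples, 0)::numeric   -- reltuples can be -1 for never-analyzed tables
  ) AS global_trigger_point,
  s.n_dead_tup > (
    current_setting('autovacuum_vacuum_threshold')::numeric
    + current_setting('autovacuum_vacuum_scale_factor')::numeric
      * GREATEST(c.reltuples, 0)::numeric
  ) AS over_trigger,
  s.last_autovacuum,
  s.last_vacuum
FROM pg_stat_user_tables AS s
JOIN pg_class AS c ON c.oid = s.relid
ORDER BY s.n_dead_tup DESC
LIMIT 20;
```

Tables where `over_trigger` stays true and `last_autovacuum` is old are the ones worth per-table tuning or a targeted manual `VACUUM`, rather than a database-wide weekly run.
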
### After a large delete

```sql
DELETE FROM logs WHERE created_at < NOW() - INTERVAL '1 year';
-- → A large batch of dead tuples.

VACUUM logs;

-- Or partition + drop:
DROP TABLE logs_2025_01;
```

→ Partition + drop is far faster.

### Truncate (alternative)

```sql
TRUNCATE users;
-- → Immediate. Little WAL. No vacuum needed.
-- → Removes every row and takes an ACCESS EXCLUSIVE lock.
```

→ Fast path when the whole table should be emptied.

### pg_squeeze (alternative)

```sql
-- pg_squeeze is an extension driven from SQL, not a shell command.
-- Argument list as in the pg_squeeze README; verify against your installed version.
SELECT squeeze.squeeze_table('public', 'users', NULL, NULL, NULL);
-- → Similar to pg_repack, but runs inside the database.
```

### Partitioning is the answer

```sql
CREATE TABLE logs (...) PARTITION BY RANGE (created_at);

CREATE TABLE logs_2026_05 PARTITION OF logs
  FOR VALUES FROM ('2026-05-01') TO ('2026-06-01');

-- New partition every month.
-- Old one = DROP TABLE (near-instant).
```

→ The answer for big tables. → [[DB_Partitioning_Patterns]].

## 🤔 Decision criteria

| Situation | Recommendation |
|---|---|
| General | Autovacuum + tuning |
| Big table, heavy updates | Aggressive per-table settings |
| Accumulated bloat | pg_repack |
| Maintenance window | VACUUM FULL |
| Long history | Partition + drop |
| Wraparound | Autovacuum freeze (`VACUUM FREEZE` if XID age is already high) |
| Primary bloat from long replica queries | hot_standby_feedback off |
| Index bloat | REINDEX CONCURRENTLY |

## ❌ Anti-patterns

- **VACUUM FULL on production**: full table lock.
- **Autovacuum off**: bloat explosion + wraparound.
- **Long idle transactions**: vacuum cannot clean up.
- **hot_standby_feedback on for replicas with long queries**: bloat on the primary.
- **No partitioning on log tables**: large deletes are costly.
- **Heavy updates without fillfactor**: HOT updates break down.
- **No monitoring**: problems stay silent.

## 🤖 LLM usage hints

- Autovacuum tuning is the first lever.
- pg_repack is the production-friendly way to rebuild.
- Long transactions are the hidden cause.
- Partition + drop is the answer for large data.

## 🔗 Related documents

- [[DB_Postgres_EXPLAIN]]
- [[DB_Lock_Analysis]]
- [[DB_Replica_Operations]]