---
id: cs-wal-write-ahead-log
title: WAL (Write-Ahead Log) - Durability / Recovery
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [cs, wal, database, durability, vibe-coding]
tech_stack: { language: "Concept", applicable_to: ["Database"] }
applied_in: []
aliases: [WAL, write-ahead log, journal, redo log, transaction log, checkpoint]
---

# WAL (Write-Ahead Log)

> The foundation of crash recovery and replication. **Write the change to an on-disk log first, then apply it.** This is what guarantees the Durability in ACID. Used by Postgres, MySQL InnoDB, and SQLite (WAL mode).

## 📖 Core concepts
- WAL: an append-only log of changes.
- Commit: a transaction is durable once its WAL records are flushed to disk.
- Checkpoint: writes logged changes into the data files so that older WAL can be retired.
- Recovery: after a crash, the WAL is replayed.

## 💻 Code patterns

### Postgres WAL
```
1. A transaction makes a change, which is applied in the buffer cache (fast).
2. A WAL record is created and placed in the WAL buffer.
3. On commit, the WAL is fsynced (durable).
4. Background processes write dirty buffers to the data files (lazily).
5. A checkpoint flushes all changes and retires part of the WAL.
```

### Recovery
```
After a crash:
1. Start from the last checkpoint.
2. Replay the WAL (REDO).
3. Uncommitted transactions count as aborted (ARIES-style engines such as InnoDB run an UNDO pass; Postgres relies on MVCC visibility and needs none).
4. The database is back in a consistent state.
```

### Postgres settings
```ini
# postgresql.conf
wal_level = replica              # minimal / replica / logical
synchronous_commit = on          # commit waits for the WAL fsync
fsync = on
full_page_writes = on
checkpoint_timeout = 5min
max_wal_size = 1GB
min_wal_size = 80MB
```

### Basis for replication
```
Streaming replication = a stream of WAL.
The primary sends WAL to the standby.
The standby replays the WAL.
→ A hot standby can serve reads.
```
→ [[DB_Read_Replica_Patterns]] / [[DB_Replica_Operations]].

### Logical decoding
```sql
SELECT * FROM pg_create_logical_replication_slot('my_slot', 'pgoutput');

-- Subscribe from another system (e.g. Debezium)
-- WAL → row-level changes (INSERT / UPDATE / DELETE)
```
→ Basis for CDC. [[DB_Change_Data_Capture]].
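
The write path and recovery steps described in the Postgres WAL and Recovery sections can be sketched as a toy in-memory model. This is a minimal illustration under loud assumptions, not how Postgres is implemented: `ToyDb`, `WalRecord`, and every method name are hypothetical, arrays stand in for files, moving records out of the buffer stands in for fsync, and the checkpoint retires records per transaction rather than by log position.

```ts
// Toy model only: all names are hypothetical, and real engines log
// page-level changes to disk rather than key/value pairs in arrays.
type SetRecord = { lsn: number; txid: number; kind: "set"; key: string; value: string };
type CommitRecord = { lsn: number; txid: number; kind: "commit" };
type WalRecord = SetRecord | CommitRecord;

class ToyDb {
  // Durable state (survives a crash): flushed WAL and the data file.
  private wal: WalRecord[] = [];
  private dataFile = new Map<string, string>();
  private nextLsn = 1;
  // Volatile state (lost on a crash): WAL buffer and buffer cache.
  private walBuffer: WalRecord[] = [];
  private cache = new Map<string, string>();

  set(txid: number, key: string, value: string): void {
    // Log first: the change exists as a WAL record before it is applied.
    this.walBuffer.push({ lsn: this.nextLsn++, txid, kind: "set", key, value });
  }

  commit(txid: number): void {
    this.walBuffer.push({ lsn: this.nextLsn++, txid, kind: "commit" });
    // Stand-in for fsync: once the records are in `wal`, the commit is durable.
    this.wal.push(...this.walBuffer);
    this.walBuffer = [];
    // Only now make the transaction's changes visible in the cache.
    for (const r of this.wal) {
      if (r.kind === "set" && r.txid === txid) this.cache.set(r.key, r.value);
    }
  }

  checkpoint(): void {
    // Write all committed changes to the data file...
    this.dataFile = new Map(this.cache);
    // ...then retire the WAL of committed transactions; recovery no longer needs it.
    const committed = new Set(this.wal.filter((r) => r.kind === "commit").map((r) => r.txid));
    this.wal = this.wal.filter((r) => !committed.has(r.txid));
  }

  crash(): void {
    this.walBuffer = [];
    this.cache = new Map();
  }

  recover(): void {
    // Start from the last checkpoint image, then REDO committed work in commit order.
    this.cache = new Map(this.dataFile);
    const pending = new Map<number, SetRecord[]>();
    for (const rec of this.wal) {
      if (rec.kind === "set") {
        const list = pending.get(rec.txid) ?? [];
        list.push(rec);
        pending.set(rec.txid, list);
      } else {
        for (const r of pending.get(rec.txid) ?? []) this.cache.set(r.key, r.value);
        pending.delete(rec.txid);
      }
    }
    // Whatever is still pending never committed, so it is simply dropped.
  }

  get(key: string): string | undefined {
    return this.cache.get(key);
  }
}

const db = new ToyDb();
db.set(1, "balance", "100");
db.commit(1);                 // flushed, so it survives the crash
db.set(2, "balance", "999");  // never committed
db.crash();
db.recover();
console.log(db.get("balance")); // "100": the write from txid 2 is gone
```

The point of the sketch is the ordering: a change becomes durable through the log flush at commit, the data file is only brought up to date at checkpoint time, and recovery rebuilds the cache from the data file plus the retained log.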

### MySQL InnoDB redo log
```ini
innodb_log_file_size = 1G              # MySQL 8.0.30+ uses innodb_redo_log_capacity instead
innodb_flush_log_at_trx_commit = 1     # 1 = write + fsync per commit; 0 = write + fsync about once per second; 2 = write per commit, fsync about once per second
sync_binlog = 1                        # fsync the binlog on every commit
```

### SQLite WAL mode
```sql
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
```
→ [[DB_SQLite_Patterns]].

### Performance tradeoff
```
synchronous_commit:
  on:           Durable. Higher latency (fsync per commit).
  off:          Fast. Commits from the last < 1 second can be lost on a crash.
  remote_apply: Waits until the replica has also applied the change. Highest durability.
  remote_write: Waits until the replica has received the WAL.
→ Usually 'on'. 'off' is acceptable for large batches of non-critical data.
```

```sql
-- Per-transaction: SET LOCAL reverts automatically at COMMIT / ROLLBACK
BEGIN;
SET LOCAL synchronous_commit = off;
INSERT INTO log VALUES (...);
COMMIT;
```

### Group commit
```
Several commits share a single fsync.
Happens automatically under high throughput.
```

### Checkpoint tuning
```ini
checkpoint_timeout = 15min            # shorter = less WAL to replay, but more frequent IO
checkpoint_completion_target = 0.9    # spread checkpoint writes out to smooth IO
max_wal_size = 4GB                    # larger = fewer checkpoints
```
→ Long checkpoint interval = long recovery.

### Archive log (PITR)
```ini
wal_level = replica
archive_mode = on
archive_command = 'aws s3 cp %p s3://wal-archive/%f'
```

```ini
# Recovery to a specific point in time
# (PostgreSQL 12+: set these in postgresql.conf and create recovery.signal in the data directory)
restore_command = 'aws s3 cp s3://wal-archive/%f %p'
recovery_target_time = '2026-05-09 14:00:00'
```
→ Point-in-time recovery.

### Monitoring WAL size
```sql
-- Total WAL generated since cluster initialization (not the current on-disk size)
SELECT pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), '0/0')) AS total_wal;

-- WAL retained by each replication slot (restart_lsn works for physical and logical slots)
SELECT slot_name, active,
  pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```
→ An inactive slot lets WAL accumulate without bound = disk full.

### Inactive slot (dangerous)
```sql
-- Drop a stale slot
SELECT pg_drop_replication_slot('unused_slot');
```

### Impact on the application
```
1. Commit latency ≈ fsync time (~1ms SSD, ~10ms HDD).
2. Long-running write transaction = a lot of WAL, which shows up as replica lag.
3. Bulk insert = COPY + few commits = fast.
4. synchronous_commit = off can lose the last < 1s of commits (not for banking; fine for logs).
```

### COPY (bulk)
```sql
COPY orders FROM '/data.csv' CSV;
-- Every row is still WAL-logged, but the per-row overhead is small
-- One commit also means a single fsync
```

```ts
import fs from 'node:fs';
import { from as copyFrom } from 'pg-copy-streams';

// `client` is an already connected pg Client
const stream = client.query(copyFrom('COPY orders (col1, col2) FROM STDIN CSV'));
fs.createReadStream('data.csv').pipe(stream);
```

### Unlogged tables (no WAL)
```sql
CREATE UNLOGGED TABLE temp_data (...);
-- No WAL, so writes are faster
-- Truncated after a crash
```
→ For temporary / cache / scratch tables.

### Other storage engines
```
Postgres / MySQL InnoDB / SQL Server: redo log + WAL.
Cassandra: commit log.
RocksDB: WAL + memtable.
SQLite: rollback journal or WAL.
File systems: ext4 journaling, ZFS intent log (ZIL). Same idea.
```

### App-level WAL patterns (custom)
```
Event sourcing = app-level WAL.
Outbox = transactional log.
→ Same idea at a different layer.
```
→ [[Backend_Event_Sourcing]] / [[Backend_Outbox_Pattern]].

### Cost of fsync
```
HDD seek:          ~10ms
SSD:               ~0.1-1ms
Network FS / EBS:  ~1-5ms (variable)

Group commit + WAL batching = hundreds of commits / sec is fine.
```

### Crash recovery time
```
Large WAL = long recovery.
Frequent checkpoints = short recovery (less WAL to replay).
→ Trade-off.
```

### Backup + WAL
```bash
# pg_basebackup
pg_basebackup -D /backup -Ft -X stream -P

# Or: file-system snapshot + WAL archive
# Recovery: restore the snapshot, then replay the WAL
```

## 🤔 Decision criteria

| Requirement | Setting |
|---|---|
| Strict durability | synchronous_commit = on |
| Fast bulk insert | unlogged table or synchronous_commit = off |
| Replication | WAL archive + slot |
| PITR | WAL archive |
| Edge / embedded | SQLite WAL |
| Very high throughput | Group commit + checkpoint tuning |

## ❌ Anti-patterns
- **fsync off in prod**: durability is gone, and a crash can corrupt data.
- **Never dropping unused replication slots**: WAL accumulates without bound.
- **Checkpoints too frequent (1min)**: IO explodes.
- **Long transactions**: huge amounts of WAL.
- **WAL archive without a base backup**: WAL alone cannot restore a database.
- **Async commit for critical data**: data loss.
- **HDD in prod + sync commit**: high latency.

## 🤖 LLM hints
- WAL = the basis for ACID Durability, replication, and recovery.
- Postgres: WAL archive + regular backups.
- synchronous_commit = on (default).
- Monitor slots and drop stale ones.
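
As a back-of-the-envelope check on the fsync cost and group commit figures in this note, the arithmetic can be written down directly. This is a rough sketch that assumes fsyncs run strictly one after another and nothing else limits throughput; `commitsPerSecond` and its parameters are hypothetical names, not part of any database API.

```ts
// Rough upper bound: each fsync takes `fsyncMs` and makes `commitsPerFsync`
// commits durable at once (commitsPerFsync > 1 models group commit).
function commitsPerSecond(fsyncMs: number, commitsPerFsync = 1): number {
  if (fsyncMs <= 0 || commitsPerFsync < 1) {
    throw new RangeError("fsyncMs must be > 0 and commitsPerFsync >= 1");
  }
  return (1000 / fsyncMs) * commitsPerFsync;
}

console.log(commitsPerSecond(10));     // 100: HDD, one commit per fsync
console.log(commitsPerSecond(1));      // 1000: SSD, one commit per fsync
console.log(commitsPerSecond(10, 20)); // 2000: HDD, 20 commits sharing each fsync
```

Under these assumptions, sharing each fsync across a batch of commits matters more than the raw device latency, which is why group commit is the usual answer to high commit rates.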

## 🔗 Related docs
- [[DB_Vacuum_Autovacuum]]
- [[DB_Replica_Operations]]
- [[DevOps_Disaster_Recovery]]