id: cs-wal-write-ahead-log
title: WAL (Write-Ahead Log) - Durability / Recovery
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: cs, wal, database, durability, vibe-coding
tech_stack: language = Concept; applicable_to = Database
applied_in:
aliases: WAL, write-ahead log, journal, redo log, transaction log, checkpoint

WAL (Write-Ahead Log)

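The foundation for crash recovery and replication. Changes are first written to a log on disk and only then applied to the data files. This is what provides the Durability guarantee in ACID. Used by Postgres, MySQL InnoDB, and SQLite (WAL mode).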

📖 Core Concepts

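  • WAL: an append-only log of changes.
  • Commit: a transaction is durable once its WAL records are flushed to disk.
  • Checkpoint: writes the logged changes into the data files so that older log can be recycled.
  • Recovery: replays the WAL after a crash.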
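
The four ideas above can be sketched with a toy key-value store. This is a minimal illustration and not how any real engine is implemented: the `TinyWAL` class, its JSON record format, and the in-memory dict standing in for the data file are all assumptions made for the example.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy key-value store: log first, apply second (illustrative only)."""

    def __init__(self, path):
        self.path = path
        self.data = {}                      # stands in for the "data file"
        self.log = open(path, "a", encoding="utf-8")

    def put(self, key, value):
        record = json.dumps({"key": key, "value": value})
        self.log.write(record + "\n")       # 1. append the change to the log
        self.log.flush()
        os.fsync(self.log.fileno())         # 2. force it to disk: now durable
        self.data[key] = value              # 3. only then apply the change

    def close(self):
        self.log.close()

    @classmethod
    def recover(cls, path):
        """Rebuild the in-memory state by replaying the log from the start."""
        store = cls(path)
        with open(path, encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                store.data[rec["key"]] = rec["value"]
        return store


def demo():
    path = os.path.join(tempfile.mkdtemp(), "tiny.wal")
    store = TinyWAL(path)
    store.put("a", 1)
    store.put("a", 2)
    store.put("b", 3)
    store.close()                           # simulate a crash: memory is lost
    recovered = TinyWAL.recover(path)
    state = dict(recovered.data)
    recovered.close()
    return state
```

Because this sketch has no checkpoint, recovery always replays the whole log; a checkpoint is what lets a real system start the replay later and discard older log.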

💻 Code Patterns

Postgres WAL

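1. A transaction modifies pages in the buffer cache (in memory, fast).
2. A WAL record describing the change is created and placed in the WAL buffer.
3. At commit, the WAL is flushed to disk with fsync; the transaction is now durable.
4. Background processes write the dirty buffers to the data files later (lazily).
5. A checkpoint flushes all dirty buffers and lets older WAL segments be recycled or removed.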

Recovery

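After a crash:
1. Start from the last checkpoint.
2. Replay the WAL from that point (REDO).
3. Uncommitted transactions are treated as aborted. Postgres needs no UNDO pass because MVCC hides their row versions; engines that update in place roll them back with UNDO.
4. The database is back in a consistent state.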
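
A rough sketch of these recovery steps, under simplifying assumptions: the record layout `(lsn, txid, kind, key, value)` is invented for the example, the snapshot is assumed to already contain everything up to `checkpoint_lsn`, and uncommitted changes are simply skipped rather than redone and then undone.

```python
def recover(wal, checkpoint_lsn, snapshot):
    """Replay WAL records after checkpoint_lsn onto a copy of snapshot.

    wal: list of (lsn, txid, kind, key, value); kind is "set" or "commit".
    Only changes from transactions that have a commit record are applied.
    """
    committed = {txid for _, txid, kind, _, _ in wal if kind == "commit"}
    state = dict(snapshot)
    for lsn, txid, kind, key, value in wal:
        if lsn <= checkpoint_lsn:
            continue                      # already reflected in the snapshot
        if kind == "set" and txid in committed:
            state[key] = value            # REDO a committed change
    return state


def demo():
    wal = [
        (1, "t1", "set", "x", 1),
        (2, "t1", "commit", None, None),
        (3, "t2", "set", "y", 2),
        (4, "t2", "commit", None, None),
        (5, "t3", "set", "x", 99),        # no commit record: crashed mid-transaction
    ]
    # The checkpoint at LSN 2 already persisted t1's change.
    return recover(wal, checkpoint_lsn=2, snapshot={"x": 1})
```

In `demo()`, the change from `t2` is redone while the one from `t3` is ignored, so the recovered state contains only committed work.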

Postgres configuration

# postgresql.conf
wal_level = replica       # minimal / replica / logical
synchronous_commit = on   # commit waits until the WAL is flushed to disk
fsync = on
full_page_writes = on
checkpoint_timeout = 5min
max_wal_size = 1GB
min_wal_size = 80MB

Foundation for replication

Streaming replication = a stream of WAL.
The primary sends WAL to the standby.
The standby replays the WAL.

→ A hot standby can serve read queries.

DB_Read_Replica_Patterns / DB_Replica_Operations.

Logical decoding

SELECT * FROM pg_create_logical_replication_slot('my_slot', 'pgoutput');

-- Consumed by another system (e.g. Debezium)
-- WAL is decoded into row-level changes (INSERT / UPDATE / DELETE)

→ The foundation for CDC. See DB_Change_Data_Capture.

MySQL InnoDB redo log

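innodb_log_file_size = 1G
# 1 = write + fsync at every commit; 0 = write + flush about once per second;
# 2 = write at every commit, flush about once per second
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1                       # fsync the binlog at every commit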

SQLite WAL mode

PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
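
As a small illustration, the same pragmas can be applied from Python's built-in `sqlite3` module. The table name `kv` and the helper names are made up for the example; the `-wal` suffix is the file SQLite creates next to the database in WAL mode.

```python
import os
import sqlite3
import tempfile


def open_wal_db(path):
    """Open a SQLite database and switch it to WAL mode."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
    conn.execute("PRAGMA synchronous = NORMAL")
    return conn, mode


def demo():
    path = os.path.join(tempfile.mkdtemp(), "app.db")
    conn, mode = open_wal_db(path)
    conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
    conn.execute("INSERT INTO kv VALUES ('a', '1')")
    conn.commit()
    wal_exists = os.path.exists(path + "-wal")   # committed data sits in the WAL file
    # Checkpoint: copy WAL content into the main database file.
    conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
    conn.close()
    return mode, wal_exists
```

Checking the returned mode is worthwhile because SQLite silently keeps another journal mode when WAL is not available, for example for in-memory databases.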

DB_SQLite_Patterns.

Performance tradeoff

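Synchronous commit:
  on:            Durable. Higher latency (one WAL flush per commit).
  off:           Fast. A crash can lose the last < 1 s of commits (without corrupting the database).
  remote_write:  Also waits until the synchronous standby has received the WAL and written it to its OS (not yet flushed).
  remote_apply:  Also waits until the synchronous standby has replayed the WAL. Strongest guarantee, highest latency.

→ Usually 'on'. 'off' is acceptable for large, non-critical batches.

-- Per-transaction
BEGIN;
SET LOCAL synchronous_commit = off;  -- reverts automatically when the transaction ends
INSERT INTO log VALUES (...);
COMMIT;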

Group commit

Several commits share a single fsync.
This happens automatically when many transactions commit concurrently.
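
The amortization behind group commit can be sketched as follows. This is a single-threaded toy with hypothetical names (`GroupCommitLog`, `commit`, `flush`); a real database coordinates concurrent sessions that wait on a shared flush, which is not modeled here.

```python
import os
import tempfile


class GroupCommitLog:
    """Queue commit records and make a whole batch durable with one fsync."""

    def __init__(self, path):
        self.f = open(path, "ab")
        self.pending = []
        self.fsync_count = 0

    def commit(self, record: bytes):
        """Queue a commit record; it is not durable until flush() returns."""
        self.pending.append(record)

    def flush(self):
        """Write all queued records, fsync once, return how many were acked."""
        if not self.pending:
            return 0
        self.f.write(b"".join(r + b"\n" for r in self.pending))
        self.f.flush()
        os.fsync(self.f.fileno())
        self.fsync_count += 1
        acked = len(self.pending)
        self.pending.clear()
        return acked


def demo():
    path = os.path.join(tempfile.mkdtemp(), "group.wal")
    log = GroupCommitLog(path)
    for i in range(100):
        log.commit(f"tx-{i}".encode())
    acked = log.flush()
    log.f.close()
    return acked, log.fsync_count
```

Here 100 commits cost one fsync instead of 100, which is why throughput can grow well beyond one commit per fsync latency.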

Checkpoint tuning

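checkpoint_timeout = 15min         # shorter = less WAL to replay, but more frequent I/O
checkpoint_completion_target = 0.9 # spread checkpoint writes over the interval to smooth I/O
max_wal_size = 4GB                 # larger = fewer WAL-triggered checkpoints

→ Longer checkpoint intervals mean longer crash recovery.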

Archive log (PITR)

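wal_level = replica
archive_mode = on
archive_command = 'aws s3 cp %p s3://wal-archive/%f'
# Recovery to a specific point in time
# (PostgreSQL 12+: set these in postgresql.conf and create recovery.signal in the data directory)
restore_command = 'aws s3 cp s3://wal-archive/%f %p'
recovery_target_time = '2026-05-09 14:00:00'

→ Point-in-time recovery.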

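WAL size monitoring

-- Total WAL generated since the cluster was initialized (not the current size of pg_wal)
SELECT pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), '0/0')) AS total_wal;

-- WAL retained by each replication slot (restart_lsn is what holds WAL back)
SELECT slot_name, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;

→ An inactive slot makes WAL accumulate without bound until the disk fills up.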
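
For alerting outside the database, the LSN arithmetic can be reproduced client-side. A minimal sketch, assuming slot rows are fetched as `(slot_name, active, restart_lsn)` tuples from `pg_replication_slots`; the helper names and the threshold are hypothetical. An LSN such as `16/B374D848` is a 64-bit byte position written as two hexadecimal halves.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN ('hi/lo' in hex) to a 64-bit byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lsn_diff(a: str, b: str) -> int:
    """Byte distance between two LSNs, like pg_wal_lsn_diff(a, b)."""
    return lsn_to_int(a) - lsn_to_int(b)


def slots_over_limit(slots, current_lsn, limit_bytes):
    """Return (slot_name, retained_bytes) for inactive slots above the limit.

    slots: iterable of (slot_name, active, restart_lsn) tuples.
    """
    flagged = []
    for name, active, restart_lsn in slots:
        retained = lsn_diff(current_lsn, restart_lsn)
        if not active and retained > limit_bytes:
            flagged.append((name, retained))
    return flagged


def demo():
    slots = [
        ("old_slot", False, "0/1000000"),   # inactive and far behind
        ("live", True, "0/1000000"),        # far behind but still active
        ("recent", False, "2/FF000000"),    # inactive but only 16 MiB behind
    ]
    return slots_over_limit(slots, current_lsn="3/0", limit_bytes=1024**3)
```

Only `old_slot` is flagged in `demo()`; whether active-but-lagging slots should also alert is a policy choice left out of the sketch.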

Inactive slots (dangerous)

-- Drop a slot that is no longer used
SELECT pg_drop_replication_slot('unused_slot');

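Application impact

1. Commit latency is roughly the fsync time (~1 ms on SSD, ~10 ms on HDD).
2. Long-running write transactions generate a lot of WAL, which increases replica lag.
3. Bulk inserts: COPY with few commits is fast.
4. synchronous_commit = off can lose the last < 1 s of commits (not acceptable for banking, fine for logs).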

COPY (bulk)

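COPY orders FROM '/data.csv' CSV;
-- Every row is still WAL-logged, but the per-row overhead is small
-- A single commit also means a single fsync

// Client-side equivalent with pg-copy-streams (Node.js)
import fs from 'node:fs';
import { pipeline } from 'node:stream/promises';
import { from as copyFrom } from 'pg-copy-streams';
// `client` is an already-connected pg Client
const stream = client.query(copyFrom('COPY orders (col1, col2) FROM STDIN CSV'));
await pipeline(fs.createReadStream('data.csv'), stream);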

Unlogged tables (no WAL)

CREATE UNLOGGED TABLE temp_data (...);
-- Not written to the WAL, so writes are fast
-- Truncated automatically after a crash; not replicated to standbys

→ For temporary, cache, or scratch tables.

Other storage engines

Postgres (WAL) / MySQL InnoDB (redo log) / SQL Server (transaction log).
Cassandra: commit log.
RocksDB: WAL + memtable.
SQLite: rollback journal or WAL.
File systems: ext4 journaling, ZFS intent log (ZIL). Same idea.

App-level WAL patterns (custom)

Event sourcing = an app-level WAL.
Outbox = a transactional log.

→ The same idea at a different layer.
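
A minimal outbox sketch, using an in-memory SQLite database to stand in for the application database. The `orders` and `outbox` tables, the payload shape, and the `publish` callback are assumptions for illustration; a production relay would also need retries and idempotent consumers.

```python
import json
import sqlite3


def place_order(conn, order_id, amount):
    """Write the outbox message and the business row in one transaction."""
    with conn:  # commits on success, rolls back both inserts on error
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"type": "order_placed", "id": order_id}),),
        )
        conn.execute(
            "INSERT INTO orders (id, amount) VALUES (?, ?)", (order_id, amount)
        )


def drain_outbox(conn, publish):
    """Publish pending messages in order, deleting each one after it is sent."""
    rows = conn.execute("SELECT seq, payload FROM outbox ORDER BY seq").fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))
        with conn:
            conn.execute("DELETE FROM outbox WHERE seq = ?", (seq,))
    return len(rows)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.execute(
        "CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)"
    )
    place_order(conn, 1, 500)
    try:
        place_order(conn, 1, 700)     # duplicate id: the outbox row rolls back too
    except sqlite3.IntegrityError:
        pass
    sent = []
    drain_outbox(conn, sent.append)
    return sent
```

The failed second order leaves no message behind, which is the property the outbox borrows from the database's own log: the message exists if and only if the transaction committed.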

Backend_Event_Sourcing / Backend_Outbox_Pattern.

Cost of fsync

HDD seek: ~10ms
SSD: ~0.1-1ms
Network FS / EBS: ~1-5ms (variable)

Group commit plus WAL batching sustains hundreds of commits per second.
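
The figures above vary a lot by hardware, so it can help to measure. A rough sketch: `measure_fsync` times `os.fsync` on a temporary file (which may live on a different device than the database, or on tmpfs where fsync is nearly free), and `max_commits_per_sec` is plain arithmetic with a hypothetical batch-size parameter.

```python
import os
import tempfile
import time


def measure_fsync(n=20, payload=b"x" * 512):
    """Return the median time in seconds of n write+fsync calls on a temp file."""
    fd, path = tempfile.mkstemp()
    try:
        samples = []
        for _ in range(n):
            os.write(fd, payload)
            t0 = time.perf_counter()
            os.fsync(fd)
            samples.append(time.perf_counter() - t0)
    finally:
        os.close(fd)
        os.remove(path)
    samples.sort()
    return samples[len(samples) // 2]


def max_commits_per_sec(fsync_seconds, commits_per_fsync=1):
    """Upper bound on commit rate when every fsync covers a batch of commits."""
    return commits_per_fsync / fsync_seconds
```

For example, `max_commits_per_sec(0.010)` gives 100 commits per second for a 10 ms fsync with no batching, while batching 50 commits per fsync at 1 ms raises the bound to about 50,000.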

Crash recovery time

More WAL to replay = longer recovery.
Frequent checkpoints = shorter recovery (less WAL), at the cost of more I/O.

→ Trade-off.
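
The trade-off can be made concrete with back-of-the-envelope arithmetic. Everything here is an assumption for illustration: the WAL generation rate, the replay rate, and the simplification that recovery replays about one checkpoint interval of WAL.

```python
MIB = 1024 * 1024


def worst_case_replay_bytes(wal_bytes_per_sec, checkpoint_interval_sec):
    """Approximate WAL written between two checkpoints."""
    return wal_bytes_per_sec * checkpoint_interval_sec


def estimated_recovery_sec(replay_bytes, replay_bytes_per_sec):
    """Approximate time to replay that much WAL."""
    return replay_bytes / replay_bytes_per_sec


def compare(wal_rate=2 * MIB, replay_rate=50 * MIB):
    """Estimated recovery seconds for 5 min and 15 min checkpoint intervals."""
    result = {}
    for minutes in (5, 15):
        wal = worst_case_replay_bytes(wal_rate, minutes * 60)
        result[minutes] = estimated_recovery_sec(wal, replay_rate)
    return result
```

With these made-up rates, a 5 minute interval gives roughly 12 s of replay and a 15 minute interval roughly 36 s; real recovery time also depends on full-page writes and random I/O during replay.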

Backup + WAL

# pg_basebackup
pg_basebackup -D /backup -Ft -X stream -P

# Or: file-system snapshot + WAL archive
# Recovery: restore the snapshot, then replay the WAL

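🤔 Decision Criteria

| Requirement          | Setting                                      |
|----------------------|----------------------------------------------|
| Strict durability    | synchronous_commit = on                      |
| Fast bulk insert     | unlogged table or synchronous_commit = off   |
| Replication          | WAL archive + slot                           |
| PITR                 | WAL archive                                  |
| Edge / embedded      | SQLite WAL                                   |
| Very high throughput | Group commit + checkpoint tuning             |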

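Anti-patterns

  • fsync = off in production: durability is broken.
  • Never dropping unused replication slots: WAL accumulates without bound.
  • Checkpointing too often (e.g. every 1 min): I/O spikes.
  • Long transactions: huge WAL volume.
  • WAL archive without a base backup: WAL alone cannot restore the database.
  • Asynchronous commit for critical data: data loss.
  • HDD in production with synchronous commit: high commit latency.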

🤖 Hints for LLM Use

  • WAL = the basis of ACID Durability, replication, and recovery.
  • Postgres: WAL archive plus regular backups.
  • synchronous_commit = on (the default).
  • Monitor replication slots and drop unused ones.

🔗 Related Documents