---
id: cs-wal-write-ahead-log
title: WAL (Write-Ahead Log) — Durability / Recovery
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [cs, wal, database, durability, vibe-coding]
tech_stack: { language: "Concept", applicable_to: ["Database"] }
applied_in: []
aliases: [WAL, write-ahead log, journal, redo log, transaction log, checkpoint]
---

# WAL (Write-Ahead Log)

> The foundation of crash recovery and replication. **Log every change to disk first, then apply it.** This is what backs the Durability guarantee in ACID. Used by Postgres, MySQL InnoDB, and SQLite (WAL mode).

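## 📖 Key concepts
- WAL: an append-only log of changes.
- Commit: a transaction is durable once its WAL records are flushed to disk.
- Checkpoint: changes already covered by the log are written to the data files, so older WAL can be recycled.
- Recovery: after a crash, the WAL is replayed from the last checkpoint.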

## 💻 Code patterns

### Postgres WAL
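```
1. A transaction changes a page in the buffer cache (fast, in memory).
2. A WAL record describing the change goes into the WAL buffer.
3. On commit, the WAL is flushed and fsynced (the commit is now durable).
4. Background processes write dirty buffers to the data files later (lazy).
5. A checkpoint flushes all dirty buffers, after which older WAL segments can be recycled or removed.
```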
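
The ordering above can be illustrated with a toy key-value store. This is a minimal sketch, not how Postgres is implemented: `TinyWAL`, the JSON record format, and the file name are hypothetical, and it collapses "WAL buffer" and "commit" into a single `put` call. The point it shows is only the rule that the log record is on disk before the change is applied.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy key-value store (hypothetical): log first, apply second."""

    def __init__(self, path):
        self.path = path
        self.state = {}                  # stands in for the buffer cache / data files
        self.log = open(path, "ab")      # append-only log file

    def put(self, key, value):
        record = json.dumps({"key": key, "value": value}).encode() + b"\n"
        self.log.write(record)           # 1. append the WAL record
        self.log.flush()                 # 2. push Python's buffer to the OS
        os.fsync(self.log.fileno())      # 3. ask the OS to persist it to disk
        self.state[key] = value          # 4. only now apply the change

    def close(self):
        self.log.close()


path = os.path.join(tempfile.mkdtemp(), "toy.wal")
store = TinyWAL(path)
store.put("a", 1)
store.put("b", 2)
store.close()

# The log on disk contains every change, in order.
with open(path, "rb") as f:
    logged = [json.loads(line) for line in f]
```

If the process dies between steps 3 and 4, the change is still in the log and can be reapplied on restart.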

### Recovery
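```
After a crash:
1. Start from the redo point of the last checkpoint.
2. Replay the WAL forward (REDO).
3. Uncommitted transactions are discarded. ARIES-style engines (InnoDB, SQL Server) run an UNDO pass;
   Postgres has no UNDO pass, since rows from uncommitted transactions simply stay invisible.
4. The database is back in a consistent state.
```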
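
The recovery steps can be sketched as a replay function over a toy log. The record format (`tx`, `op`, `key`, `value`) is an assumption made for illustration, and this sketch takes the simplest route of skipping uncommitted transactions during REDO rather than running a separate UNDO pass. It also stops at the first unreadable record, which is how a torn write at the tail of the log is typically handled.

```python
import json


def replay(wal_lines):
    """Rebuild state from toy WAL records, applying only committed transactions."""
    committed = set()
    records = []
    for line in wal_lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            break                         # torn tail write: stop at the first bad record
        records.append(rec)
        if rec["op"] == "commit":
            committed.add(rec["tx"])

    state = {}
    for rec in records:
        if rec["op"] == "put" and rec["tx"] in committed:
            state[rec["key"]] = rec["value"]   # REDO a committed change
    return state


wal = [
    '{"tx": 1, "op": "put", "key": "a", "value": 1}',
    '{"tx": 1, "op": "commit"}',
    '{"tx": 2, "op": "put", "key": "b", "value": 2}',   # tx 2 never committed
    '{"tx": 3, "op": "put", "key": "a", "va',            # record cut off by the crash
]
recovered = replay(wal)
```

Only transaction 1 survives: transaction 2 has no commit record, and the truncated record is ignored.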

### Postgres settings
```ini
# postgresql.conf
wal_level = replica       # minimal / replica / logical
synchronous_commit = on   # commit waits until the WAL is flushed to disk
fsync = on                # never turn off in production
full_page_writes = on     # protects against torn pages after a crash
checkpoint_timeout = 5min
max_wal_size = 1GB
min_wal_size = 80MB
```

### Replication builds on WAL
```
Streaming replication = shipping the WAL stream.
The primary sends WAL to the standby.
The standby replays the WAL.

→ A hot standby can serve read-only queries.
```

→ [[DB_Read_Replica_Patterns]] / [[DB_Replica_Operations]].

### Logical decoding
```sql
-- Requires wal_level = logical
SELECT * FROM pg_create_logical_replication_slot('my_slot', 'pgoutput');

-- A consumer on another system (e.g. Debezium) subscribes to the slot.
-- WAL is decoded into row-level changes (INSERT / UPDATE / DELETE).
```

→ The basis of CDC. [[DB_Change_Data_Capture]].

### MySQL InnoDB redo log
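```ini
innodb_log_file_size = 1G             # MySQL 8.0.30+: use innodb_redo_log_capacity instead
# 1 = write + fsync at every commit, 0 = write + fsync about once per second,
# 2 = write at every commit, fsync about once per second
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1                       # fsync the binlog at every commit
```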

### SQLite WAL mode
```sql
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
```

→ [[DB_SQLite_Patterns]].
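
A minimal sketch of applying the same two pragmas from Python's standard `sqlite3` module. The database path and table are hypothetical. WAL mode needs a file-backed database, so a temporary file is used instead of `:memory:`.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file; in-memory databases cannot use WAL mode.
db_path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(db_path)

# PRAGMA journal_mode returns the mode that is actually in effect.
mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
conn.execute("PRAGMA synchronous = NORMAL")
sync_level = conn.execute("PRAGMA synchronous").fetchone()[0]   # 1 means NORMAL

conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
with conn:                               # commits on success, rolls back on error
    conn.execute("INSERT INTO t (v) VALUES ('hello')")

rows = conn.execute("SELECT v FROM t").fetchall()
conn.close()
```

The journal mode is persistent for the database file, so it only needs to be set once; `synchronous` is per connection and has to be set each time a connection is opened.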

### Performance tradeoff
```
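synchronous_commit:
  on:           Durable. Higher latency (waits for the WAL flush on every commit).
  off:          Fast. A crash can lose the most recent commits (under 1 s with default settings), but never corrupts data.
  remote_write: With a synchronous standby, also waits until the standby has received the WAL and written it to its OS.
  remote_apply: With a synchronous standby, also waits until the standby has replayed the WAL. Strongest guarantee, highest latency.

→ Usually 'on'. 'off' is acceptable for large batches of non-critical data.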
```

```sql
-- Per-transaction: SET LOCAL reverts automatically at COMMIT / ROLLBACK
BEGIN;
SET LOCAL synchronous_commit = off;
INSERT INTO log VALUES (...);
COMMIT;
```

### Group commit
```
Several concurrent commits share a single fsync.
Happens automatically when many sessions commit at the same time.
```
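
The idea can be sketched with a toy log that buffers commit records and makes the whole batch durable with one `fsync`. `GroupCommitLog` and its record format are hypothetical, and the sketch is single-threaded: in a real database each committing session blocks until the shared flush has finished.

```python
import os
import tempfile


class GroupCommitLog:
    """Toy log (hypothetical): many commit records, one fsync per batch."""

    def __init__(self, path):
        self.f = open(path, "ab")
        self.pending = []            # commit records waiting for the next flush
        self.fsync_calls = 0

    def commit(self, record: bytes):
        self.pending.append(record)

    def flush(self):
        """Write every pending commit, then fsync once for all of them."""
        if not self.pending:
            return 0
        self.f.write(b"".join(r + b"\n" for r in self.pending))
        self.f.flush()
        os.fsync(self.f.fileno())
        self.fsync_calls += 1
        flushed = len(self.pending)
        self.pending.clear()
        return flushed

    def close(self):
        self.f.close()


log = GroupCommitLog(os.path.join(tempfile.mkdtemp(), "group.wal"))
for i in range(50):
    log.commit(f"tx-{i} commit".encode())
flushed = log.flush()                # 50 commits become durable with a single fsync
log.close()
```

The cost of one fsync is shared by the whole batch, which is why throughput can rise well above `1 / fsync latency` under concurrent load.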

### Checkpoint tuning
```ini
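checkpoint_timeout = 15min         # shorter = less WAL to replay, but more frequent IO
checkpoint_completion_target = 0.9 # spread checkpoint writes over the interval to smooth IO
max_wal_size = 4GB                 # larger = fewer checkpoints forced by WAL volume
```

→ A longer interval between checkpoints means a longer crash recovery.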

### Archive log (PITR)
```ini
wal_level = replica
archive_mode = on
archive_command = 'aws s3 cp %p s3://wal-archive/%f'
```

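```ini
# Restore to a specific point in time (Postgres 12+):
# set these in postgresql.conf and create an empty recovery.signal file in the data directory
restore_command = 'aws s3 cp s3://wal-archive/%f %p'
recovery_target_time = '2026-05-09 14:00:00'
```

→ Point-in-time recovery (PITR): restore a base backup, then replay archived WAL up to the target time.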

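### Monitoring WAL size
```sql
-- Current size of the pg_wal directory (Postgres 10+).
-- Note: pg_wal_lsn_diff(pg_current_wal_lsn(), '0/0') is the total WAL ever generated, not the current size.
SELECT pg_size_pretty(sum(size)) AS wal_dir_size FROM pg_ls_waldir();

-- WAL retained by each replication slot (restart_lsn is what holds WAL back)
SELECT slot_name, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```

→ An inactive slot makes WAL pile up without limit until the disk is full. Postgres 13+ can cap this with `max_slot_wal_keep_size`.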
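
For alerting outside the database, the LSN arithmetic is easy to reproduce. An LSN is printed as two hexadecimal numbers, the high and low 32 bits of a byte position in the WAL. The sketch below mirrors what `pg_wal_lsn_diff` computes; the helper names, slot names, LSN values, and the 1 GiB threshold are hypothetical.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a Postgres LSN such as '16/B374D848' to a byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lsn_diff(a: str, b: str) -> int:
    """Bytes of WAL between two LSNs (same idea as pg_wal_lsn_diff(a, b))."""
    return parse_lsn(a) - parse_lsn(b)


def slots_over_limit(slots, current_lsn, limit_bytes):
    """Return the names of slots retaining more WAL than limit_bytes."""
    return [
        name
        for name, restart_lsn in slots
        if restart_lsn is not None
        and lsn_diff(current_lsn, restart_lsn) > limit_bytes
    ]


# Hypothetical sample values, shaped like (slot_name, restart_lsn) rows
# from pg_replication_slots.
slots = [("standby_a", "1/0FF00000"), ("old_cdc", "0/10000000"), ("unused", None)]
risky = slots_over_limit(slots, current_lsn="1/10000000", limit_bytes=1 << 30)
```

Here `standby_a` is 1 MiB behind and `old_cdc` holds back 4 GiB, so only `old_cdc` is flagged.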

### Inactive slots (dangerous)
```sql
-- Drop a slot that is no longer used
SELECT pg_drop_replication_slot('unused_slot');
```

### Impact on applications
```
1. Commit latency ≈ fsync time (~1 ms on SSD, ~10 ms on HDD).
2. Large write transactions generate a lot of WAL, which shows up as replica lag.
3. Bulk insert: COPY with few commits is fast.
4. synchronous_commit = off can lose the last < 1 s of commits (not for banking, fine for logs).
```

### COPY (bulk)
```sql
COPY orders FROM '/data.csv' CSV;
-- Every row is still WAL-logged, but the per-row overhead is small.
-- One commit means one WAL flush for the whole load.

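```ts
import fs from 'node:fs';
import { pipeline } from 'node:stream/promises';
import { from as copyFrom } from 'pg-copy-streams';
// `client` is assumed to be an already-connected pg Client
const stream = client.query(copyFrom('COPY orders (col1, col2) FROM STDIN CSV'));
await pipeline(fs.createReadStream('data.csv'), stream); // rejects if either stream errors
```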

### Unlogged tables (no WAL)
```sql
CREATE UNLOGGED TABLE temp_data (...);
-- Not written to WAL, so writes are faster
-- Truncated after a crash, and not replicated to standbys
```

→ Good for temporary data, caches, and scratch tables.

### Other storage engines
```
Postgres: WAL. MySQL InnoDB: redo log. SQL Server: transaction log.
Cassandra: commit log.
RocksDB: WAL + memtable.
SQLite: rollback journal or WAL.
File systems: ext4 journal, ZFS intent log (ZIL). Same idea.
```

### App-level WAL patterns (custom)
```
Event sourcing = an application-level WAL.
Outbox = a transactional log.

→ Same idea, different layer.
```

→ [[Backend_Event_Sourcing]] / [[Backend_Outbox_Pattern]].
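
A minimal sketch of the outbox idea, using SQLite from the standard library so it runs without a server. The `orders` and `outbox` tables, the `place_order` helper, and the event shape are hypothetical. What it shows is that the business row and the event row commit or roll back together, which is the property the pattern borrows from the database's own log.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        event_type TEXT NOT NULL,
        payload TEXT NOT NULL
    );
""")


def place_order(conn, order_id, total, event_type="order_placed"):
    """Write the business row and its event in one transaction."""
    with conn:   # commits both inserts together, or rolls both back
        conn.execute("INSERT INTO orders (id, total) VALUES (?, ?)", (order_id, total))
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            (event_type, json.dumps({"order_id": order_id, "total": total})),
        )


place_order(conn, 1, 500)
try:
    # The outbox insert violates NOT NULL, so the order insert is rolled back too.
    place_order(conn, 2, 700, event_type=None)
except sqlite3.IntegrityError:
    pass

order_ids = [row[0] for row in conn.execute("SELECT id FROM orders ORDER BY id")]
events = conn.execute("SELECT event_type, payload FROM outbox ORDER BY id").fetchall()
```

A separate relay process would then read `outbox` in `id` order and publish each event, much like a replica consuming WAL.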

### The cost of fsync
```
HDD (seek): ~10 ms
SSD: ~0.1-1 ms
Network FS / EBS: ~1-5 ms (variable)

Group commit + WAL batching keep hundreds of commits per second feasible.
```
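
The arithmetic behind these numbers is simple enough to write down. This is a back-of-the-envelope sketch: the function name and the batch size of 20 are assumptions, and it gives an upper bound that ignores everything except fsync latency.

```python
def commits_per_second(fsync_ms: float, commits_per_fsync: int = 1) -> float:
    """Upper bound on commit throughput when each fsync covers a batch of commits."""
    return commits_per_fsync * 1000.0 / fsync_ms


hdd_single = commits_per_second(10)        # one commit per ~10 ms fsync
ssd_single = commits_per_second(1)         # one commit per ~1 ms fsync
hdd_grouped = commits_per_second(10, 20)   # 20 commits share each fsync (assumed batch size)
```

Without batching, a ~10 ms device caps out around 100 commits per second; sharing each fsync across 20 commits raises the bound to about 2000.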

### Crash recovery time
```
More WAL since the last checkpoint = longer recovery.
Frequent checkpoints = shorter recovery (less WAL to replay), but more IO.

→ Trade-off.
```
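
A rough way to reason about the trade-off: the WAL to replay is at most what was written since the last checkpoint, and recovery time is that volume divided by the replay speed. The rates below (5 MB/s of WAL generated, 50 MB/s replayed) are made-up illustrative numbers, not measurements, and the function names are hypothetical.

```python
def wal_to_replay_mb(wal_rate_mb_s: float, seconds_since_checkpoint: float) -> float:
    """Worst-case WAL volume accumulated since the last checkpoint."""
    return wal_rate_mb_s * seconds_since_checkpoint


def recovery_seconds(wal_mb: float, replay_rate_mb_s: float) -> float:
    """Estimated time to replay that WAL after a crash."""
    return wal_mb / replay_rate_mb_s


WAL_RATE = 5.0       # MB/s of WAL generated (assumed)
REPLAY_RATE = 50.0   # MB/s replayed during recovery (assumed)

five_min = recovery_seconds(wal_to_replay_mb(WAL_RATE, 5 * 60), REPLAY_RATE)
fifteen_min = recovery_seconds(wal_to_replay_mb(WAL_RATE, 15 * 60), REPLAY_RATE)
```

Under these assumptions, tripling the checkpoint interval from 5 to 15 minutes triples the worst-case recovery time from about 30 to about 90 seconds.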

### Backup + WAL
```bash
# pg_basebackup
pg_basebackup -D /backup -Ft -X stream -P

# Or: file-system snapshot + WAL archive
# Recovery: restore the snapshot, then replay the WAL
```

## 🤔 Decision criteria
| Requirement | Setting |
|---|---|
| Strict durability | synchronous_commit = on |
| Fast bulk insert | unlogged table, or synchronous_commit = off |
| Replication | streaming replication + replication slot (WAL archive as a fallback) |
| PITR | base backup + WAL archive |
| Edge / embedded | SQLite WAL |
| Very high throughput | group commit + checkpoint tuning |

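## ❌ Anti-patterns
- **fsync = off in production**: breaks durability and can corrupt the database after a crash.
- **Never dropping unused replication slots**: WAL accumulates without limit.
- **Checkpoints too frequent (e.g. every 1 min)**: IO spikes.
- **Long write transactions**: huge WAL volume.
- **WAL archive without a base backup**: WAL alone cannot restore a database.
- **Async commit for critical data**: data loss.
- **HDD in production with synchronous commit**: high commit latency.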

## 🤖 LLM usage hints
- WAL = the basis of ACID Durability, replication, and recovery.
- Postgres: WAL archive + regular base backups.
- synchronous_commit = on (the default).
- Monitor replication slots and drop unused ones.

## 🔗 Related documents
- [[DB_Vacuum_Autovacuum]]
- [[DB_Replica_Operations]]
- [[DevOps_Disaster_Recovery]]