| id | title | category | status | source_trust_level | verification_status | created_at | updated_at | tags | tech_stack | applied_in | aliases |
|---|---|---|---|---|---|---|---|---|---|---|---|
| cs-wal-write-ahead-log | WAL (Write-Ahead Log) - Durability / Recovery | Coding | draft | B | conceptual | 2026-05-09 | 2026-05-09 | | | | |
WAL (Write-Ahead Log)
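The foundation for crash recovery and replication. Changes are written to a log on disk first and applied to the data files afterwards, which is how the Durability in ACID is guaranteed. Used by Postgres, MySQL InnoDB, and SQLite (WAL mode).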
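📖 Core concepts
- WAL: an append-only log of changes.
- Commit: a transaction is durable once its WAL records are flushed to disk.
- Checkpoint: changes covered by the log are written to the data files.
- Recovery: after a crash, the WAL is replayed.
💻 Code patterns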
Postgres WAL
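1. A transaction makes a change, which is applied in the buffer cache (fast).
2. A WAL record is created and placed in the WAL buffer.
3. On commit, the WAL is fsynced (durable).
4. A background process writes buffers to the data files (lazily).
5. A checkpoint flushes all dirty pages and lets older WAL be recycled or removed.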
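The write path above can be sketched as a toy in-memory model. This is only an illustration under simplifying assumptions, not Postgres internals: `ToyWal`, its fields, and the record shape are hypothetical names, and "disk" is just another array.

```javascript
// Toy model of the write path: log the change, apply it in memory,
// flush the log on commit, write data files at checkpoint.
// All names are hypothetical; real Postgres uses shared buffers and WAL segments.
class ToyWal {
  constructor() {
    this.walBuffer = [];   // WAL records not yet on "disk"
    this.walOnDisk = [];   // WAL records that survived an fsync
    this.bufferCache = {}; // in-memory pages (dirty until checkpoint)
    this.dataFile = {};    // what the data files contain
  }

  // Steps 1-2: record the change in the WAL buffer and apply it in memory.
  write(txId, key, value) {
    this.walBuffer.push({ txId, type: 'set', key, value });
    this.bufferCache[key] = value;
  }

  // Step 3: commit appends a commit record and "fsyncs" the WAL buffer.
  commit(txId) {
    this.walBuffer.push({ txId, type: 'commit' });
    this.walOnDisk.push(...this.walBuffer);
    this.walBuffer = [];
  }

  // Steps 4-5: checkpoint writes dirty pages out and retires the WAL it covers.
  checkpoint() {
    Object.assign(this.dataFile, this.bufferCache);
    const retired = this.walOnDisk.length;
    this.walOnDisk = [];
    return retired;
  }
}

const wal = new ToyWal();
wal.write(1, 'balance', 100);
wal.commit(1);
console.log(wal.walOnDisk.length); // 2 (change + commit record are durable)
console.log(wal.dataFile);         // {} (data file not written yet)
console.log(wal.checkpoint());     // 2 (records retired)
console.log(wal.dataFile);         // { balance: 100 }
```

The point of the sketch is the ordering: the commit is durable as soon as the log is flushed, even though the data file is still stale.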
Recovery
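After a crash:
1. Start from the last checkpoint.
2. Replay the WAL (REDO).
3. Uncommitted transactions are treated as aborted. Postgres needs no UNDO pass because MVCC hides their rows; ARIES-style engines such as InnoDB roll them back with UNDO.
4. The database is back in a consistent state.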
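The recovery steps can be sketched as a small replay function. This is a simplified model: the record shape (`txId`, `type`, `key`, `value`) is a hypothetical format, and filtering uncommitted records at replay time is a shortcut (Postgres replays every record and relies on commit status for visibility).

```javascript
// Toy REDO recovery: start from the checkpointed data, replay the WAL,
// and keep only changes whose transaction has a commit record.
// The record shape is hypothetical.
function recover(checkpointData, walRecords) {
  // Pass 1: find which transactions committed before the crash.
  const committed = new Set(
    walRecords.filter((r) => r.type === 'commit').map((r) => r.txId)
  );
  // Pass 2: redo committed changes in log order; skip everything else.
  const state = { ...checkpointData };
  for (const r of walRecords) {
    if (r.type === 'set' && committed.has(r.txId)) {
      state[r.key] = r.value;
    }
  }
  return state;
}

const recovered = recover({ a: 1 }, [
  { txId: 10, type: 'set', key: 'a', value: 2 },
  { txId: 11, type: 'set', key: 'b', value: 9 }, // tx 11 never committed
  { txId: 10, type: 'commit' },
]);
console.log(recovered); // { a: 2 }
```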
Postgres settings
# postgresql.conf
wal_level = replica # minimal / replica / logical
synchronous_commit = on # commit waits until the WAL is fsynced
fsync = on
full_page_writes = on
checkpoint_timeout = 5min
max_wal_size = 1GB
min_wal_size = 80MB
Foundation for replication
Streaming replication = a stream of WAL.
The primary sends WAL to the standby.
The standby replays the WAL.
→ A hot standby can serve reads.
→ See DB_Read_Replica_Patterns / DB_Replica_Operations.
Logical decoding
SELECT * FROM pg_create_logical_replication_slot('my_slot', 'pgoutput');
-- A subscriber (another system, e.g. Debezium) then consumes the slot
-- WAL → row-level changes (INSERT / UPDATE / DELETE)
→ The foundation for CDC. See DB_Change_Data_Capture.
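On the consuming side, a CDC subscriber applies each decoded row change to its own copy of the data. A minimal sketch follows; the event shape (`op`, `key`, `row`) is hypothetical, and real decoders such as pgoutput or Debezium emit richer messages.

```javascript
// Apply row-level change events to a downstream key-value copy.
// The event shape is a hypothetical simplification of decoded WAL output.
function applyChange(target, event) {
  switch (event.op) {
    case 'insert':
    case 'update':
      target.set(event.key, event.row);
      break;
    case 'delete':
      target.delete(event.key);
      break;
    default:
      throw new Error(`unknown op: ${event.op}`);
  }
  return target;
}

const downstream = new Map();
const changes = [
  { op: 'insert', key: 1, row: { id: 1, status: 'new' } },
  { op: 'update', key: 1, row: { id: 1, status: 'paid' } },
  { op: 'insert', key: 2, row: { id: 2, status: 'new' } },
  { op: 'delete', key: 2 },
];
for (const change of changes) applyChange(downstream, change);
console.log([...downstream.entries()]); // [ [ 1, { id: 1, status: 'paid' } ] ]
```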
MySQL InnoDB redo log
innodb_log_file_size = 1G
innodb_flush_log_at_trx_commit = 1 # 1 = write + fsync per commit, 0 = write + fsync about once per second, 2 = write per commit, fsync about once per second
sync_binlog = 1 # fsync the binlog on every commit
SQLite WAL mode
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
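Performance tradeoff
synchronous_commit values:
on: Durable. Higher latency (each commit waits for a WAL flush).
off: Fast. A crash can lose the commits from roughly the last second (lost transactions, not corruption).
remote_apply: Waits until the synchronous standby has also applied the change. Strongest guarantee.
remote_write: Waits until the synchronous standby has received the WAL and written it to its OS (not yet flushed).
→ Usually 'on'. For large batches of non-critical data, off is fine.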
-- Per-transaction
BEGIN;
SET LOCAL synchronous_commit = off; -- reverts automatically at COMMIT / ROLLBACK
INSERT INTO log VALUES (...);
COMMIT;
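The difference between waiting for the flush and acknowledging early can be shown with a toy model. `CommitLog` and its methods are hypothetical names; in Postgres the periodic flush is done by the WAL writer.

```javascript
// Toy model of synchronous vs. asynchronous commit. Names are hypothetical.
class CommitLog {
  constructor(synchronous) {
    this.synchronous = synchronous;
    this.pending = []; // acknowledged to the client but not yet flushed
    this.durable = []; // flushed to disk
  }
  commit(txId) {
    this.pending.push(txId);
    if (this.synchronous) this.flush(); // wait for the flush before returning
    return 'ack';
  }
  // In asynchronous mode a background writer would call this periodically.
  flush() {
    this.durable.push(...this.pending);
    this.pending = [];
  }
  // A crash loses whatever was acknowledged but never flushed.
  crash() {
    const lost = this.pending;
    this.pending = [];
    return lost;
  }
}

const syncLog = new CommitLog(true);
syncLog.commit(1);
console.log(syncLog.crash()); // []

const asyncLog = new CommitLog(false);
asyncLog.commit(1);
asyncLog.flush();
asyncLog.commit(2);
console.log(asyncLog.crash()); // [ 2 ]
```

In the asynchronous case, transaction 2 was acknowledged to the client and then lost, which is exactly the risk window described above.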
Group commit
Several commits share a single fsync.
This happens automatically under high throughput.
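A rough sketch of the batching idea: commits that arrive while a flush is already scheduled ride along with it. `GroupCommitter` is a hypothetical name and the fsync is only simulated by a counter.

```javascript
// Toy group commit: commits that arrive before the scheduled flush share one fsync.
class GroupCommitter {
  constructor() {
    this.waiting = [];
    this.fsyncCount = 0;
  }
  // Queue a commit; the promise resolves when its batch has been flushed.
  commit(txId) {
    return new Promise((resolve) => {
      this.waiting.push({ txId, resolve });
      if (this.waiting.length === 1) {
        // The first commit in a batch schedules the flush; later ones piggyback.
        setImmediate(() => this.flush());
      }
    });
  }
  flush() {
    if (this.waiting.length === 0) return; // nothing to flush
    const batch = this.waiting;
    this.waiting = [];
    this.fsyncCount += 1; // one simulated fsync covers the whole batch
    for (const { txId, resolve } of batch) resolve(txId);
  }
}

async function demoGroupCommit() {
  const gc = new GroupCommitter();
  const done = await Promise.all([gc.commit(1), gc.commit(2), gc.commit(3)]);
  return { done, fsyncs: gc.fsyncCount };
}

demoGroupCommit().then((r) => console.log(r)); // { done: [ 1, 2, 3 ], fsyncs: 1 }
```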
Checkpoint tuning
checkpoint_timeout = 15min # shorter = less WAL to replay, but more frequent I/O
checkpoint_completion_target = 0.9 # spread writes across the interval to smooth I/O
max_wal_size = 4GB # larger = fewer checkpoints
→ A long checkpoint interval means a longer recovery.
Archive log (PITR)
wal_level = replica
archive_mode = on
archive_command = 'aws s3 cp %p s3://wal-archive/%f'
# Recovery to a specific point in time (on Postgres 12+, also create recovery.signal in the data directory)
restore_command = 'aws s3 cp s3://wal-archive/%f %p'
recovery_target_time = '2026-05-09 14:00:00'
→ Point-in-time recovery.
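PITR always starts from a base backup taken before the target time and then replays archived WAL up to the target. A small sketch of the selection step, assuming a hypothetical backup catalog of `{ label, finishedAt }` objects (not a Postgres API):

```javascript
// Pick the base backup to restore for a point-in-time target.
// The backup catalog shape is hypothetical.
function chooseBaseBackup(backups, targetTime) {
  const target = new Date(targetTime).getTime();
  const usable = backups
    .filter((b) => new Date(b.finishedAt).getTime() <= target)
    .sort((a, b) => new Date(b.finishedAt) - new Date(a.finishedAt));
  if (usable.length === 0) {
    throw new Error('no base backup before the target: WAL alone cannot restore');
  }
  return usable[0]; // newest backup that finished at or before the target
}

const backups = [
  { label: 'sunday', finishedAt: '2026-05-03T02:00:00Z' },
  { label: 'friday', finishedAt: '2026-05-08T02:00:00Z' },
  { label: 'saturday-late', finishedAt: '2026-05-09T20:00:00Z' },
];
console.log(chooseBaseBackup(backups, '2026-05-09T14:00:00Z').label); // friday
```

Choosing the newest usable backup keeps the amount of WAL to replay, and therefore the restore time, as small as possible.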
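Monitoring WAL size
-- Total WAL generated since the cluster was initialized (not the current on-disk size)
SELECT pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), '0/0')) AS total_wal;
-- WAL retained by each replication slot (restart_lsn is the oldest WAL the slot still needs)
SELECT slot_name, active, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
→ An inactive slot keeps WAL piling up without bound until the disk fills.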
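The same check can be done in application code once the LSNs have been fetched. An LSN such as `16/B374D848` is two hexadecimal numbers: the high and low 32 bits of a 64-bit byte position. The slot objects and the threshold below are hypothetical inputs, for example rows read from `pg_replication_slots`.

```javascript
// Convert a Postgres LSN such as '16/B374D848' to a byte position.
function lsnToBytes(lsn) {
  const [hi, lo] = lsn.split('/');
  return (BigInt(`0x${hi}`) << 32n) | BigInt(`0x${lo}`);
}

// Flag inactive slots that hold back more WAL than a threshold.
// Slot objects and the threshold are hypothetical inputs.
function findRiskySlots(currentLsn, slots, maxRetainedBytes) {
  const current = lsnToBytes(currentLsn);
  return slots
    .map((s) => ({ ...s, retainedBytes: current - lsnToBytes(s.restartLsn) }))
    .filter((s) => !s.active && s.retainedBytes > maxRetainedBytes)
    .map((s) => s.slotName);
}

const risky = findRiskySlots(
  '1/00000000',
  [
    { slotName: 'debezium', active: true, restartLsn: '0/FF000000' },
    { slotName: 'unused_slot', active: false, restartLsn: '0/10000000' },
  ],
  1024n * 1024n * 1024n // 1 GiB
);
console.log(risky); // [ 'unused_slot' ]
```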
Inactive slot (dangerous)
-- Drop the stale slot
SELECT pg_drop_replication_slot('unused_slot');
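Impact on the application
1. Commit latency ≈ fsync time (~1 ms on SSD, ~10 ms on HDD).
2. A long-running write transaction generates a lot of WAL, which increases replica lag.
3. Bulk insert: COPY with few commits is fast.
4. synchronous_commit = off can lose roughly the last second of commits (not acceptable for banking, fine for logs).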
COPY (bulk)
COPY orders FROM '/data.csv' CSV;
-- Every row is still WAL-logged, but the per-row overhead is small
-- One commit also means a single fsync
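import fs from 'node:fs';
import { pipeline } from 'node:stream/promises';
import { from as copyFrom } from 'pg-copy-streams';
// client: an already connected pg Client (assumed to exist)
const stream = client.query(copyFrom('COPY orders (col1, col2) FROM STDIN CSV'));
await pipeline(fs.createReadStream('data.csv'), stream); // resolves when COPY finishes, rejects on error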
Unlogged tables (no WAL)
CREATE UNLOGGED TABLE temp_data (...);
-- Not written to WAL, so writes are fast
-- Truncated after a crash (and not replicated to standbys)
→ Good for temporary, cache, or scratch tables.
Other storage engines
Postgres / MySQL InnoDB / SQL Server: WAL (called redo log or transaction log).
Cassandra: commit log.
RocksDB: WAL + memtable.
SQLite: rollback journal or WAL.
File systems: journaling (ext4) or an intent log (ZFS), the same idea.
App-level WAL pattern (custom)
Event sourcing = an app-level WAL.
Outbox = a transactional log.
→ Same idea at a different layer.
→ See Backend_Event_Sourcing / Backend_Outbox_Pattern.
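To make the analogy concrete, here is a minimal app-level log: one JSON event per line, fsynced on append, with state rebuilt by replaying the file. The `EventLog` class, the file layout, and the event shape are hypothetical choices for this sketch; only the Node `fs` calls are real APIs.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Minimal app-level append-only log: one JSON event per line, fsynced on append.
class EventLog {
  constructor(filePath) {
    this.filePath = filePath;
    this.fd = fs.openSync(filePath, 'a');
  }
  append(event) {
    fs.writeSync(this.fd, JSON.stringify(event) + '\n');
    fs.fsyncSync(this.fd); // durable before we acknowledge
  }
  close() {
    fs.closeSync(this.fd);
  }
  // Rebuild state by folding every logged event, as event sourcing does.
  static replay(filePath, reducer, initial) {
    const lines = fs.readFileSync(filePath, 'utf8').split('\n').filter(Boolean);
    return lines.map((line) => JSON.parse(line)).reduce(reducer, initial);
  }
}

const logPath = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'wal-')), 'events.log');
const log = new EventLog(logPath);
log.append({ type: 'deposit', amount: 100 });
log.append({ type: 'withdraw', amount: 30 });
log.close();

const balance = EventLog.replay(
  logPath,
  (sum, e) => (e.type === 'deposit' ? sum + e.amount : sum - e.amount),
  0
);
console.log(balance); // 70
```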
Cost of fsync
HDD seek: ~10 ms
SSD: ~0.1-1 ms
Network FS / EBS: ~1-5 ms (variable)
Group commit + WAL batching make hundreds of commits per second feasible.
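The arithmetic behind that claim, as a small sketch. It assumes flushes happen one after another and that `groupSize` commits share each flush; both are simplifications, and the function name is hypothetical.

```javascript
// Upper bound on commits per second when each flush takes fsyncMs
// and groupSize commits share one flush. Pure arithmetic, no I/O.
function maxCommitsPerSecond(fsyncMs, groupSize = 1) {
  return (1000 / fsyncMs) * groupSize;
}

console.log(maxCommitsPerSecond(10));     // 100  (HDD, one commit per fsync)
console.log(maxCommitsPerSecond(1));      // 1000 (SSD at 1 ms)
console.log(maxCommitsPerSecond(10, 20)); // 2000 (HDD, 20 commits per group)
```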
Crash recovery time
More WAL to replay = longer recovery.
Frequent checkpoints = shorter recovery (less WAL to replay).
→ It is a trade-off against checkpoint I/O.
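A back-of-the-envelope estimate of this trade-off: the WAL written since the last checkpoint divided by the replay speed. The rates below are assumed example values, not measurements, and the model ignores checkpoints triggered early by max_wal_size.

```javascript
// Rough worst-case crash recovery estimate.
// All inputs are assumed example values.
function estimateRecoverySeconds(walMBPerSec, checkpointIntervalSec, replayMBPerSec) {
  const walToReplayMB = walMBPerSec * checkpointIntervalSec;
  return walToReplayMB / replayMBPerSec;
}

console.log(estimateRecoverySeconds(4, 300, 40)); // 30 (5 min checkpoints)
console.log(estimateRecoverySeconds(4, 900, 40)); // 90 (15 min checkpoints)
```

Tripling the checkpoint interval triples the estimated replay time, which is the cost paid for fewer checkpoint I/O bursts.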
Backup + WAL
# pg_basebackup
pg_basebackup -D /backup -Ft -X stream -P
# Or: file-system snapshot + WAL archive
# Recovery: restore the snapshot, then replay the WAL
🤔 Decision criteria
| Requirement | Setting |
|---|---|
| Strict durability | synchronous_commit = on |
| Fast bulk insert | Unlogged table or synchronous_commit = off |
| Replication | WAL streaming or archive + replication slot |
| PITR | Base backup + WAL archive |
| Edge / embedded | SQLite WAL |
| Very high throughput | Group commit + tuned checkpoints |
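❌ Anti-patterns
- fsync = off in production: breaks durability and risks corruption after a crash.
- Never dropping unused replication slots: WAL accumulates without bound.
- Checkpoints too frequent (e.g. every 1 min): I/O spikes.
- Long transactions: huge WAL volume.
- WAL archive without a base backup: WAL alone cannot restore a database.
- Asynchronous commit for critical data: committed data can be lost.
- HDD in production with synchronous commit: high commit latency.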
🤖 LLM usage hints
- WAL = the basis of ACID Durability, replication, and recovery.
- Postgres: WAL archive plus regular base backups.
- synchronous_commit = on (the default).
- Monitor replication slots and drop stale ones.