---
id: wiki-2026-0508-append-only-log
title: Append-only Log
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Commit log, WAL, Event log, Immutable log]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [log, kafka, event-sourcing, wal, storage]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: java
  framework: Kafka, Pulsar, Postgres WAL
---

# Append-only Log

## In one line
> **"A sequence of immutable events: write once, read many."** The idea runs from database WALs (1980s) through distributed logs (LinkedIn's Kafka, 2011) to the backbone of event sourcing. The 2026 stack includes Kafka 3.7 in KRaft mode (no ZooKeeper), Redpanda (Raft, C++), Pulsar 3.x (BookKeeper), Postgres logical replication, and WarpStream (S3-backed, Kafka-compatible).

## Core

### Properties
- **Append-only**: mutation is forbidden; corrections are made by appending a compensating event.
- **Ordered**: monotonically increasing offset/sequence per partition.
- **Durable**: fsynced and replicated (typically RF=3 with `acks=all`).
- **Replayable**: consumers can re-read from any retained offset.
- **Retention**: time-based, size-based, or compaction (keep the latest value per key).
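
The properties above can be illustrated with a minimal in-memory sketch. `AppendOnlyLog`, `Record`, and the method names are hypothetical and not taken from any library; durability (fsync, replication) and retention are deliberately out of scope.

```python
from dataclasses import dataclass
from typing import Any, Iterator


@dataclass(frozen=True)
class Record:
    """One immutable log entry; frozen=True rejects attribute assignment."""
    offset: int
    value: Any


class AppendOnlyLog:
    """Hypothetical single-partition log: append and read only, no update or delete."""

    def __init__(self) -> None:
        self._records: list[Record] = []

    def append(self, value: Any) -> int:
        """Append a value and return its offset (monotonically increasing)."""
        offset = len(self._records)
        self._records.append(Record(offset, value))
        return offset

    def read_from(self, offset: int = 0) -> Iterator[Record]:
        """Replay records starting at the given offset, in append order."""
        yield from self._records[offset:]


log = AppendOnlyLog()
log.append({"type": "Created", "id": "abc"})
log.append({"type": "Paid", "amount": 42})
# A correction is a new compensating event, never an edit of offset 1.
log.append({"type": "PaymentReversed", "amount": 42})

replayed = [r.offset for r in log.read_from(1)]
print(replayed)  # [1, 2]
```

Exposing only `append` and `read_from` is the point of the design: consumers track their own offset and the log never needs to know who has read what.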

### Use cases
- **Database WAL**: Postgres `pg_wal` and the MySQL InnoDB redo log for crash recovery; the MySQL binlog serves replication and point-in-time recovery.
- **Event sourcing**: domain events are the source of truth; projections rebuild state.
- **CDC**: Debezium reads the database log and publishes changes to Kafka for consumers.
- **Stream processing**: stateful aggregations in Flink or Kafka Streams.
- **Audit log**: tamper-evident when hash-chained.

### Applications
1. **Kafka topic**: 7-day retention, multi-consumer fan-out.
2. **Event-sourced aggregate**: order state derived from `order_events`.
3. **Outbox pattern**: a DB transaction that also writes an outbox row gives reliable event publishing.
4. **Time-travel debugging**: replay from offset N.
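
## 💻 Patterns

### Kafka producer (idempotent + transactional)
```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.serialization.StringSerializer;

Properties p = new Properties();
p.put("bootstrap.servers", "broker:9092");
p.put("enable.idempotence", "true");  // broker de-duplicates producer retries
p.put("acks", "all");                 // wait for all in-sync replicas
p.put("transactional.id", "orders-producer-1");
KafkaProducer<String, String> prod = new KafkaProducer<>(p, new StringSerializer(), new StringSerializer());
prod.initTransactions();

try {
    prod.beginTransaction();
    prod.send(new ProducerRecord<>("orders", orderId, json));
    prod.send(new ProducerRecord<>("audit", orderId, audit));
    prod.commitTransaction();  // both records become visible atomically
} catch (KafkaException e) {
    prod.abortTransaction();   // fatal errors such as ProducerFencedException need close() instead
    throw e;
}
```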
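
### Consumer (offset commit after process)
```java
import java.time.Duration;
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// props must set enable.auto.commit=false so offsets advance only after processing
KafkaConsumer<String, String> c = new KafkaConsumer<>(props);
c.subscribe(List.of("orders"));
while (true) {
    ConsumerRecords<String, String> recs = c.poll(Duration.ofSeconds(1));
    for (var r : recs) processOrder(r.value());
    c.commitSync();  // at-least-once: a crash before this line causes redelivery
}
```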
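
Because the offset is committed only after processing, a crash between processing and commit leads to redelivery. One common way to cope is to make the handler idempotent by remembering the highest offset already applied. The Python sketch below uses plain tuples as a stand-in for broker records; `process_batch` and `applied_offset` are illustrative names, and a real system would need to persist the offset atomically with the side effect.

```python
def process_batch(records, state):
    """Apply (offset, amount) records at most once each, even if redelivered.

    `state` holds the running total and the highest offset already applied.
    """
    for offset, amount in records:
        if offset <= state["applied_offset"]:
            continue  # duplicate from a redelivery: skip
        state["total"] += amount
        state["applied_offset"] = offset
    return state


state = {"total": 0, "applied_offset": -1}
batch = [(0, 10), (1, 20), (2, 30)]

process_batch(batch, state)
# Simulate a crash before the offset commit: the same batch arrives again.
process_batch(batch, state)

print(state["total"])  # 60, not 120
```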

### Event sourcing aggregate
```typescript
// Assumed minimal shapes so the snippet type-checks; the real model may differ.
type Item = { sku: string; qty: number };
type Order = { id?: string; items?: Item[]; status?: string; paidAmount?: number; tracking?: string };

type OrderEvent =
  | { type: "Created", id: string, items: Item[] }
  | { type: "Paid", amount: number }
  | { type: "Shipped", trackingId: string };

function applyEvent(state: Order, e: OrderEvent): Order {
  switch (e.type) {
    case "Created": return { ...state, id: e.id, items: e.items, status: "pending" };
    case "Paid":    return { ...state, status: "paid", paidAmount: e.amount };
    case "Shipped": return { ...state, status: "shipped", tracking: e.trackingId };
  }
}

// events: OrderEvent[], read from the log in offset order
const state = events.reduce(applyEvent, {} as Order);
```
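
Corrections follow the same fold: rather than editing a stored `Paid` event, a compensating event is appended and the reducer handles it. A Python sketch of the idea follows; the `Refunded` event type and the `paid` field are assumptions added for illustration and are not part of the `OrderEvent` union above.

```python
from functools import reduce


def apply_event(state: dict, event: dict) -> dict:
    """Pure reducer: returns a new state, never mutates the input."""
    kind = event["type"]
    if kind == "Created":
        return {**state, "id": event["id"], "status": "pending", "paid": 0}
    if kind == "Paid":
        return {**state, "status": "paid", "paid": state["paid"] + event["amount"]}
    if kind == "Refunded":  # hypothetical compensating event
        paid = state["paid"] - event["amount"]
        return {**state, "status": "paid" if paid > 0 else "refunded", "paid": paid}
    return state  # unknown event types are ignored


events = [
    {"type": "Created", "id": "abc"},
    {"type": "Paid", "amount": 100},
    {"type": "Refunded", "amount": 100},  # corrects the payment without rewriting history
]
state = reduce(apply_event, events, {})
print(state)  # {'id': 'abc', 'status': 'refunded', 'paid': 0}
```

The log still contains the original payment, so an audit can see both what happened and how it was corrected.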

### Outbox pattern (Postgres + Debezium)
```sql
BEGIN;
INSERT INTO orders(id, status) VALUES ('abc', 'pending');
INSERT INTO outbox(aggregate_id, event_type, payload)
  VALUES ('abc', 'OrderCreated', '{"id":"abc",...}'::jsonb);
COMMIT;
-- Debezium tails pg_wal → publishes outbox row → Kafka 'orders' topic
```
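
Where Debezium is not available, a polling relay is a simpler and different technique: read unpublished outbox rows in order, publish them, then mark them published. The sketch below uses Python's built-in `sqlite3` in place of Postgres and a plain list in place of Kafka; the `seq` and `published` columns and the `relay_once` function are assumptions for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        aggregate_id TEXT, event_type TEXT, payload TEXT,
        published INTEGER DEFAULT 0
    );
""")

# Business write and outbox row commit atomically in one transaction.
with conn:
    conn.execute("INSERT INTO orders(id, status) VALUES (?, ?)", ("abc", "pending"))
    conn.execute(
        "INSERT INTO outbox(aggregate_id, event_type, payload) VALUES (?, ?, ?)",
        ("abc", "OrderCreated", json.dumps({"id": "abc"})),
    )


def relay_once(conn, publish):
    """Publish pending outbox rows in seq order; returns how many were sent.

    Delivery is at-least-once: a crash after publish() but before the UPDATE
    commits makes the row go out again on the next run.
    """
    rows = conn.execute(
        "SELECT seq, event_type, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)


topic = []  # stand-in for a Kafka topic
sent = relay_once(conn, lambda t, p: topic.append((t, p)))
print(sent, topic)
```

Polling trades latency and extra database load for fewer moving parts; log-based capture avoids both but needs a connector to operate.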

### Log compaction (Kafka)
```bash
# Topic config: cleanup.policy=compact
# Same key keeps only latest value → materialize current state
kafka-configs.sh --bootstrap-server broker:9092 \
  --alter --entity-type topics --entity-name user-profiles \
  --add-config cleanup.policy=compact,min.cleanable.dirty.ratio=0.1
```
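
To make the effect of `cleanup.policy=compact` concrete, here is a simplified Python model of what compaction converges to: only the latest record per key survives, original offsets are preserved, and a key whose latest value is a tombstone (`None`) is eventually dropped. This illustrates the semantics only; it is not Kafka's cleaner, which works segment by segment and retains tombstones for `delete.retention.ms`.

```python
def compact(log):
    """Return the fully compacted view of a list of (offset, key, value) records."""
    latest = {}
    for offset, key, value in log:
        latest[key] = (offset, key, value)  # later records overwrite earlier ones
    survivors = [rec for rec in latest.values() if rec[2] is not None]
    return sorted(survivors)  # offsets are unique, so this restores offset order


log = [
    (0, "user-1", "alice@old.example"),
    (1, "user-2", "bob@example.com"),
    (2, "user-1", "alice@new.example"),
    (3, "user-2", None),  # tombstone: delete user-2
    (4, "user-3", "carol@example.com"),
]
print(compact(log))
# [(2, 'user-1', 'alice@new.example'), (4, 'user-3', 'carol@example.com')]
```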

### Hash-chained audit log
```python
import hashlib
import json
import time

def append(prev_hash: str, event: dict) -> tuple[str, dict]:
    """Build the next record; its hash covers prev_hash, so records form a chain."""
    record = {"prev": prev_hash, "event": event, "ts": time.time()}
    h = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    return h, {**record, "hash": h}

# Tamper-evident: modifying any record breaks the chain from that point on
```
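
Verification walks the chain and recomputes each hash. The sketch below is one way to do it under the same record layout (`prev`, `event`, `ts`, `hash`); `GENESIS`, `append_record`, and `verify_chain` are hypothetical names. A hash chain alone only detects edits by someone who cannot recompute the whole chain, which is why a signature over the records is usually added.

```python
import hashlib
import json
import time

GENESIS = "0" * 64  # assumed sentinel for the first record's prev hash


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_record(chain: list, event: dict) -> None:
    """Append a record whose hash covers the previous record's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    body = {"prev": prev, "event": event, "ts": time.time()}
    chain.append({**body, "hash": _digest(body)})


def verify_chain(chain: list) -> bool:
    """True if every hash matches its record and links to its predecessor."""
    prev = GENESIS
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "hash"}
        if rec["prev"] != prev or rec["hash"] != _digest(body):
            return False
        prev = rec["hash"]
    return True


chain = []
append_record(chain, {"action": "login", "user": "alice"})
append_record(chain, {"action": "delete", "user": "alice"})
ok_before = verify_chain(chain)

chain[0]["event"]["action"] = "logout"  # tamper with history
ok_after = verify_chain(chain)
print(ok_before, ok_after)  # True False
```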
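
### Postgres logical replication slot
```sql
-- Requires wal_level = logical.
-- test_decoding emits text, so it can be read from SQL; pgoutput is binary and is
-- consumed over the streaming replication protocol (e.g. by Debezium) instead.
SELECT pg_create_logical_replication_slot('app_slot', 'test_decoding');
-- Consume WAL changes (CDC); get_changes advances the slot, peek_changes does not
SELECT * FROM pg_logical_slot_get_changes('app_slot', NULL, NULL);
```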

### Snapshot + tail (event sourcing optimization)
```typescript
async function loadAggregate(id: string): Promise<Order> {
  const snap = await snapStore.get(id);  // periodic snapshot, may be undefined
  // read only the events recorded after the snapshot version
  const events = await eventStore.read(id, snap?.version ?? 0);
  return events.reduce(applyEvent, snap?.state ?? ({} as Order));
}
```
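
The same optimization in a self-contained Python sketch, with in-memory dicts standing in for the snapshot and event stores. The convention that a snapshot's `version` equals the number of events already folded into it is an assumption; the TypeScript above leaves the meaning of `version` to the store.

```python
from functools import reduce


def apply_event(state: dict, event: dict) -> dict:
    """Fold one event into a running balance (toy aggregate)."""
    return {"balance": state["balance"] + event["delta"]}


EVENTS = {"acct-1": [{"delta": 100}, {"delta": -30}, {"delta": 5}, {"delta": 20}]}
# Snapshot taken after the first two events were applied.
SNAPSHOTS = {"acct-1": {"version": 2, "state": {"balance": 70}}}


def load_aggregate(agg_id: str, use_snapshot: bool = True) -> tuple[dict, int]:
    """Return (state, number of events replayed)."""
    snap = SNAPSHOTS.get(agg_id) if use_snapshot else None
    start = snap["version"] if snap else 0
    initial = snap["state"] if snap else {"balance": 0}
    tail = EVENTS.get(agg_id, [])[start:]  # only events after the snapshot
    return reduce(apply_event, tail, initial), len(tail)


with_snap = load_aggregate("acct-1")
full_replay = load_aggregate("acct-1", use_snapshot=False)
print(with_snap, full_replay)  # ({'balance': 95}, 2) ({'balance': 95}, 4)
```

Both paths must produce the same state; the snapshot only changes how many events are replayed to get there.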

## Decision criteria
| Situation | System |
|---|---|
| High-throughput streaming, multi-consumer | Kafka / Redpanda |
| Geo-replicated, tiered storage | Pulsar / WarpStream |
| Event-sourced single service | EventStoreDB / Postgres + outbox |
| Database CDC | Debezium → Kafka |
| Tamper-evident audit | Hash chain + signatures |

**Default**: Kafka (or Redpanda for simpler operations) for a distributed log; Postgres WAL + outbox for a single service.

## 🔗 Graph
- Parent: [[Distributed Systems]]
- Variants: [[Kafka]] · [[WAL]] · [[Event Store]]
- Applications: [[Event Sourcing]] · [[CDC]] · [[CQRS]]
- Adjacent: [[Stream-Processing-Architectures|Stream Processing]] · [[Idempotency]]

## 🤖 LLM usage
**When**: audit or replay requirements, multiple consumers or projections, temporal queries, reliable event publishing.
**When not**: simple CRUD without history, workloads that only need a strongly consistent current snapshot, storage-cost-sensitive systems (logs grow).

## ❌ Anti-patterns
- **Mutating past events**: violates the log's core invariant. Emit a compensating event instead.
- **Unbounded retention without compaction**: storage grows without limit.
- **Synchronous replay on every read**: adds latency. Use snapshot + tail.
- **Single-partition Kafka topic**: caps throughput. Partition by key.
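
Partitioning by key spreads load while keeping per-key order, since every record with the same key lands in the same partition. A Python sketch of the idea follows; Kafka's default partitioner hashes the key bytes with murmur2, whereas this illustration uses CRC32 from the standard library, so the partition numbers will differ from Kafka's. `partition_for` is a hypothetical helper.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Stable key -> partition mapping (CRC32 here; Kafka itself uses murmur2)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


partitions = defaultdict(list)
events = [
    ("order-1", "Created"), ("order-2", "Created"),
    ("order-1", "Paid"), ("order-1", "Shipped"), ("order-2", "Paid"),
]
for key, event in events:
    partitions[partition_for(key, 4)].append((key, event))

# All events for one key sit in one partition, in the order they were produced.
order_1 = [e for k, e in partitions[partition_for("order-1", 4)] if k == "order-1"]
print(order_1)  # ['Created', 'Paid', 'Shipped']
```

Ordering is guaranteed only within a partition, so the key should be the entity whose events must stay in sequence (here, the order ID).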

## 🧪 Verification / duplicates
- Verified (Jay Kreps "The Log" 2013, Kafka docs, Postgres WAL docs, Greg Young event sourcing).
- Trust level A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full content (Kafka, event sourcing, WAL, outbox) |