[G1-Sync] Manual knowledge update

2026-05-09 21:08:02 +09:00
parent f0befc887a
commit 93ec7e9056
363 changed files with 68333 additions and 64 deletions
@@ -0,0 +1,228 @@
+---
+id: cs-btree-lsm-storage
+title: B-Tree vs LSM-Tree — Storage Engines
+category: Coding
+status: draft
+source_trust_level: B
+verification_status: conceptual
+created_at: 2026-05-09
+updated_at: 2026-05-09
+tags: [cs, storage, btree, lsm, vibe-coding]
+tech_stack: { language: "Concept", applicable_to: ["Database"] }
+applied_in: []
+aliases: [B-Tree, LSM-Tree, RocksDB, Postgres, MyISAM, write amplification, read amplification]
+---
+
+# B-Tree vs LSM-Tree
+
+> The two main storage engine designs for databases. **B-Tree (Postgres / MySQL InnoDB) = fast reads, in-place update**. **LSM-Tree (RocksDB / Cassandra / ScyllaDB) = fast writes, append-only**. Trade-off: read amp / write amp / space amp.
+
+## 📖 Core concepts
+- B-Tree: balanced tree, in-place update.
+- LSM: write → memtable → SSTable (immutable) → compaction.
+- Read amplification: one logical read has to check N files or pages.
+- Write amplification: one logical write causes N physical disk writes.
+- Space amplification: bytes on disk relative to logical data size (stale copies and tombstones add to it, compression reduces it).
+
+## 💻 Code patterns
+
+### How a B-Tree works
+```
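+Read:    Root → branch → leaf.  O(log N) page visits.
+Write:   Modify the page in place (in practice: WAL append first, dirty page flushed later).
+Delete:  Mark the entry dead inside the page; vacuum reclaims the space later.
+
+Pros: O(log N) reads, fast range scans, mature.
+Cons: Page splits are expensive; a small random write rewrites a whole page.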
+```
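The O(log N) read cost is easier to judge with numbers. A minimal sketch, assuming every page holds the same number of keys; `btreeHeight` and the fanout of 100 are illustrative assumptions, since the real fanout depends on page size and key width.

```typescript
// Hypothetical helper: the smallest number of levels whose capacity
// (fanout^levels) covers numKeys. Integer loop avoids floating-point log error.
function btreeHeight(numKeys: number, fanout: number): number {
  if (numKeys < 1 || fanout < 2) {
    throw new RangeError("numKeys must be >= 1 and fanout >= 2");
  }
  let height = 1;        // a single leaf page
  let capacity = fanout; // keys reachable with `height` levels
  while (capacity < numKeys) {
    capacity *= fanout;
    height += 1;
  }
  return height;
}

console.log(btreeHeight(1000000, 100));    // 3
console.log(btreeHeight(1000000000, 100)); // 5
```

With the upper levels cached in RAM, a point lookup under these assumptions usually costs one or two page reads from disk.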
+
+### How an LSM-Tree works
+```
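+Write:
+1. Add to the memtable (RAM, sorted); engines usually append to a WAL first for durability
+2. Memtable full → flush to an SSTable (sorted, immutable)
+3. Compaction: merge multiple SSTables
+
+Read:
+1. Check the memtable
+2. Check the SSTables at each level (Bloom filters skip files that cannot contain the key)
+3. Return the newest version (search newest to oldest, stop at the first hit)
+
+Delete: write a tombstone. Compaction removes it later.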
+```
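The write, read, and delete paths above can be sketched in memory. This is a toy model, not how any real engine is implemented: `TinyLsm`, `memtableLimit`, and the use of `null` as the tombstone are assumptions made for illustration, and there is no WAL, disk I/O, or compaction.

```typescript
type Entry = string | null; // null marks a tombstone
type Run = ReadonlyArray<readonly [string, Entry]>; // sorted, immutable "SSTable"

class TinyLsm {
  private memtable = new Map<string, Entry>();
  private runs: Run[] = []; // newest run first
  private readonly memtableLimit: number;

  constructor(memtableLimit: number = 2) {
    this.memtableLimit = memtableLimit;
  }

  put(key: string, value: string): void {
    this.write(key, value);
  }

  del(key: string): void {
    this.write(key, null); // deletes are writes of a tombstone
  }

  get(key: string): string | undefined {
    // 1. Memtable first, 2. then runs from newest to oldest, 3. first hit wins.
    if (this.memtable.has(key)) return this.memtable.get(key) ?? undefined;
    for (const run of this.runs) {
      const found = run.find(([k]) => k === key); // real engines binary-search
      if (found) return found[1] ?? undefined;
    }
    return undefined;
  }

  runCount(): number {
    return this.runs.length;
  }

  private write(key: string, entry: Entry): void {
    this.memtable.set(key, entry);
    if (this.memtable.size >= this.memtableLimit) this.flush();
  }

  private flush(): void {
    const sorted = Array.from(this.memtable.entries()).sort(([a], [b]) =>
      a < b ? -1 : a > b ? 1 : 0,
    );
    this.runs.unshift(sorted); // newest run goes to the front
    this.memtable.clear();
  }
}

const store = new TinyLsm(2);
store.put("a", "1");
store.put("b", "2"); // memtable full: flushed as the first run
store.put("a", "3"); // newer version of "a"
store.del("b");      // tombstone; memtable full again: second run
console.log(store.get("a"), store.get("b"), store.runCount()); // 3 undefined 2
```

Both runs still hold stale data for `a` and `b`; nothing is reclaimed until compaction merges them.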
+
+### Compaction strategy
+```
+Leveled (RocksDB):
+- Level N+1 is ~10x the size of level N
+- Low read amp, high write amp
+
+Tiered (Cassandra):
+- Merge similarly sized small SSTables within the same tier
+- Low write amp, high read amp
+
+Hybrid: ScyllaDB.
+```
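Whatever the strategy, the core of a compaction is the same merge: keep the newest version of each key and, at the bottom level, drop tombstones. A minimal sketch under simplifying assumptions: `compact` and the `Run` type are hypothetical, and a `Map` stands in for the streaming k-way merge that real engines use.

```typescript
type Entry = string | null; // null marks a tombstone
type Run = ReadonlyArray<readonly [string, Entry]>;

// runs[0] is the newest run. Returns a single sorted run.
// dropTombstones is only safe when the output is the bottom level,
// otherwise an older copy further down could become visible again.
function compact(runs: Run[], dropTombstones: boolean): Array<[string, Entry]> {
  const latest = new Map<string, Entry>();
  for (const run of runs) {
    for (const [key, entry] of run) {
      if (!latest.has(key)) latest.set(key, entry); // first seen = newest
    }
  }
  return Array.from(latest.entries())
    .filter(([, entry]) => !(dropTombstones && entry === null))
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
}

const newerRun: Run = [["a", "3"], ["b", null]];
const olderRun: Run = [["a", "1"], ["b", "2"], ["c", "9"]];
console.log(compact([newerRun, olderRun], true)); // [ [ 'a', '3' ], [ 'c', '9' ] ]
```

Five input entries shrink to two, which is where the space is reclaimed; the cost is that every surviving byte is rewritten, which is the source of LSM write amplification.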
+
+### B-Tree page layout
+```
+[ Page header | Key1 → Pointer1 | Key2 → Pointer2 | ... ]
+
+Page size: typically 8KB (Postgres) / 16KB (MySQL InnoDB).
+Fillfactor: e.g. 80% leaves free space in each page for UPDATEs (enables HOT updates in Postgres; the table default is 100%).
+```
+
+### LSM SSTable layout
+```
+[ Header | Index | Bloom filter | Sorted key-value pairs | Footer ]
+
+Index = sparse (every Nth key).
+Bloom filter = fast check for whether a key is definitely absent from this SSTable (false positives possible, no false negatives).
+```
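The "definitely absent" guarantee of the Bloom filter comes from how bits are set and tested. A minimal sketch, assuming FNV-1a with double hashing; the `BloomFilter` class, its sizes, and the hash seeds are illustrative choices, not the filter format of any particular engine.

```typescript
// 32-bit FNV-1a style hash; `seed` replaces the usual offset basis.
function fnv1a(text: string, seed: number): number {
  let hash = seed >>> 0;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

class BloomFilter {
  private readonly bits: Uint8Array;
  private readonly numBits: number;
  private readonly numHashes: number;

  constructor(numBits: number, numHashes: number) {
    this.numBits = numBits;
    this.numHashes = numHashes;
    this.bits = new Uint8Array(Math.ceil(numBits / 8));
  }

  add(key: string): void {
    for (const pos of this.positions(key)) {
      this.bits[pos >> 3] |= 1 << (pos & 7);
    }
  }

  // false = key is definitely absent; true = key may be present.
  mightContain(key: string): boolean {
    return this.positions(key).every(
      (pos) => (this.bits[pos >> 3] & (1 << (pos & 7))) !== 0,
    );
  }

  // Double hashing: position_i = (h1 + i * h2) mod numBits.
  private positions(key: string): number[] {
    const h1 = fnv1a(key, 0x811c9dc5);
    const h2 = (fnv1a(key, 0x9747b28c) | 1) >>> 0; // odd, so never zero
    const result: number[] = [];
    for (let i = 0; i < this.numHashes; i++) {
      result.push((h1 + i * h2) % this.numBits);
    }
    return result;
  }
}

const filter = new BloomFilter(1024, 3);
filter.add("user:1");
filter.add("user:2");
console.log(filter.mightContain("user:1"));                   // true
console.log(new BloomFilter(1024, 3).mightContain("user:1")); // false (no bits set)
```

A read only opens the SSTable when `mightContain` returns true, so the few bits per key kept in RAM save a disk read for most absent keys.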
+
+### Write amplification (rough figures)
+```
+Insert 1 byte → N bytes written to disk.
+
+B-Tree: typically 2-10x (page write + WAL).
+LSM (leveled): 10-30x (compaction).
+LSM (tiered): 5-15x.
+Ballpark only; actual values depend on workload and configuration.
+```
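The ratio itself is simple arithmetic over byte counters. A minimal sketch; the `WriteCounters` fields are hypothetical names for numbers you would collect from your engine's statistics, not an actual RocksDB or Postgres API.

```typescript
// Hypothetical counters, all in bytes, gathered over the same time window.
interface WriteCounters {
  userBytes: number;       // bytes the application asked to write
  walBytes: number;        // bytes appended to the write-ahead log
  flushBytes: number;      // bytes written by memtable flushes or page flushes
  compactionBytes: number; // bytes rewritten by compaction (0 for a B-Tree)
}

function writeAmplification(c: WriteCounters): number {
  if (c.userBytes <= 0) throw new RangeError("userBytes must be positive");
  return (c.walBytes + c.flushBytes + c.compactionBytes) / c.userBytes;
}

// Made-up numbers for illustration: 100 bytes in, 1200 bytes hit the disk.
console.log(
  writeAmplification({ userBytes: 100, walBytes: 100, flushBytes: 100, compactionBytes: 1000 }),
); // 12
```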
+
+### Read amplification
+```
+Get key X →
+
+B-Tree: O(log N) pages (upper levels are usually served from cache).
+LSM:    memtable + several levels. Bloom filters help skip SSTables.
+```
+
+### Space amplification
+```
+1GB of data →
+
+B-Tree: 1GB + index. ~1.5x.
+LSM:    1GB (less with compression) + tombstones + old versions. ~1.1-2x (depends on how far compaction has caught up).
+```
+
+### Suitable use cases
+```
+B-Tree:
+- OLTP (random read + update + delete)
+- Consistent read latency
+- Frequent range queries
+- Postgres / MySQL / SQLite
+
+LSM:
+- Write-heavy (time series, logs)
+- Fast ingestion
+- Range scans are OK too
+- Cassandra / RocksDB / LevelDB / ScyllaDB
+```
+
+### Hybrid
+```
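+Postgres (heap + WAL): B-Tree indexes, but with log-structured aspects (append-only WAL, new row versions instead of overwrites).
+ZFS / Btrfs: copy-on-write file systems with LSM-like aspects (no overwrite in place).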
+```
+
+### Tuning: Postgres B-Tree
+```sql
+-- Table fillfactor for UPDATE-heavy tables (applies to newly written pages)
+ALTER TABLE x SET (fillfactor = 80);
+
+-- Index fillfactor
+CREATE INDEX ON x (col) WITH (fillfactor = 90);
+
+-- Vacuum more often (prevents bloat)
+ALTER TABLE x SET (autovacuum_vacuum_scale_factor = 0.05);
+```
+
+### Tuning: RocksDB LSM
+```
+write_buffer_size:                   memtable size
+max_write_buffer_number:             max memtables held in memory
+level0_file_num_compaction_trigger:  L0 file count that triggers compaction
+target_file_size_base:               SSTable size
+compression_per_level:               compression for each level
+filter_policy (Bloom, bits per key): speeds up point reads
+```
+
+### Libraries: Node
+```ts
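+// 'level' wraps LevelDB (an LSM engine); RocksDB needs separate bindings.
+// Top-level await below assumes an ES module.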
+import { Level } from 'level';
+const db = new Level('./db', { valueEncoding: 'json' });
+await db.put('key', { value: 42 });
+const v = await db.get('key');
+
+// Range
+for await (const [k, v] of db.iterator({ gte: 'a', lte: 'z' })) {
+  console.log(k, v);
+}
+```
+
+### Sorted vs unsorted
+```
+B-Tree:  inherently sorted (by key).
+LSM:     sorted (by key) — range scan OK.
+Hash:    unsorted (no range, only point lookup) — Memcached, hash index.
+```
+
+### Cache hierarchy
+```
+RAM (page cache / memtable) → SSD (data) → older SSD / HDD (cold).
+
+Postgres shared_buffers: ~25% of RAM is the usual starting point.
+RocksDB block_cache: size it to the workload.
+```
+
+### Algorithm walkthrough
+```
+B-Tree insertion:
+1. Find leaf
+2. If full → split, push median up
+3. Recursive up
+
+LSM compaction:
+1. L0 file count > threshold → merge into L1
+2. L1 size > target → pick file(s) from L1 and merge them into the overlapping L2 files
+...
+```
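The three B-Tree insertion steps above can be sketched as follows. This is a toy in-memory version that stores keys only and moves the median up on a split; `TinyBTree`, `BTreeNode`, and `maxKeys` are hypothetical names, and there are no pages, latches, or WAL.

```typescript
class BTreeNode {
  keys: number[] = [];
  children: BTreeNode[] = []; // empty for leaves

  isLeaf(): boolean {
    return this.children.length === 0;
  }
}

interface SplitResult {
  median: number;   // key pushed up to the parent
  right: BTreeNode; // new right sibling
}

class TinyBTree {
  root = new BTreeNode();
  private readonly maxKeys: number;

  constructor(maxKeys: number = 3) {
    this.maxKeys = maxKeys;
  }

  insert(key: number): void {
    const split = this.insertInto(this.root, key);
    if (split) {
      // The root itself split: the tree grows by one level.
      const newRoot = new BTreeNode();
      newRoot.keys = [split.median];
      newRoot.children = [this.root, split.right];
      this.root = newRoot;
    }
  }

  height(): number {
    let h = 1;
    for (let n = this.root; !n.isLeaf(); n = n.children[0]) h++;
    return h;
  }

  inorder(node: BTreeNode = this.root, out: number[] = []): number[] {
    for (let i = 0; i < node.keys.length; i++) {
      if (!node.isLeaf()) this.inorder(node.children[i], out);
      out.push(node.keys[i]);
    }
    if (!node.isLeaf()) this.inorder(node.children[node.keys.length], out);
    return out;
  }

  private insertInto(node: BTreeNode, key: number): SplitResult | undefined {
    // 1. Find the slot: first index whose key is >= the new key.
    let i = 0;
    while (i < node.keys.length && node.keys[i] < key) i++;
    if (node.isLeaf()) {
      node.keys.splice(i, 0, key);
    } else {
      // 3. Recurse; if the child split, absorb its median and new sibling.
      const childSplit = this.insertInto(node.children[i], key);
      if (childSplit) {
        node.keys.splice(i, 0, childSplit.median);
        node.children.splice(i + 1, 0, childSplit.right);
      }
    }
    // 2. Overflow: split around the median and push it up.
    if (node.keys.length <= this.maxKeys) return undefined;
    const mid = Math.floor(node.keys.length / 2);
    const median = node.keys[mid];
    const right = new BTreeNode();
    right.keys = node.keys.splice(mid + 1); // keys after the median
    node.keys.pop();                        // median leaves the left node
    if (!node.isLeaf()) right.children = node.children.splice(mid + 1);
    return { median, right };
  }
}

const tree = new TinyBTree(3);
for (let k = 1; k <= 10; k++) tree.insert(k);
console.log(tree.height());  // 2
console.log(tree.inorder()); // keys 1 through 10 in ascending order
```

Sequential keys only ever split the rightmost leaf, while random keys spread splits across the whole tree, which is the reason time-ordered keys are friendlier to B-Tree indexes.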
+
+### Modern variants
+```
+Fractal Tree: B-Tree + per-node message buffers (TokuDB).
+Bw-Tree:      lock-free B-Tree variant (Hekaton, Microsoft).
+Adaptive Radix Tree (ART): in-memory DBs.
+LSM with bloom filters per level.
+```
+
+## 🤔 Decision criteria
+| Workload | Engine |
+|---|---|
+| OLTP (banking, orders) | B-Tree (Postgres / InnoDB) |
+| Time-series / logs | LSM (Cassandra); note that TimescaleDB runs on Postgres, not an LSM |
+| Write-heavy + range | LSM (RocksDB) |
+| Mostly read | B-Tree |
+| Embedded | LevelDB (LSM) / SQLite (B-Tree) |
+| Distributed write | LSM (Cassandra / ScyllaDB) |
+
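+## ❌ Anti-patterns
+- **Large volumes of random-key inserts on a B-Tree** (e.g. UUID v4 keys): page splits explode. Use time-ordered keys such as UUID v7.
+- **Frequent overwrites of short values on an LSM**: high write amp. Consider a different store.
+- **Running an LSM with compaction off**: read amp explodes.
+- **Running a B-Tree without vacuum**: bloat.
+- **Turning Bloom filters off on an LSM**: a point read may have to probe every SSTable.
+- **Ignoring cache size**: frequent disk hits.
+- **Using an LSM database with B-Tree assumptions**: the trade-offs go unnoticed.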
+
+## 🤖 LLM usage hints
+- Postgres / MySQL = B-Tree (in most cases).
+- Cassandra / RocksDB = LSM (write-heavy).
+- Knowing which engine is underneath makes tuning more accurate.
+
+## 🔗 Related documents
+- [[DB_Index_Strategy]]
+- [[DB_Vacuum_Autovacuum]]
+- [[DB_Time_Series_Patterns]]