| id | title | category | status | canonical_id | aliases | duplicate_of | source_trust_level | confidence_score | verification_status | tags | raw_sources | last_reinforced | github_commit | tech_stack |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| wiki-2026-0508-mark-sweep | Mark-Sweep GC | 10_Wiki/Topics | verified | self | | none | A | 0.9 | applied | | | 2026-05-10 | pending | |
Mark-Sweep GC
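In one line
"Keep what is reachable, sweep the rest." Mark-Sweep is the tracing GC algorithm that John McCarthy introduced for Lisp in 1960. It marks every object reachable from the roots, then sweeps (frees) everything left unmarked. Because it traces reachability instead of counting references, it reclaims the cycles that reference counting cannot, and it is the basis of the tracing collectors in modern runtimes (V8, the JVM, and the generational cycle collector in CPython).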
Core concepts
Two phases
- Mark: starting from the GC roots (stack, globals, registers), traverse the object graph and set the mark bit on every reachable object.
- Sweep: scan the entire heap and add every unmarked object to the free list.
GC roots
- Local variables on the stack.
- Global / static variables.
- CPU registers.
- JNI / native handles.
Variants
- Mark-Compact: instead of sweeping, slide the live objects together, which eliminates fragmentation.
- Tri-color: white / grey / black marking, which makes incremental and concurrent collection possible.
- Generational: young (Eden + Survivor) / old generations, based on the hypothesis that most objects die young.
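The Mark-Compact variant can be sketched on a toy heap. This is a minimal illustration, not how any particular runtime does it: the `Obj` class, its `addr` / `size` fields, and the `compact` helper are all hypothetical names, and because references here are ordinary Python objects, the pointer fix-up pass that a real compactor needs is skipped.

```python
class Obj:
    """Hypothetical heap object occupying `size` cells starting at `addr`."""
    def __init__(self, name, addr, size):
        self.name = name
        self.addr = addr
        self.size = size
        self.refs = []
        self.marked = False

def mark(roots):
    # Iterative mark phase using an explicit worklist.
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)

def compact(heap):
    """Slide live objects toward address 0, keeping their relative order."""
    live = [o for o in sorted(heap, key=lambda o: o.addr) if o.marked]
    next_free = 0
    for obj in live:
        obj.addr = next_free      # a real collector would also rewrite pointers here
        next_free += obj.size
        obj.marked = False        # reset for the next cycle
    return live, next_free

a = Obj("a", addr=0, size=4)
b = Obj("b", addr=4, size=2)      # unreachable: leaves a hole at cells 4..5
c = Obj("c", addr=6, size=3)
d = Obj("d", addr=9, size=1)
a.refs.append(c)

mark([a, d])
live, top = compact([a, b, c, d])
print([(o.name, o.addr) for o in live], top)
```

After compaction the free space is a single contiguous region starting at `top`, so allocation can be a simple pointer bump instead of a free-list search.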
Stop-the-world (STW)
- During the mark phase the mutator (the application threads) is paused so that the reference graph stays consistent.
- Concurrent / incremental GCs use a write barrier to keep STW time to a minimum.
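💻 Patterns
Pseudocode mark

    # get_roots(), heap and free() are assumed to be provided by the runtime.
    def mark(obj):
        # Recursive depth-first mark. Deep graphs can exhaust the call stack,
        # so real collectors use an explicit worklist instead.
        if obj is None or obj.marked:
            return
        obj.marked = True
        for ref in obj.references:
            mark(ref)

    def gc():
        # 1. Mark everything reachable from the roots.
        for root in get_roots():
            mark(root)
        # 2. Sweep: iterate over a snapshot so free() may modify the heap.
        for obj in list(heap):
            if obj.marked:
                obj.marked = False  # reset for next cycle
            else:
                free(obj)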
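The pseudocode above leaves the heap and roots abstract. A runnable version on a toy heap might look like the following sketch; the `Heap` class and its `alloc` / `collect` methods are hypothetical, and "freeing" is modelled simply by dropping the object from the heap list. It also shows the point made in the summary: an unreachable cycle is reclaimed.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.references = []
        self.marked = False

class Heap:
    """Hypothetical toy heap: a list of every allocated object plus a root set."""
    def __init__(self):
        self.objects = []
        self.roots = []

    def alloc(self, name):
        obj = Obj(name)
        self.objects.append(obj)
        return obj

    def collect(self):
        # Mark: iterative traversal from the roots (no recursion limit issues).
        stack = list(self.roots)
        while stack:
            obj = stack.pop()
            if not obj.marked:
                obj.marked = True
                stack.extend(obj.references)
        # Sweep: keep marked objects and clear their mark bit for the next cycle.
        freed = [o.name for o in self.objects if not o.marked]
        survivors = [o for o in self.objects if o.marked]
        for o in survivors:
            o.marked = False
        self.objects = survivors
        return freed

heap = Heap()
a, b, c, d = (heap.alloc(n) for n in "abcd")
heap.roots.append(a)
a.references.append(b)
c.references.append(d)   # c and d reference each other but no root reaches them
d.references.append(c)

freed = heap.collect()
print(freed)             # ['c', 'd']
```

Under pure reference counting `c` and `d` would each keep the other's count at one; tracing from the roots never visits them, so the sweep frees both.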
Tri-color iterative mark

    WHITE, GREY, BLACK = 0, 1, 2

    def mark_iterative():
        # All objects start WHITE; the roots are shaded GREY.
        grey = list(get_roots())
        for r in grey:
            r.color = GREY
        while grey:
            obj = grey.pop()
            for ref in obj.references:
                if ref.color == WHITE:
                    ref.color = GREY
                    grey.append(ref)
            obj.color = BLACK  # every child has been shaded
        # Sweep: WHITE → free, BLACK → reset to WHITE.
        # Iterate over a snapshot so free() may modify the heap.
        for obj in list(heap):
            if obj.color == WHITE:
                free(obj)
            else:
                obj.color = WHITE
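The reason tri-color marking enables incremental collection is that the grey set is an explicit record of unfinished work, so marking can stop and resume at any point. A minimal sketch, assuming a hypothetical `IncrementalMarker` class with a per-slice `budget` (neither is taken from a real runtime):

```python
from collections import deque

WHITE, GREY, BLACK = 0, 1, 2

class Obj:
    def __init__(self, name):
        self.name = name
        self.references = []
        self.color = WHITE

class IncrementalMarker:
    def __init__(self, roots):
        self.grey = deque()
        for root in roots:
            self.shade(root)

    def shade(self, obj):
        # WHITE -> GREY: the object is known to be live but not yet scanned.
        if obj.color == WHITE:
            obj.color = GREY
            self.grey.append(obj)

    def step(self, budget):
        """Blacken at most `budget` grey objects; return True once marking is done."""
        while self.grey and budget > 0:
            obj = self.grey.popleft()
            for ref in obj.references:
                self.shade(ref)
            obj.color = BLACK
            budget -= 1
        return not self.grey

# Chain a -> b -> c -> d, plus an unreachable object e.
a, b, c, d, e = (Obj(n) for n in "abcde")
a.references.append(b)
b.references.append(c)
c.references.append(d)

marker = IncrementalMarker([a])
slices = 1
while not marker.step(budget=2):
    slices += 1          # the mutator would run here between slices
print(slices)            # 2
```

Between slices the mutator may change the graph, which is exactly why an incremental or concurrent collector also needs a write barrier.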
V8 (JavaScript) GC config

    # Inspect V8 GC behavior
    node --trace-gc app.js
    # → one log line per collection, naming its type
    #   (e.g. Scavenge, Mark-sweep or Mark-Compact depending on the V8 version)

    # Tune the old-generation heap size (value in MiB)
    node --max-old-space-size=4096 app.js   # 4 GiB

    # Trigger GC manually (testing only)
    node --expose-gc -e "global.gc()"
JVM G1GC vs ZGC

    # G1GC (default since Java 9): region-based, concurrent marking plus evacuation (compaction)
    java -XX:+UseG1GC -Xmx8g -XX:MaxGCPauseMillis=200 App
    # ZGC (production-ready since JDK 15; generational mode added in JDK 21): concurrent, sub-ms pauses
    java -XX:+UseZGC -Xmx32g App
    # → most work runs concurrently with the mutator; STW phases are typically <1 ms
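CPython gc module

    import gc

    # CPython uses reference counting plus a generational cycle collector (for cycles only)
    gc.collect()                   # force a full cycle collection
    gc.get_count()                 # current collection counts: (gen0, gen1, gen2)
    gc.set_threshold(700, 10, 10)  # tune when each generation is collected

    # Inspect which types dominate the heap (objgraph is a third-party package;
    # this shows object counts by type, it does not find cycles by itself)
    import objgraph
    objgraph.show_most_common_types(limit=20)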
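To see the split between reference counting and cycle collection in CPython, a weak reference can act as a probe. This is a small sketch using only the standard `gc` and `weakref` modules; the `Node` class is a hypothetical example type, and the behaviour shown is specific to CPython.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

gc.disable()                  # keep the automatic collector out of the way for the demo
a, b = Node(), Node()
a.other, b.other = b, a       # reference cycle: the refcounts can never drop to zero
probe = weakref.ref(a)        # a weak reference does not keep `a` alive

del a, b
alive_before = probe() is not None   # True: refcounting alone cannot free the cycle
gc.collect()
alive_after = probe() is not None    # False: the cycle collector reclaimed it
gc.enable()

print(alive_before, alive_after)
```

Objects that are not part of a cycle never reach this collector at all; they are freed as soon as their reference count hits zero.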
Write barrier (concurrent GC)

    // When the mutator stores a pointer during marking, the GC must be told.
    // Otherwise concurrent marking can miss an object that is now referenced
    // only from a BLACK (already scanned) object.
    void write_field(Object* obj, int idx, Object* val) {
        if (gc_is_marking && val != NULL &&
            obj->color == BLACK && val->color == WHITE) {
            val->color = GREY;        // shade the target so it will be scanned
            grey_queue_push(val);
        }
        obj->fields[idx] = val;
    }
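The failure that the barrier prevents can be reproduced in a few lines. The following is a simulation under stated assumptions, not a model of any real collector: the `run` function and its `use_barrier` flag are hypothetical, and "concurrency" is simulated by letting the mutator run between two marking slices.

```python
WHITE, GREY, BLACK = 0, 1, 2

class Obj:
    def __init__(self, name):
        self.name = name
        self.fields = []
        self.color = WHITE

def run(use_barrier):
    a, b, c = Obj("a"), Obj("b"), Obj("c")
    b.fields.append(c)
    grey = []

    # State after the first marking slice: root a is fully scanned (BLACK),
    # root b is queued but not yet scanned (GREY), c is undiscovered (WHITE).
    a.color = BLACK
    b.color = GREY
    grey.append(b)

    def write_field(obj, val):
        # Insertion barrier: never let a BLACK object point at a WHITE one.
        if use_barrier and obj.color == BLACK and val.color == WHITE:
            val.color = GREY
            grey.append(val)
        obj.fields.append(val)

    # Mutator runs: it moves the only reference to c from b to a.
    write_field(a, c)
    b.fields.remove(c)

    # Marker resumes and drains the grey list.
    while grey:
        obj = grey.pop()
        for ref in obj.fields:
            if ref.color == WHITE:
                ref.color = GREY
                grey.append(ref)
        obj.color = BLACK

    return c.color

lost = run(use_barrier=False)   # c stays WHITE and would be swept while still reachable
kept = run(use_barrier=True)    # c ends BLACK and survives
print(lost == WHITE, kept == BLACK)
```

Without the barrier the sweep would free `c` even though `a` still points to it, which is a use-after-free in a real runtime.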
Decision criteria
| Situation | GC choice |
|---|---|
| Throughput matters, latency does not | Parallel GC (JVM) |
| Balanced (default) | G1GC (JVM), V8 default |
| Very low pauses (sub-ms to a few ms) | ZGC (JDK 15+, generational from JDK 21), Shenandoah |
| Embedded / real-time | Manual management / arena allocator |
| Functional language | Generational copying (OCaml, Erlang per-process heaps) |
Default: generational + concurrent mark-sweep (G1, V8 Orinoco), the standard in modern runtimes.
🔗 Graph
- Parent: Garbage Collection · Memory Management
- Alternative: Reference Counting
- Applications: V8 Engine · JVM
- Adjacent: Write Barrier · Memory_Leaks
🤖 LLM usage
When: you need the theoretical basis for GC algorithm design, runtime selection, or GC tuning. When not: tracking down an application-level memory leak; Memory_Leaks is more direct.
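❌ Anti-patterns
- "GC = no leaks": reachable leaks (objects that are still referenced but never used again) are the most common memory issue in GC languages.
- Frequent manual gc(): usually worse than the runtime's own heuristics, and it lowers throughput.
- Huge heap: mark time grows with the live set and sweep time with the heap size, so a large heap means long pauses under a non-concurrent GC.
- Relying on finalizers: their timing is unpredictable; use RAII / try-with-resources instead.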
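The first anti-pattern is easy to reproduce: a collector can only free what is unreachable, so a long-lived container that is never pruned leaks by design. A minimal sketch, assuming a hypothetical `Session` class and request handler; `weakref.WeakValueDictionary` is the standard-library type used for contrast, and the immediate cleanup shown relies on CPython's reference counting.

```python
import gc
import weakref

class Session:
    def __init__(self, user):
        self.user = user

strong_cache = {}                            # keeps every session reachable forever
weak_cache = weakref.WeakValueDictionary()   # entries vanish with the last strong reference

def handle_request(user):
    session = Session(user)
    strong_cache[user] = session
    weak_cache[user] = session

for i in range(1000):
    handle_request(f"user{i}")

gc.collect()
leaked = len(strong_cache)     # 1000: the GC cannot free objects that are still reachable

strong_cache.clear()           # drop the strong references
gc.collect()
remaining = len(weak_cache)    # 0: nothing keeps the sessions alive any more

print(leaked, remaining)
```

The fix for a reachable leak is always in the application (evict, bound the cache, or hold weak references), never in GC tuning.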
🧪 Verification / duplicates
- Verified (McCarthy 1960, "GC Handbook" Jones et al, V8 blog, OpenJDK docs).
- Trust level A.
🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: 2-phase + tri-color + V8/JVM/CPython patterns |