Files
Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization
10_Wiki/Topics 대규모 정리:
- 오류 캡처/미완성 stub 문서 227개 제거
- 교차폴더 중복 43클러스터 병합 (63파일 → redirect)
- 링크명 정규화: 깨진 링크 수정·redirect 직결·개념 매핑 ~2,400건
- 카테고리 MOC 6개 신규 생성
- Graph 섹션 미해결 related-keyword 링크 10,058건 제거

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:52:15 +09:00

182 lines
5.2 KiB
Markdown

---
id: wiki-2026-0508-mark-sweep
title: Mark-Sweep GC
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Mark and Sweep, Tracing GC, M&S]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [gc, memory, runtime, algorithm]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
language: multi
framework: jvm-v8-cpython
---
# Mark-Sweep GC
## 매 한 줄
> **"매 reachable 만 살리고 — 매 나머지는 sweep."**. Mark-Sweep 은 매 1960년 McCarthy 가 Lisp 위해 발명한 매 tracing GC algorithm. 매 root 부터 reachable object 를 mark 하고 매 unmarked 를 sweep. 매 reference counting 의 cycle 문제 해결 — 매 modern GC (V8, JVM, CPython generational) 의 base.
## 매 핵심
### 매 2-phase
1. **Mark**: 매 GC root (stack, globals, registers) 부터 매 graph traverse, 매 reachable 에 mark bit 1.
2. **Sweep**: 매 heap 전체 scan, 매 unmarked object 를 free list 에 추가.
### 매 GC root
- 매 stack 의 local variable.
- 매 global / static variable.
- 매 CPU register.
- 매 JNI / native handle.
### 매 변형
- **Mark-Compact**: sweep 대신 매 살아있는 object 를 압축 → 매 fragmentation 해결.
- **Tri-color**: white / grey / black — 매 incremental / concurrent 가능.
- **Generational**: young (Eden + Survivor) / old — 매 most objects die young 가설.
### 매 stop-the-world (STW)
- 매 mark phase 동안 매 mutator (app thread) 정지 — 매 reference graph 일관성.
- 매 concurrent / incremental GC 는 매 write barrier 로 STW 최소화.
## 💻 패턴
### Pseudocode mark
```python
def mark(obj):
if obj is None or obj.marked:
return
obj.marked = True
for ref in obj.references:
mark(ref)
def gc():
# 1. Mark
for root in get_roots():
mark(root)
# 2. Sweep
for obj in heap:
if obj.marked:
obj.marked = False # reset for next cycle
else:
free(obj)
```
### Tri-color iterative mark
```python
WHITE, GREY, BLACK = 0, 1, 2
def mark_iterative():
# Initially all WHITE except roots → GREY
grey = list(get_roots())
for r in grey:
r.color = GREY
while grey:
obj = grey.pop()
for ref in obj.references:
if ref.color == WHITE:
ref.color = GREY
grey.append(ref)
obj.color = BLACK
# Sweep: WHITE → free, BLACK → reset to WHITE
for obj in heap:
if obj.color == WHITE:
free(obj)
else:
obj.color = WHITE
```
### V8 (JavaScript) GC config
```javascript
// Inspect V8 GC behavior
node --trace-gc app.js
// → [GC] Scavenge ... Mark-sweep ...
// Tune heap size
node --max-old-space-size=4096 app.js // 4GB
// Trigger GC manually (testing only)
node --expose-gc -e "global.gc()"
```
### JVM G1GC vs ZGC
```bash
# G1GC (default since Java 9): regional mark-sweep with compaction
java -XX:+UseG1GC -Xmx8g -XX:MaxGCPauseMillis=200 App
# ZGC (Java 21+): concurrent, sub-ms pause
java -XX:+UseZGC -Xmx32g App
# → most work concurrent with mutator, STW phases <1ms
```
### CPython gc module
```python
import gc
# CPython uses refcount + mark-sweep (for cycles only)
gc.collect() # force cycle collection
gc.get_count() # (gen0, gen1, gen2) — generational
gc.set_threshold(700, 10, 10) # tune trigger
# Detect cycles
import objgraph
objgraph.show_most_common_types(limit=20)
```
### Write barrier (concurrent GC)
```c
// When mutator writes a pointer, GC must know
// (so concurrent mark doesn't miss new references)
void write_field(Object* obj, int idx, Object* val) {
if (gc_is_marking && obj->color == BLACK && val->color == WHITE) {
val->color = GREY;
grey_queue_push(val);
}
obj->fields[idx] = val;
}
```
## 매 결정 기준
| 상황 | GC choice |
|---|---|
| Throughput 중요, latency 무관 | Parallel GC (JVM) |
| 균형 (default) | G1GC (JVM), V8 default |
| 매 sub-ms pause | ZGC (JVM 21+), Shenandoah |
| Embedded / RT | Manual / arena allocator |
| Functional language | Generational copying (OCaml, Erlang per-process) |
**기본값**: 매 generational + concurrent mark-sweep (G1, V8 Orinoco) — 매 modern runtime 의 standard.
## 🔗 Graph
- 부모: [[Garbage Collection]] · [[Memory Management]]
- 변형: [[Reference Counting]]
- 응용: [[V8 Engine]] · [[JVM]]
- Adjacent: [[Write Barrier]] · [[Memory_Leaks]]
## 🤖 LLM 활용
**언제**: GC algorithm 설계, runtime 선택, GC tuning 의 이론적 basis 가 필요할 때.
**언제 X**: application-level memory leak 추적 — 매 [[Memory_Leaks]] 가 더 직접적.
## ❌ 안티패턴
- **"GC = no leaks"**: 매 reachable leak 이 매 GC lang 의 가장 흔한 issue.
- **잦은 manual gc()**: 매 runtime 의 heuristic 보다 못함, 매 throughput 떨어뜨림.
- **거대 heap**: 매 mark phase 가 heap size 에 비례 — 매 큰 heap = 큰 pause (non-concurrent GC).
- **Finalizer 의존**: 매 unpredictable timing — 매 RAII / try-with-resources 가 정답.
## 🧪 검증 / 중복
- Verified (McCarthy 1960, "GC Handbook" Jones et al, V8 blog, OpenJDK docs).
- 신뢰도 A.
## 🕓 Changelog
| 날짜 | 변경 |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — 2-phase + tri-color + V8/JVM/CPython 패턴 |