f8b21af4be
10_Wiki/Topics 대규모 정리: - 오류 캡처/미완성 stub 문서 227개 제거 - 교차폴더 중복 43클러스터 병합 (63파일 → redirect) - 링크명 정규화: 깨진 링크 수정·redirect 직결·개념 매핑 ~2,400건 - 카테고리 MOC 6개 신규 생성 - Graph 섹션 미해결 related-keyword 링크 10,058건 제거 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
182 lines
5.2 KiB
Markdown
182 lines
5.2 KiB
Markdown
---
|
|
id: wiki-2026-0508-mark-sweep
|
|
title: Mark-Sweep GC
|
|
category: 10_Wiki/Topics
|
|
status: verified
|
|
canonical_id: self
|
|
aliases: [Mark and Sweep, Tracing GC, M&S]
|
|
duplicate_of: none
|
|
source_trust_level: A
|
|
confidence_score: 0.9
|
|
verification_status: applied
|
|
tags: [gc, memory, runtime, algorithm]
|
|
raw_sources: []
|
|
last_reinforced: 2026-05-10
|
|
github_commit: pending
|
|
tech_stack:
|
|
language: multi
|
|
framework: jvm-v8-cpython
|
|
---
|
|
|
|
# Mark-Sweep GC
|
|
|
|
## 매 한 줄
|
|
> **"매 reachable 만 살리고 — 매 나머지는 sweep."**. Mark-Sweep 은 매 1960년 McCarthy 가 Lisp 위해 발명한 매 tracing GC algorithm. 매 root 부터 reachable object 를 mark 하고 매 unmarked 를 sweep. 매 reference counting 의 cycle 문제 해결 — 매 modern GC (V8, JVM, CPython generational) 의 base.
|
|
|
|
## 매 핵심
|
|
|
|
### 매 2-phase
|
|
1. **Mark**: 매 GC root (stack, globals, registers) 부터 매 graph traverse, 매 reachable 에 mark bit 1.
|
|
2. **Sweep**: 매 heap 전체 scan, 매 unmarked object 를 free list 에 추가.
|
|
|
|
### 매 GC root
|
|
- 매 stack 의 local variable.
|
|
- 매 global / static variable.
|
|
- 매 CPU register.
|
|
- 매 JNI / native handle.
|
|
|
|
### 매 변형
|
|
- **Mark-Compact**: sweep 대신 매 살아있는 object 를 압축 → 매 fragmentation 해결.
|
|
- **Tri-color**: white / grey / black — 매 incremental / concurrent 가능.
|
|
- **Generational**: young (Eden + Survivor) / old — 매 most objects die young 가설.
|
|
|
|
### 매 stop-the-world (STW)
|
|
- 매 mark phase 동안 매 mutator (app thread) 정지 — 매 reference graph 일관성.
|
|
- 매 concurrent / incremental GC 는 매 write barrier 로 STW 최소화.
|
|
|
|
## 💻 패턴
|
|
|
|
### Pseudocode mark
|
|
```python
|
|
def mark(obj):
|
|
if obj is None or obj.marked:
|
|
return
|
|
obj.marked = True
|
|
for ref in obj.references:
|
|
mark(ref)
|
|
|
|
def gc():
|
|
# 1. Mark
|
|
for root in get_roots():
|
|
mark(root)
|
|
# 2. Sweep
|
|
for obj in heap:
|
|
if obj.marked:
|
|
obj.marked = False # reset for next cycle
|
|
else:
|
|
free(obj)
|
|
```
|
|
|
|
### Tri-color iterative mark
|
|
```python
|
|
WHITE, GREY, BLACK = 0, 1, 2
|
|
|
|
def mark_iterative():
|
|
# Initially all WHITE except roots → GREY
|
|
grey = list(get_roots())
|
|
for r in grey:
|
|
r.color = GREY
|
|
|
|
while grey:
|
|
obj = grey.pop()
|
|
for ref in obj.references:
|
|
if ref.color == WHITE:
|
|
ref.color = GREY
|
|
grey.append(ref)
|
|
obj.color = BLACK
|
|
|
|
# Sweep: WHITE → free, BLACK → reset to WHITE
|
|
for obj in heap:
|
|
if obj.color == WHITE:
|
|
free(obj)
|
|
else:
|
|
obj.color = WHITE
|
|
```
|
|
|
|
### V8 (JavaScript) GC config
|
|
```javascript
|
|
// Inspect V8 GC behavior
|
|
node --trace-gc app.js
|
|
// → [GC] Scavenge ... Mark-sweep ...
|
|
|
|
// Tune heap size
|
|
node --max-old-space-size=4096 app.js // 4GB
|
|
|
|
// Trigger GC manually (testing only)
|
|
node --expose-gc -e "global.gc()"
|
|
```
|
|
|
|
### JVM G1GC vs ZGC
|
|
```bash
|
|
# G1GC (default since Java 9): regional mark-sweep with compaction
|
|
java -XX:+UseG1GC -Xmx8g -XX:MaxGCPauseMillis=200 App
|
|
|
|
# ZGC (Java 21+): concurrent, sub-ms pause
|
|
java -XX:+UseZGC -Xmx32g App
|
|
# → most work concurrent with mutator, STW phases <1ms
|
|
```
|
|
|
|
### CPython gc module
|
|
```python
|
|
import gc
|
|
|
|
# CPython uses refcount + mark-sweep (for cycles only)
|
|
gc.collect() # force cycle collection
|
|
gc.get_count() # (gen0, gen1, gen2) — generational
|
|
gc.set_threshold(700, 10, 10) # tune trigger
|
|
|
|
# Detect cycles
|
|
import objgraph
|
|
objgraph.show_most_common_types(limit=20)
|
|
```
|
|
|
|
### Write barrier (concurrent GC)
|
|
```c
|
|
// When mutator writes a pointer, GC must know
|
|
// (so concurrent mark doesn't miss new references)
|
|
void write_field(Object* obj, int idx, Object* val) {
|
|
if (gc_is_marking && obj->color == BLACK && val->color == WHITE) {
|
|
val->color = GREY;
|
|
grey_queue_push(val);
|
|
}
|
|
obj->fields[idx] = val;
|
|
}
|
|
```
|
|
|
|
## 매 결정 기준
|
|
| 상황 | GC choice |
|
|
|---|---|
|
|
| Throughput 중요, latency 무관 | Parallel GC (JVM) |
|
|
| 균형 (default) | G1GC (JVM), V8 default |
|
|
| 매 sub-ms pause | ZGC (JVM 21+), Shenandoah |
|
|
| Embedded / RT | Manual / arena allocator |
|
|
| Functional language | Generational copying (OCaml, Erlang per-process) |
|
|
|
|
**기본값**: 매 generational + concurrent mark-sweep (G1, V8 Orinoco) — 매 modern runtime 의 standard.
|
|
|
|
## 🔗 Graph
|
|
- 부모: [[Garbage Collection]] · [[Memory Management]]
|
|
- 변형: [[Reference Counting]]
|
|
- 응용: [[V8 Engine]] · [[JVM]]
|
|
- Adjacent: [[Write Barrier]] · [[Memory_Leaks]]
|
|
|
|
## 🤖 LLM 활용
|
|
**언제**: GC algorithm 설계, runtime 선택, GC tuning 의 이론적 basis 가 필요할 때.
|
|
**언제 X**: application-level memory leak 추적 — 매 [[Memory_Leaks]] 가 더 직접적.
|
|
|
|
## ❌ 안티패턴
|
|
- **"GC = no leaks"**: 매 reachable leak 이 매 GC lang 의 가장 흔한 issue.
|
|
- **잦은 manual gc()**: 매 runtime 의 heuristic 보다 못함, 매 throughput 떨어뜨림.
|
|
- **거대 heap**: 매 mark phase 가 heap size 에 비례 — 매 큰 heap = 큰 pause (non-concurrent GC).
|
|
- **Finalizer 의존**: 매 unpredictable timing — 매 RAII / try-with-resources 가 정답.
|
|
|
|
## 🧪 검증 / 중복
|
|
- Verified (McCarthy 1960, "GC Handbook" Jones et al, V8 blog, OpenJDK docs).
|
|
- 신뢰도 A.
|
|
|
|
## 🕓 Changelog
|
|
| 날짜 | 변경 |
|
|
|---|---|
|
|
| 2026-05-08 | Phase 1 |
|
|
| 2026-05-10 | Manual cleanup — 2-phase + tri-color + V8/JVM/CPython 패턴 |
|