"매 stack trace 의 SVG-ified hierarchy". Brendan Gregg (2011) 가 만든 매 visualization — 매 x축은 alphabetical (NOT time), 매 y축은 stack depth, 매 width 는 sample count. 매 hot path 가 매 wide flat plateau 로 즉시 보임. 매 2026 현재 perf, eBPF, py-spy, async-profiler, pprof, Pyroscope 등 매 모든 profiler 가 native output.
Key points
How to read
Width = time spent (proportional to sample count). Wide = hot.
Y = stack depth. Bottom = entry point, top = leaf.
Color = arbitrary (typically a random hue per function, for visual separation only).
Plateau at top = the leaf function itself is CPU-bound.
Tower = deep call chain (recursion or framework overhead).
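The reading rules above can be checked numerically against folded-stack text, the `root;child;leaf COUNT` format that the FlameGraph scripts consume. The following is a minimal sketch, not part of any profiler: `Summarize` and `FrameStats` are hypothetical names, and the sample profile is made up. `Total` corresponds to a frame's width in the graph and `Self` to the part of that width that forms a plateau at the top.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// FrameStats holds sample counts for one function name (hypothetical type).
type FrameStats struct {
	Self  int // samples where the function is the leaf (top of the graph)
	Total int // samples where the function appears anywhere in the stack
}

// Summarize parses folded-stack lines ("root;child;leaf COUNT") and returns
// per-function stats plus the total number of samples.
func Summarize(folded string) (map[string]*FrameStats, int, error) {
	stats := make(map[string]*FrameStats)
	total := 0
	for _, line := range strings.Split(folded, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		// The count is the last space-separated field; frame names may contain spaces.
		i := strings.LastIndex(line, " ")
		if i < 0 {
			return nil, 0, fmt.Errorf("malformed line: %q", line)
		}
		n, err := strconv.Atoi(line[i+1:])
		if err != nil {
			return nil, 0, fmt.Errorf("bad count in %q: %w", line, err)
		}
		frames := strings.Split(line[:i], ";")
		total += n
		seen := make(map[string]bool) // count a recursive frame once per stack
		for j, f := range frames {
			s := stats[f]
			if s == nil {
				s = &FrameStats{}
				stats[f] = s
			}
			if !seen[f] {
				s.Total += n
				seen[f] = true
			}
			if j == len(frames)-1 {
				s.Self += n
			}
		}
	}
	return stats, total, nil
}

func main() {
	// Made-up profile with 100 samples in total.
	folded := "main;handle;parseJSON 60\nmain;handle;writeResp 25\nmain;gc 15"
	stats, total, err := Summarize(folded)
	if err != nil {
		panic(err)
	}
	for _, name := range []string{"main", "handle", "parseJSON"} {
		s := stats[name]
		fmt.Printf("%-10s self=%3d total=%3d (%.0f%% of width)\n",
			name, s.Self, s.Total, 100*float64(s.Total)/float64(total))
	}
}
```

In this made-up profile, `handle` is wide (85 of 100 samples) but has no self time, while `parseJSON` carries 60 samples of self time. That is the wide plateau a reader would look for first.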
Variants
CPU flame graph: on-CPU samples only; the classic form.
Off-CPU flame graph: blocked time (I/O, lock waits); used for latency analysis.
Differential flame graph: the diff of two profiles; red = more samples in the second profile (slower), blue = fewer (faster).
Icicle (inverted): root at the top, growing downward; good for entry-point analysis.
Continuous profiling: Pyroscope / Grafana Phlare (since merged into Grafana Pyroscope) run always-on in production.
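To make the differential variant concrete, the sketch below compares two folded profiles stack by stack. It is an illustration under stated assumptions, not the algorithm used by any particular tool: `Diff` and `Delta` are hypothetical names, the inputs are made-up maps from folded stack to sample count, and both profiles are assumed to have comparable total sample counts (otherwise the counts should be normalized first).

```go
package main

import (
	"fmt"
	"sort"
)

// Delta describes how one folded stack changed between two profiles.
type Delta struct {
	Stack  string
	Before int
	After  int
}

// Change returns After-Before; a positive value would be drawn red
// (more samples), a negative value blue (fewer samples).
func (d Delta) Change() int { return d.After - d.Before }

func abs(x int) int {
	if x < 0 {
		return -x
	}
	return x
}

// Diff merges two folded profiles (stack -> sample count) and returns the
// deltas with the largest absolute change first, ties broken by stack name.
func Diff(before, after map[string]int) []Delta {
	keys := make(map[string]bool)
	for k := range before {
		keys[k] = true
	}
	for k := range after {
		keys[k] = true
	}
	out := make([]Delta, 0, len(keys))
	for k := range keys {
		// A stack missing from one profile counts as zero samples there.
		out = append(out, Delta{Stack: k, Before: before[k], After: after[k]})
	}
	sort.Slice(out, func(i, j int) bool {
		ai, aj := abs(out[i].Change()), abs(out[j].Change())
		if ai != aj {
			return ai > aj
		}
		return out[i].Stack < out[j].Stack
	})
	return out
}

func main() {
	before := map[string]int{"main;a": 10, "main;b": 30}
	after := map[string]int{"main;a": 40, "main;c": 5}
	for _, d := range Diff(before, after) {
		color := "red"
		if d.Change() < 0 {
			color = "blue"
		}
		fmt.Printf("%-8s %+d (%s)\n", d.Stack, d.Change(), color)
	}
}
```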
Tool mapping
Linux native: perf record -F 99 -g, plus Brendan Gregg's FlameGraph Perl scripts.
eBPF: bcc's profile tool, parca-agent; kernel and user stacks in one profile.
Python: py-spy record -o flame.svg --pid $PID.
JVM: asprof -e cpu -d 30 -f flame.html $PID (async-profiler 3.x; 2.x takes the same options via profiler.sh).
Go: go tool pprof -http=:8080 cpu.prof (built-in flame graph view).
Node.js: 0x or clinic flame.
💻 Patterns
Linux perf → flame graph
# 1. Sample 99 Hz for 30s, capture stacks
sudo perf record -F 99 -a -g -- sleep 30
# 2. Convert to folded format
sudo perf script | ~/FlameGraph/stackcollapse-perf.pl > out.folded
# 3. Render SVG
~/FlameGraph/flamegraph.pl out.folded > flame.svg
# Open in browser → click to zoom, search regex highlights
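The folding step in the pipeline above is what discards time order. As a rough illustration (not a reimplementation of stackcollapse-perf.pl), the sketch below counts identical stacks and sorts the results alphabetically. `Fold` is a hypothetical helper, and the input stacks are assumed to be already ordered root to leaf.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Fold counts identical stacks and returns folded lines ("a;b;c N"),
// sorted alphabetically by stack. The order in which samples arrived is lost.
func Fold(samples [][]string) []string {
	counts := make(map[string]int)
	for _, s := range samples {
		counts[strings.Join(s, ";")]++
	}
	stacks := make([]string, 0, len(counts))
	for stack := range counts {
		stacks = append(stacks, stack)
	}
	sort.Strings(stacks)
	lines := make([]string, 0, len(stacks))
	for _, stack := range stacks {
		lines = append(lines, fmt.Sprintf("%s %d", stack, counts[stack]))
	}
	return lines
}

func main() {
	// Made-up samples in the order they were taken.
	samples := [][]string{
		{"main", "b"},
		{"main", "a"},
		{"main", "b"},
		{"main", "a", "c"},
	}
	for _, line := range Fold(samples) {
		fmt.Println(line)
	}
}
```

Although `main;b` was sampled first, it ends up to the right of `main;a` in the output, which is why horizontal position in a flame graph says nothing about when code ran.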
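Go net/http/pprof → flame graph
import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
)

// In main(), start the pprof listener next to the real service:
go func() { log.Println(http.ListenAndServe("localhost:6060", nil)) }()

// Then on dev machine (service:6060 must reach the listener above, which binds to localhost only):
// go tool pprof -http=:8080 "http://service:6060/debug/pprof/profile?seconds=30"
// → opens browser, click "View" → "Flame Graph"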
Decision criteria
| Situation | Approach |
| --- | --- |
| Production continuous | Pyroscope / Grafana Phlare / Polar Signals |
| Linux ad-hoc | perf + FlameGraph |
| Python | py-spy (zero-instrumentation) |
| JVM | async-profiler (allocation + CPU + wall) |
| Go | built-in pprof + go tool pprof |
| Node | 0x or clinic flame |
| Latency / blocked | Off-CPU flame graph (eBPF) |
Default: Pyroscope in production, plus the language-native profiler in development.
When: identifying hot frames in a flame graph and proposing optimizations, turning folded text into a natural-language summary, interpreting differential graphs.
When not: pixel-exact visual reading; use the SVG itself.
❌ Anti-patterns
Sampling rate too low: at 19 Hz, short hot functions are missed. 99 Hz is the standard.
Without -g (no call graphs): omitting -g from perf record leaves only the sampled leaf frames, with no callers above or below them.
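No frame pointers (glibc and other libraries built with -fomit-frame-pointer; note that Go on amd64 has kept frame pointers since 1.7): stack unwinding fails or truncates. Build with -fno-omit-frame-pointer or unwind via DWARF.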
Reading the x-axis as time order: the x-axis is not time; frames are sorted alphabetically.
Production profiling once a year: misses the value of continuous profiling.
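The sampling-rate anti-pattern above comes down to simple arithmetic: the expected number of samples in a function is the sampling rate times the CPU time it consumes during the capture. The sketch below works through that for 19 Hz and 99 Hz. The function names are hypothetical, the 50 ms figure is made up, and the hit probability relies on a Poisson approximation (it assumes the function's execution is scattered independently of the sampling clock), so treat the numbers as a rough guide only.

```go
package main

import (
	"fmt"
	"math"
)

// ExpectedSamples is the mean number of samples that land in a function
// that spends onCPUSeconds on-CPU in total while sampled at rateHz.
func ExpectedSamples(rateHz, onCPUSeconds float64) float64 {
	return rateHz * onCPUSeconds
}

// HitProbability approximates the chance of catching the function at least once.
// ASSUMPTION: sample hits are modeled as a Poisson process, which ignores
// any correlation between the workload and the periodic sampling clock.
func HitProbability(rateHz, onCPUSeconds float64) float64 {
	return 1 - math.Exp(-ExpectedSamples(rateHz, onCPUSeconds))
}

func main() {
	const onCPU = 0.050 // made-up: 50 ms of CPU time inside the capture window
	for _, hz := range []float64{19, 99} {
		fmt.Printf("%3.0f Hz: expected samples %.2f, P(at least one) %.2f\n",
			hz, ExpectedSamples(hz, onCPU), HitProbability(hz, onCPU))
	}
}
```

Under these assumptions a function with 50 ms of CPU time yields under one expected sample at 19 Hz and about five at 99 Hz, so at the lower rate it can easily be absent from the graph altogether.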