id: wiki-2026-0508-flame-graphs
title: Flame Graphs
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Flamegraph, Stack Trace Visualization, Brendan Gregg Flame Graph
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: profiling, performance, observability, perf, ebpf
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language rust, framework perf-pyspy-pprof

Flame Graphs

In one line

"An SVG-ified hierarchy of stack traces." A visualization created by Brendan Gregg (2011): the x-axis is alphabetical (NOT time), the y-axis is stack depth, and width is sample count. Hot paths show up immediately as wide, flat plateaus. As of 2026, perf, eBPF tools, py-spy, async-profiler, pprof, and Pyroscope all either render flame graphs directly or emit stacks that convert to them.

Key points

How to read one

  • Width = time spent (proportional to sample count). Wide means hot.
  • Y = stack depth. The bottom frame is the entry point; the top frame is the leaf.
  • Color = arbitrary (typically a random warm hue per frame, for visual separation only).
  • Plateau at the top = a leaf function that is itself burning CPU.
  • Tower = a deep call chain (recursion or framework overhead).
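
The reading rules above follow from how the graph is built. As a minimal sketch (the sample data and the `frame_widths` helper are hypothetical, not part of any profiler), the following computes the width of every frame from folded stacks, the `frame;frame;frame count` text format that stackcollapse-perf.pl emits. It assumes every stack shares the single root frame `main`.

```python
from collections import defaultdict

# Hypothetical folded-stack input: "root;...;leaf <sample count>" per line.
FOLDED = """\
main;parse;read 10
main;parse;tokenize 30
main;render 50
main;parse;read 10
"""

def frame_widths(folded: str) -> dict[tuple[str, ...], int]:
    """Map each call path (a prefix of some stack) to its inclusive sample count."""
    widths: dict[tuple[str, ...], int] = defaultdict(int)
    for line in folded.splitlines():
        if not line.strip():
            continue
        stack, count = line.rsplit(" ", 1)
        frames = stack.split(";")
        # A sample counts toward every ancestor on its stack, which is why
        # a parent is always at least as wide as its children combined.
        for depth in range(1, len(frames) + 1):
            widths[tuple(frames[:depth])] += int(count)
    return dict(widths)

widths = frame_widths(FOLDED)
total = widths[("main",)]  # assumption: a single root frame
# Siblings are laid out left to right in alphabetical order, not in time order.
for path in sorted(widths):
    depth = len(path) - 1         # y position: 0 is the entry frame
    share = widths[path] / total  # x extent as a fraction of the graph
    print(f"{'  ' * depth}{path[-1]:<10} {widths[path]:>4} samples  {share:.0%}")
```

Note that the two `main;parse;read` lines merge into one frame: identical stacks are summed regardless of when they were sampled, which is exactly why the x-axis carries no time information.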

Variants

  • CPU flame graph: on-CPU samples only; the classic form.
  • Off-CPU flame graph: blocked time (I/O, lock waits); used for latency analysis.
  • Differential flame graph: the diff of two profiles; red = more samples (slower), blue = fewer samples (faster).
  • Icicle (inverted): drawn top-down; good for analysis that starts from the entry point.
  • Continuous profiling: Pyroscope / Grafana Phlare run always-on in production (Phlare was merged into Grafana Pyroscope in 2023).

Tool mapping

  1. Linux native: perf record -F 99 -g plus Brendan Gregg's FlameGraph Perl scripts.
  2. eBPF: bcc/profile, parca-agent; kernel and user stacks in one profile.
  3. Python: py-spy record -o flame.svg --pid $PID.
  4. JVM: async-profiler, e.g. ./profiler.sh -e cpu -d 30 -f flame.html $PID (the launcher is asprof in 3.x releases).
  5. Go: go tool pprof -http=:8080 cpu.prof (built-in flame graph).
  6. Node.js: 0x or clinic flame.

💻 Patterns

Linux perf → flame graph

# 1. Sample 99 Hz for 30s, capture stacks
sudo perf record -F 99 -a -g -- sleep 30

# 2. Convert to folded format
sudo perf script | \
  ~/FlameGraph/stackcollapse-perf.pl > out.folded

# 3. Render SVG
~/FlameGraph/flamegraph.pl out.folded > flame.svg

# Open in browser → click to zoom, search regex highlights
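
Once out.folded exists, plain text processing can answer "which function forms the widest plateau?" without opening the SVG. The sketch below is an illustration only: `self_samples` is a hypothetical helper, and the inline `sample` list stands in for the lines of a real out.folded file.

```python
from collections import Counter

def self_samples(folded_lines):
    """Sum samples per leaf frame, i.e. the function that was on-CPU when sampled."""
    leaves = Counter()
    for line in folded_lines:
        line = line.strip()
        if not line:
            continue
        stack, count = line.rsplit(" ", 1)
        # The last frame of a folded stack is the leaf (top of the flame graph).
        leaves[stack.rsplit(";", 1)[-1]] += int(count)
    return leaves

# Hypothetical stand-in for: open("out.folded")
sample = [
    "app;handle;json_encode 120",
    "app;handle;db_query;recv 15",
    "app;gc;json_encode 30",
    "app;handle 5",
]
leaves = self_samples(sample)
total = sum(leaves.values())
for func, n in leaves.most_common(3):
    print(f"{func:<12} {n:>4} samples ({n / total:.1%} of all samples)")
```

Summing by leaf name merges the same function across different callers, so it is closer to a "top functions" list than to any single plateau; treat it as a first pass before zooming into the SVG.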

Differential flame graph (before/after)

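# before.perf / after.perf are `perf script` text dumps of each run
~/FlameGraph/stackcollapse-perf.pl < before.perf > before.folded
~/FlameGraph/stackcollapse-perf.pl < after.perf  > after.folded
# Widths follow the "after" profile; red = more samples, blue = fewer.
# (--negate is only needed when the two files are passed in reverse order.)
~/FlameGraph/difffolded.pl before.folded after.folded | \
  ~/FlameGraph/flamegraph.pl > diff.svg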

Continuous profiling with Pyroscope (Go)

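package main

import (
    "log"

    "github.com/grafana/pyroscope-go"
)

func main() {
    // Start returns a profiler handle and an error; fail fast on bad config.
    _, err := pyroscope.Start(pyroscope.Config{
        ApplicationName: "checkout-service",
        ServerAddress:   "http://pyroscope:4040",
        Logger:          pyroscope.StandardLogger,
        Tags:            map[string]string{"region": "us-west-2"},
        ProfileTypes: []pyroscope.ProfileType{
            pyroscope.ProfileCPU,
            pyroscope.ProfileAllocObjects,
            pyroscope.ProfileInuseObjects,
        },
    })
    if err != nil {
        log.Fatalf("pyroscope: %v", err)
    }
    runServer() // placeholder for the application's own entry point
}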

py-spy on running Python service

# 30s sample, draw flame graph
py-spy record -o flame.svg --pid 12345 --duration 30 --rate 100

# Native + Python frames combined
py-spy record -o flame.svg --pid 12345 --native

# Top-like live view
py-spy top --pid 12345

async-profiler for JVM

# CPU profile (30s) → flamegraph HTML
./profiler.sh -e cpu -d 30 -f flame.html $(jps | grep MyApp | awk '{print $1}')

# Allocation profile
./profiler.sh -e alloc -d 60 -f alloc.html $PID

# Wall-clock (off-CPU + on-CPU)
./profiler.sh -e wall -t -d 30 -f wall.html $PID

Off-CPU flame graph (eBPF / bcc)

# Capture off-CPU stacks (blocked time) for 30s
sudo /usr/share/bcc/tools/offcputime -df -p $PID 30 > offcpu.folded
~/FlameGraph/flamegraph.pl --color=io --title="Off-CPU" \
  offcpu.folded > offcpu.svg

pprof flame graph (Go built-in)

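package main

import (
    "net/http"
    _ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
)

func main() {
    go func() { _ = http.ListenAndServe("localhost:6060", nil) }() // pprof endpoint
    select {} // stand-in for the real server loop
}
// Then on a dev machine (the listener is localhost-only, so forward port 6060 first):
// go tool pprof -http=:8080 "http://service:6060/debug/pprof/profile?seconds=30"
// → opens browser, click "View" → "Flame Graph"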

Decision criteria

| Situation | Approach |
| --- | --- |
| Production, continuous | Pyroscope / Grafana Phlare / Polar Signals |
| Linux, ad hoc | perf + FlameGraph |
| Python | py-spy (zero-instrumentation) |
| JVM | async-profiler (allocation + CPU + wall) |
| Go | built-in pprof + go tool pprof |
| Node | 0x or clinic flame |
| Latency / blocked | Off-CPU flame graph (eBPF) |

Default: Pyroscope in production, the language's native profiler in development.

🔗 Graph

🤖 Using LLMs

When to use: identifying hot frames in a flame graph and proposing optimizations, turning folded text into a natural-language summary, interpreting differentials. When not to: exact pixel-level reading of the visual; use the SVG itself.

Anti-patterns

  • Sampling rate too low: at 19 Hz, short hot functions are missed. 99 Hz is the standard.
  • Without -g (no call graphs): if -g is omitted from perf record, only the sampled leaf frame is visible, with no callers.
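  • No frame pointers (e.g. libraries built with -fomit-frame-pointer, such as many distro builds of glibc): stack unwinding fails and stacks come out truncated. Rebuild with -fno-omit-frame-pointer or unwind with DWARF (perf record --call-graph dwarf). Go has kept frame pointers on amd64 since 1.7, so pure-Go stacks are usually unaffected.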
  • Reading the x-axis as time order: the x-axis is not time; frames are sorted alphabetically so that identical stacks merge.
  • Production profiling once a year: this misses the value of continuous profiling.
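
The first anti-pattern is a matter of arithmetic: the expected number of samples landing in a function is roughly rate × duration × the function's share of CPU time. A small sketch with hypothetical numbers (a function using 2% of one CPU, profiled for 30 s; `expected_samples` is an illustrative helper):

```python
def expected_samples(rate_hz: float, duration_s: float, cpu_share: float) -> float:
    """Expected number of samples that land in a function using `cpu_share` of one CPU."""
    return rate_hz * duration_s * cpu_share

for rate in (19, 99):
    n = expected_samples(rate, duration_s=30, cpu_share=0.02)
    print(f"{rate:>3} Hz: ~{n:.1f} samples")
```

Under these assumptions the function receives about 11 samples at 19 Hz versus about 59 at 99 Hz. With so few samples, run-to-run noise can easily hide a small but real hot spot.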

🧪 Verification / duplicates

  • Verified (Brendan Gregg 2011, Pyroscope/Grafana Labs 2026).
  • Trust level: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: flame graph reading guide + perf/py-spy/pprof recipes |