[G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -1,66 +1,274 @@
 ---
 id: wiki-2026-0508-bottlenecks
-title: Bottlenecks
+title: Bottlenecks (Performance & Process)
 category: 10_Wiki/Topics
-status: needs_review
+status: verified
 canonical_id: self
-aliases: [P-Reinforce-AUTO-BOTT-001]
+aliases: [병목, bottleneck, theory of constraints, TOC, critical path, profiling]
 duplicate_of: none
 source_trust_level: A
-confidence_score: 0.98
-tags: [auto-reinforced, bottlenecks, Optimization, performance, constraint, Systems-Thinking]
+confidence_score: 0.93
+verification_status: applied
+tags: [performance, bottleneck, profiling, theory-of-constraints, optimization, scalability, latency]
 raw_sources: []
-last_reinforced: 2026-04-20
+last_reinforced: 2026-05-10
 github_commit: pending
-inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
+tech_stack:
+  language: any
+  framework: profiling tools
 ---

-# [[Bottlenecks|Bottlenecks]]
+# Bottlenecks

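-## 📌 One-Line Insight (The Karpathy Summary)
-> "The chokepoint of the system: however good the other parts are, this weakest point ends up setting the processing speed of the whole, and it is the vital spot on the map where optimization is most urgently needed."
+## 📌 One-Line Insight
+> **"The throat of the system."** The weakest link sets the throughput of the whole, so improving a non-bottleneck is wasted time. Goldratt's Theory of Constraints (TOC) handles this in five steps. In modern AI systems the bottleneck is often HBM bandwidth or the network.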

-## 📖 Structured Knowledge (Synthesized Content)
-A bottleneck is a state in which one part of a system cannot deliver enough capacity and therefore limits the flow of the whole system.
+## 📖 Core

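-1.  **Main types**:
-    *   **[[Hardware|Hardware]] Bottleneck**: data reads (I/O) are markedly slower than CPU computation.
-    *   **Software Bottleneck**: an inefficient algorithm or blocking code eats up execution time. (Linked to [[Blocking|Blocking]].)
-    *   **Human/Process Bottleneck**: an approval process is too long, or work that only a particular expert can do piles up.
-2.  **Resolution principle (TOC)**:
-    *   According to the Theory of Constraints, improving anything other than the bottleneck is simply wasted time. Overall performance rises only when the bottleneck itself is widened or protected.
+### Types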
+1. **Hardware**: CPU / GPU / RAM / disk / network.
+2. **Software**: algorithm / blocking / lock contention.
+3. **Process**: approval / single point of expertise.
+4. **Data**: schema / indexing / partitioning.
+5. **Cognitive** (team): meeting / context-switch.

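-## ⚠️ Contradictions & Updates
- **Conflict with past data**: It was once believed that improving every part evenly was the better policy, but modern system optimization favors "select and focus", precisely targeting only the bottleneck (RL Update).
- **Policy change (RL Update)**: In training and inference of very large AI models, memory bandwidth (HBM) or network bandwidth is often the real bottleneck rather than the algorithm, so securing hardware infrastructure has become central to AI competitiveness.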
+### Theory of Constraints (Goldratt)
+1. **Identify** the bottleneck.
+2. **Exploit** it (use 100%).
+3. **Subordinate** non-bottleneck (don't over-feed).
+4. **Elevate** it (invest to widen).
+5. **Repeat** (new bottleneck emerges).

-## 🔗 Knowledge Links (Graph)
- [[Blocking|Blocking]], [[Optimization|Optimization]], [[Theory of Constraints (TOC)|Theory of Constraints (TOC)]], [[Analysis|Analysis]], [[Scalability|Scalability]]
- **Modern Tech/Tools**: Performance profilers, Load [[Testing|Testing]] tools, Network analyzers.
---
+### Amdahl's Law (related)
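+- Speeding up 90% of the work by 100× still caps the overall speedup below 10×, because the untouched 10% remains.
+- This is why optimizing anything other than the bottleneck yields little.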
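
The cap in the first bullet follows from Amdahl's formula, S = 1 / ((1 - p) + p / s), where p is the fraction of runtime being accelerated and s is the speedup of that fraction. A minimal sketch for checking the numbers; the function names are illustrative and do not come from the note.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the runtime is accelerated s times."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


def amdahl_limit(p: float) -> float:
    """Upper bound on the overall speedup as s grows without limit."""
    return float("inf") if p == 1.0 else 1.0 / (1.0 - p)


if __name__ == "__main__":
    # 90% of the work made 100x faster
    print(f"speedup: {amdahl_speedup(0.9, 100):.2f}x")
    print(f"limit:   {amdahl_limit(0.9):.2f}x")
```

With p = 0.9 and s = 100 the result is about 9.17×, and no value of s can push it past 10×.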

-## 🤖 LLM Usage Hints (How to Use This Knowledge)
+### Modern hardware bottlenecks (LLM)
+- **HBM bandwidth**: about 3 TB/s on an H100 (SXM); typically the dominant limit in LLM inference.
+- **NVLink**: GPU-to-GPU traffic.
+- **Network** (RDMA, InfiniBand): distributed training.
+- **PCIe**: GPU-to-CPU transfers.
+- **Storage**: NVMe vs. spinning disks.
+- **Power / cooling**: datacenter-level limits.
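
Why HBM bandwidth tends to dominate inference can be seen with back-of-envelope arithmetic: at batch size 1, generating each token reads every weight once, so bandwidth divided by model size bounds tokens per second. The sketch below uses the rounded 3 TB/s figure from the list above; the 7B-parameter FP16 model is a hypothetical example, and the estimate ignores KV-cache traffic and compute time.

```python
def max_decode_tokens_per_sec(params: float, bytes_per_param: float,
                              bandwidth_bytes_per_sec: float) -> float:
    """Bandwidth-bound ceiling on single-stream decode speed."""
    model_bytes = params * bytes_per_param
    return bandwidth_bytes_per_sec / model_bytes


if __name__ == "__main__":
    # Hypothetical model: 7e9 parameters at 2 bytes each (FP16) = 14 GB of weights
    rate = max_decode_tokens_per_sec(7e9, 2, 3e12)
    print(f"{rate:.0f} tokens/s upper bound")
```

Under these assumptions the ceiling is roughly 214 tokens/s, regardless of how much compute the GPU has left over.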

-**When to use this knowledge:**
- *(TODO)*
+### Software bottlenecks
+- **CPU-bound**: compute-heavy work.
+- **I/O-bound**: waiting on disk or network.
+- **Memory-bound**: swapping or cache misses.
+- **Lock contention**: threads waiting on a mutex.
+- **GIL** (Python): CPU-bound bytecode runs on one thread at a time.
+- **N+1 query**: a typical ORM problem.
-**When not to use it:**
- *(TODO)*
+### Detection
+- **Profiler**: cProfile, perf, async-profiler.
+- **Trace**: distributed tracing (Jaeger).
+- **Metric**: CPU/mem/disk/network util.
+- **APM**: Datadog, NewRelic.
+- **Flame graph**.
+- **Critical path**.

-## 🧪 Validation Status
+### Process bottlenecks
+- Approval chains.
+- A single expert.
+- Environment provisioning.
+- Review SLAs.
+- Meeting cadence.

- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 automatic normalization. Body needs verification.)*
+→ Each of these is a component of DORA lead time.

-## 🧬 Duplicate Check
+### Data bottlenecks
+- A single hot row.
+- A missing index.
+- Cross-shard transactions.
+- Blocking schema migrations.
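
The missing-index case is easy to confirm from the query plan. The sketch below uses an in-memory SQLite database so it runs anywhere; the `users` table, its columns, and the index name are made up for illustration, and other databases expose the same idea through their own `EXPLAIN` output.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

sql = "SELECT name FROM users WHERE email = ?"
before = query_plan(conn, sql, ("user42@example.com",))  # no index: full scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = query_plan(conn, sql, ("user42@example.com",))   # index lookup

print("before:", before)
print("after: ", after)
```

The plan changes from a table scan to a search that names `idx_users_email`, which is the signal to look for before and after adding an index.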

- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (automatic normalization)
- **Reason:** Phase 1 normalization: backfilling the old template and missing fields.
+### Distributed bottlenecks (modern)
+- A single leader (Raft, Paxos).
+- Cross-region calls.
+- Synchronous replication.
+- Connection pool limits.
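
For the connection pool case, Little's law gives a quick ceiling: a pool of N connections, each held for an average of T seconds per request, cannot sustain more than N / T requests per second. A small sketch under that assumption; the function names and the example numbers are illustrative only.

```python
def pool_max_throughput(pool_size: int, avg_hold_seconds: float) -> float:
    """Requests per second a pool can sustain (Little's law: N / T)."""
    if avg_hold_seconds <= 0:
        raise ValueError("avg_hold_seconds must be positive")
    return pool_size / avg_hold_seconds


def pool_size_needed(target_rps: float, avg_hold_seconds: float) -> float:
    """Connections that must be in use on average to serve target_rps."""
    return target_rps * avg_hold_seconds


if __name__ == "__main__":
    # 20 connections, each held 50 ms per request
    print(f"{pool_max_throughput(20, 0.05):.0f} req/s ceiling")
    # Serving 1000 req/s at 50 ms per request
    print(f"{pool_size_needed(1000, 0.05):.0f} connections in use on average")
```

If the hold time includes a cross-region round trip or a synchronous replication wait, T grows and the ceiling drops, which is how the items in the list above compound.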

-## 🕓 Change History (Changelog)
+## 💻 Patterns

-| Date | Change | Handling | Trust |
-|------|-----------|-----------|--------|
-| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |
+### Profile (Python cProfile)
+```python
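+import cProfile
+import pstats
+
+# Stand-in workloads so the example runs; replace with real code
+def expensive_call():
+    return sum(i * i for i in range(1_000_000))
+
+def cheap_call():
+    return sum(range(1_000))
+
+def main():
+    expensive_call()
+    cheap_call()
+
+# Profile main() and write the raw stats to out.prof
+cProfile.run('main()', 'out.prof')
+# Sort by cumulative time so the call chain leading to the hot spot comes first
+stats = pstats.Stats('out.prof').sort_stats('cumulative')
+stats.print_stats(20)  # top 20 entries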
+```
+
+### Linux perf (system-level)
+```bash
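+# CPU profile: sample at 99 Hz with call stacks (-g) for 10 seconds
+perf record -F 99 -g -p $PID -- sleep 10
+perf report
+
+# Flame graph (the two .pl scripts come from Brendan Gregg's FlameGraph repository)
+perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > flame.svg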
+```
+
+### Async profiler (JVM)
+```bash
+# Sample lock contention for 30 s
+# (asprof is the async-profiler 3.x launcher; 2.x ships profiler.sh with the same flags)
+asprof -e lock -d 30 -f lock.html $PID
+
+# Wall-clock mode, which also captures time spent blocked on I/O
+asprof -e wall -d 30 -f wall.html $PID
+```
+
+### N+1 detect (Django)
+```python
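+import logging
+
+from django.db import connection
+from django.test.utils import CaptureQueriesContext
+
+logger = logging.getLogger(__name__)
+
+# Post is assumed to be a model with a ForeignKey named `author`
+with CaptureQueriesContext(connection) as ctx:
+    posts = Post.objects.all()
+    for post in posts:
+        print(post.author.name)  # N+1: one extra query per post
+
+# The threshold of 5 is arbitrary; tune it for the code under test
+if len(ctx.captured_queries) > 5:
+    logger.warning('N+1 detected: %d queries', len(ctx.captured_queries))
+
+# Fix: JOIN the author in the same query
+posts = Post.objects.select_related('author')  # 1 query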
+```
+
+### GPU bottleneck profile (PyTorch)
+```python
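+import torch.profiler as prof
+
+# `model` and `input` are assumed to already be on the GPU
+with prof.profile(
+    activities=[prof.ProfilerActivity.CPU, prof.ProfilerActivity.CUDA],
+    record_shapes=True,
+    profile_memory=True,
+) as p:
+    model(input)
+
+print(p.key_averages().table(sort_by='cuda_time_total', row_limit=20))
+
+# Kernels that dominate cuda_time_total while doing little arithmetic
+# (elementwise ops, small-batch matmuls) hint at an HBM bandwidth bottleneck.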
+```
+
+### Lock contention detection
+```python
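+import logging
+import threading
+import time
+
+logger = logging.getLogger(__name__)
+
+class LockMonitor:
+    """Context manager that records how long callers wait to acquire a lock."""
+
+    def __init__(self, lock):
+        self.lock = lock  # e.g. threading.Lock()
+        self.wait_times = []
+
+    def __enter__(self):
+        start = time.perf_counter()
+        self.lock.acquire()
+        self.wait_times.append(time.perf_counter() - start)
+        return self
+
+    def __exit__(self, *args):
+        self.lock.release()
+
+    def report(self):
+        if not self.wait_times:
+            return
+        avg = sum(self.wait_times) / len(self.wait_times)
+        if avg > 0.1:  # flag average waits above 100 ms
+            logger.warning('Lock contention: avg wait %.1f ms', avg * 1000)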
+```
+
+### Distributed trace (Jaeger)
+```python
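+from opentelemetry import trace
+from opentelemetry.sdk.trace import TracerProvider
+from opentelemetry.sdk.trace.export import BatchSpanProcessor
+from opentelemetry.exporter.jaeger.thrift import JaegerExporter
+
+# Send spans to a Jaeger agent; host and port are assumed local defaults.
+# Newer OpenTelemetry releases deprecate this exporter in favor of OTLP.
+provider = TracerProvider()
+provider.add_span_processor(
+    BatchSpanProcessor(JaegerExporter(agent_host_name='localhost', agent_port=6831))
+)
+trace.set_tracer_provider(provider)
+tracer = trace.get_tracer(__name__)
+
+@tracer.start_as_current_span('handle_request')
+def handle(req):
+    # Child span: its duration shows how much of the request the query takes
+    with tracer.start_as_current_span('db_query') as span:
+        span.set_attribute('db.statement', 'SELECT ...')
+        result = db.query(...)  # `db` is a placeholder for the application's client
+    return result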
+```
+
+→ The trace view makes the bottleneck span visually obvious.
+
+### Process bottleneck (workflow analysis)
+```python
+def analyze_workflow(stage_durations):
+    """Compare throughput across stages (durations in seconds per item)."""
+    rates = {stage: 1 / dur for stage, dur in stage_durations.items()}
+    bottleneck = min(rates, key=rates.get)
+    
+    overall_rate = rates[bottleneck]
+    waste = sum(r - overall_rate for r in rates.values() if r > overall_rate)
+    
+    return {
+        'bottleneck': bottleneck,
+        'overall_rate_per_min': overall_rate * 60,
+        'capacity_wasted': waste,
+    }
+```
+
+### Critical path (DAG)
+```python
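+import networkx as nx
+
+def critical_path(tasks):
+    """Return the task ids on the longest-duration path through the DAG."""
+    START = '__start__'  # synthetic source so root task durations are counted
+    G = nx.DiGraph()
+    for task in tasks:
+        # dag_longest_path reads *edge* weights, so each incoming edge
+        # carries the duration of the task it leads to.
+        for dep in (task.deps or [START]):
+            G.add_edge(dep, task.id, duration=task.duration)
+
+    path = nx.dag_longest_path(G, weight='duration')
+    return [node for node in path if node != START]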
+```
+
+## 🤔 Decision Criteria
+| Symptom | Tool |
+|---|---|
+| Slow request | APM + distributed trace |
+| CPU pegged | Flame graph (perf) |
+| GPU underutilized | Memory bandwidth (PyTorch profiler) |
+| Slow query | EXPLAIN + slow query log |
+| Lock contention | async-profiler -e lock |
+| Long lead time | Process / DORA analysis |
+| Thundering herd | Coordination check |
+
+**Default**: measure first, then optimize against a stated hypothesis.
+
+## 🔗 Graph
+- Parent: [[Performance-Engineering]] · [[System-Design]]
+- Variants: [[CPU-Bound]] · [[I-O-Bound]] · [[Memory-Bound]] · [[Lock-Contention]]
+- Applications: [[Theory-of-Constraints]] · [[Amdahls-Law]] · [[Critical-Path]]
+- Tool: [[Profiling]] · [[Flame-Graph]] · [[Distributed-Tracing]] · [[APM]]
+- Adjacent: [[Optimization]] · [[Scalability]] · [[DORA-Metrics]] · [[Bottleneck-Analysis]]
+
+## 🤖 LLM Usage
+**When**: performance optimization, capacity planning, incident root-cause analysis, process improvement.
+**When not**: optimizing without a hypothesis.
+
+## ❌ Anti-patterns
+- **Optimizing without measuring**: the effort lands in the wrong place.
+- **Improving a non-bottleneck**: wasted time (TOC).
+- **Investing equally in every part**: low ROI.
+- **Trusting a single profile**: one run may not be representative.
+- **Blaming people for process problems**: most are system issues.
+- **Premature optimization**: costs simplicity.
+
+## 🧪 Verification / Duplicates
+- Verified (Goldratt's TOC; Knuth on premature optimization; Brendan Gregg, *Systems Performance*).
+- Trust level: A.
+- Related: [[Amdahls-Law]] · [[Theory-of-Constraints]] · [[Profiling]] · [[Critical-Path]].
+
+## 🕓 Changelog
+| Date | Change |
+|---|---|
+| 2026-05-08 | Phase 1 |
+| 2026-05-10 | Manual cleanup: types, TOC, and profile / N+1 / GPU / trace code |