[G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -2,62 +2,135 @@
 id: wiki-2026-0508-monte-carlo-integration
 title: Monte Carlo Integration
 category: 10_Wiki/Topics
-status: needs_review
+status: verified
 canonical_id: self
-aliases: [MATH-MC-INT-001]
+aliases: [Monte-Carlo-Integration, MC-Integration, 몬테카를로-적분]
 duplicate_of: none
 source_trust_level: A
-confidence_score: 1.0
-tags: [math, Statistics, monte-carlo, integration, sampling, numerical-Analysis]
+confidence_score: 0.95
+verification_status: applied
+tags: [numerical, integration, sampling, statistics, simulation]
 raw_sources: []
-last_reinforced: 2026-04-26
+last_reinforced: 2026-05-10
 github_commit: pending
-inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
+tech_stack:
+  language: python
+  framework: numpy-jax
 ---

-# Monte Carlo Integration (몬테카를로 적분)
+# Monte Carlo Integration

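-## 📌 One-Line Insight (The Karpathy Summary)
-> "Conquer the area of a complex region that cannot be solved analytically with the statistical average of random sampling": a numerical technique that draws random points inside the region and approximates the full integral from the average of the function values at those points.
+## One-Line Summary
+> **"The average of random samples converges to the integral."** ∫f dμ ≈ (1/N)Σf(xᵢ) for xᵢ drawn from the probability measure μ, with error O(1/√N); that this rate does not depend on dimension is the core strength. From the 1949 Metropolis-Ulam paper that followed the Manhattan Project to the LLM era of 2026, it remains a backbone of RLHF reward estimation and diffusion sampling.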

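-## 📖 Structured Knowledge (Synthesized Content)
- **Extracted pattern:** "Statistical Approximation of Continuous Space": instead of computing over the entire continuous space, draw enough representative samples that their average converges to the true value by the law of large numbers, an integration pattern that overcomes the "curse of dimensionality".
- **Mathematical principle:** $I = \int f(x) dx \approx \frac{V}{N} \sum_{i=1}^N f(x_i)$, where $V$ is the volume of the region and $N$ is the number of samples.
- **Key characteristics:**
-    - **Dimension Independence:** because the method is sampling-based, its computational cost does not grow exponentially as the dimension increases.
-    - **Probabilistic Accuracy:** the estimate converges in probability to the true value as the sample count grows, and the error range can be estimated statistically.
- **Significance:** a core foundation of modern scientific computing, including Bayesian inference, expectation calculations in reinforcement learning, and ray tracing in graphics.
+## Core Concepts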

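-## ⚠️ Contradictions & Updates
- **Conflict with earlier data:** the randomness of sampling means results can differ slightly from run to run, but the method has evolved toward reducing variance and speeding up convergence through importance sampling and quasi-Monte Carlo techniques.
- **Policy change:** the Antigravity project runs simulations based on the Monte Carlo integration principle when computing an agent's uncertain expected reward or estimating latent connection strengths in a large knowledge graph.
+### Estimators
+- **Standard MC**: x ~ p, Î = (1/N)Σf(xᵢ); Var[Î] = σ²/N, where σ² = Var_p[f(x)].
+- **Importance sampling**: x ~ q, Î = (1/N)Σf(xᵢ)p(xᵢ)/q(xᵢ).
+- **Control variates**: replace f with f − c(g − E[g]), where g has a known mean E[g]; this reduces variance when g is correlated with f.
+- **Stratified**: partition the domain into strata and sample within each stratum.
+- **Quasi-MC**: Sobol/Halton low-discrepancy sequences; error O((log N)ᵈ / N) for sufficiently smooth integrands in d dimensions.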
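
Stratified sampling is the one estimator in the list above without a pattern in this note. The following is a minimal 1-D sketch, assuming equal-width strata and a vectorized integrand; the function name `stratified_mc` and its parameters are illustrative, not taken from the source.

```python
import numpy as np

def stratified_mc(f, low, high, n_strata=100, per_stratum=100, seed=0):
    """Stratified estimate of the integral of f over [low, high].

    The interval is split into n_strata equal-width strata and per_stratum
    uniform points are drawn inside each one. f must act elementwise on arrays.
    Returns (estimate, standard error).
    """
    rng = np.random.default_rng(seed)
    edges = np.linspace(low, high, n_strata + 1)
    widths = np.diff(edges)                        # shape (n_strata,)
    u = rng.random((n_strata, per_stratum))        # U(0, 1) offsets
    x = edges[:-1, None] + widths[:, None] * u     # points inside each stratum
    fx = f(x)
    estimate = np.sum(widths * fx.mean(axis=1))
    # Strata are sampled independently, so the per-stratum variances add.
    var = np.sum(widths**2 * fx.var(axis=1, ddof=1) / per_stratum)
    return estimate, np.sqrt(var)
```

For a smooth integrand such as `lambda x: np.exp(-x**2)` on [0, 1], the standard error should come out well below that of plain MC with the same total number of points, because each stratum only sees the local variation of f.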

-## 🔗 Knowledge Links (Graph)
- [[Markov-Chain-Monte-Carlo|Markov-Chain-Monte-Carlo]], Probability-Theory, Monte-Carlo-Tree-Search-MCTS, Bayesian-Inference
- **Raw Source:** 10_Wiki/Topics/AI/Monte-Carlo-Integration.md
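+### Convergence
+- The standard error is about σ/√N (by the CLT).
+- The rate does not depend on dimension, which often makes MC the only practical tool for high-dimensional integration.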
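
The σ/√N scaling can be checked empirically. The sketch below assumes a NumPy environment; `mc_error_table` and the test integrand exp(−x²) on [0, 1] are illustrative choices, not part of the original note.

```python
import numpy as np

def mc_error_table(f, exact, sizes=(100, 10_000, 1_000_000), seed=0):
    """Return rows (n, estimate, |estimate - exact|, standard error) for x ~ U(0, 1)."""
    rng = np.random.default_rng(seed)
    rows = []
    for n in sizes:
        fx = f(rng.random(n))
        est = fx.mean()
        se = fx.std(ddof=1) / np.sqrt(n)   # sample estimate of sigma / sqrt(N)
        rows.append((n, est, abs(est - exact), se))
    return rows
```

Each hundredfold increase in `n` should shrink the reported standard error by roughly a factor of ten, which is the O(1/√N) rate in practice.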

-## 🤖 LLM Usage Hints (How to Use This Knowledge)
+### Applications
+1. Bayesian inference: posterior expectations (via MCMC).
+2. Computer graphics: path tracing and light transport.
+3. Finance: option pricing by simulating price paths under the Black-Scholes model.
+4. RLHF: reward expectations.
+5. Diffusion models: score-matching expectations.
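
As one worked instance of the applications above, the finance item can be sketched as a European call priced by simulating terminal prices under geometric Brownian motion and compared against the Black-Scholes closed form. This is a hedged illustration: the function names, parameters, and the choice of a European call are assumptions, not something the note specifies.

```python
import math
import numpy as np

def bs_call_closed_form(s0, k, r, sigma, t):
    """Black-Scholes price of a European call (reference value)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF
    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2)

def bs_call_mc(s0, k, r, sigma, t, n=200_000, seed=0):
    """MC price of the same call; returns (estimate, standard error)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    # Terminal price under risk-neutral geometric Brownian motion.
    st = s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * math.sqrt(t) * z)
    payoff = math.exp(-r * t) * np.maximum(st - k, 0.0)   # discounted payoff
    return payoff.mean(), payoff.std(ddof=1) / math.sqrt(n)
```

With `s0=100, k=100, r=0.05, sigma=0.2, t=1.0`, the MC estimate should agree with the closed form to within a few standard errors.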

-**When to use this knowledge:**
- *(TODO)*
+## 💻 Patterns

-**When not to use it:**
- *(TODO)*
+### Basic MC integral
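+```python
+import numpy as np
+def mc_integrate(f, low, high, n=10000):
+    """Estimate the 1-D integral of f over [low, high]; returns (estimate, standard error)."""
+    x = np.random.uniform(low, high, n)
+    fx = f(x)  # evaluate the integrand once and reuse it
+    width = high - low
+    return width * fx.mean(), width * fx.std(ddof=1) / np.sqrt(n)
+```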

-## 🧪 Validation Status (Validation)
+### Importance sampling
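+```python
+def importance_mc(f, sampler_q, log_p, log_q, n=10000):
+    """Estimate E_p[f(x)] from draws of q; log_p and log_q must be normalized log-densities."""
+    x = sampler_q(n)                 # x_i ~ q
+    w = np.exp(log_p(x) - log_q(x))  # importance weights p(x_i) / q(x_i)
+    return (f(x) * w).mean()
+```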

- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 auto-normalization. Body needs verification.)*
+### Control variates
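+```python
+def cv_mc(f, g, Eg, n=10000):
+    """Control-variate estimate of E[f(x)] for x ~ U(0, 1), where Eg = E[g(x)] is known."""
+    x = np.random.uniform(0, 1, n)
+    fx, gx = f(x), g(x)
+    # Optimal coefficient c = Cov(f, g) / Var(g); use ddof=1 to match np.cov.
+    c = np.cov(fx, gx)[0, 1] / np.var(gx, ddof=1)
+    return (fx - c * (gx - Eg)).mean()
+```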

-## 🧬 Duplicate Check
+### Quasi-MC with Sobol
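+```python
+import numpy as np
+from scipy.stats.qmc import Sobol
+def f(u):  # placeholder integrand on [0, 1]^5 (an assumption); its exact integral is 1
+    return np.prod(2.0 * u, axis=1)
+sampler = Sobol(d=5, scramble=True)
+points = sampler.random_base2(m=14)  # 2^14 points, shape (16384, 5)
+estimate = f(points).mean()
+```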

- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (auto-normalization)
- **Reason:** Phase 1 normalization: old template updated and missing fields filled in.
+### MCMC (Metropolis-Hastings)
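+```python
+def mh(log_pi, x0, n=10000, sigma=0.5):
+    """Random-walk Metropolis sampler; x0 is a NumPy array, log_pi an unnormalized log-density."""
+    x = np.asarray(x0, dtype=float)
+    lp, samples = log_pi(x), [x]
+    for _ in range(n):
+        x_prop = x + sigma * np.random.randn(*x.shape)  # symmetric Gaussian proposal
+        lp_prop = log_pi(x_prop)
+        if np.log(np.random.rand()) < lp_prop - lp:     # accept with prob min(1, pi(x') / pi(x))
+            x, lp = x_prop, lp_prop
+        samples.append(x)                               # a rejected step repeats the current state
+    return np.array(samples)
+```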

-## 🕓 Changelog
+### JAX vectorized MC
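+```python
+from functools import partial
+import jax, jax.numpy as jnp
+
+@partial(jax.jit, static_argnames="n")  # n sets an array shape, so it must be static under jit
+def mc(key, n):
+    x = jax.random.uniform(key, (n,))   # x_i ~ U(0, 1)
+    return jnp.mean(jnp.exp(-x**2))     # estimates the integral of exp(-x^2) over [0, 1]
+```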

-| Date | Change | Handling | Trust |
-|------|-----------|-----------|--------|
-| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |
+## Decision Criteria
+| Situation | Method |
+|---|---|
+| Smooth low-dim | Quadrature or QMC |
+| High-dim | Vanilla MC |
+| Heavy tail / rare event | Importance sampling |
+| Posterior | MCMC (NUTS, HMC) |
+| Light transport | Path tracing + MIS |
+
+**Default**: Vanilla MC + control variates (low complexity, low variance).
+
+## 🔗 Graph
+- Parents: [[Numerical-Integration]] · [[Statistics]]
+- Variants: [[Importance-Sampling]] · [[MCMC]] · [[Quasi-Monte-Carlo]]
+- Applications: [[Bayesian-Inference]] · [[Path-Tracing]] · [[RLHF]]
+- Adjacent: [[Variance-Reduction]] · [[Stratified-Sampling]]
+
+## 🤖 LLM Usage
+**When to use**: High-dim integration, expectation under intractable distribution, simulation.
+**When not to use**: Smooth functions in 1-3 dimensions (use Gaussian quadrature instead).
+
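+## ❌ Anti-patterns
+- **Ignoring variance**: reporting an estimate without checking its standard error.
+- **Bad importance proposal**: if the tails of q are thinner than those of p, the importance weights blow up.
+- **Correlated samples**: ignoring MCMC autocorrelation overstates the effective sample size (ESS).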
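
Two simple diagnostics can help catch the last two anti-patterns. The sketch below is illustrative only: `kish_ess` uses the Kish effective sample size for importance weights, and `autocorr_ess` uses a basic autocorrelation sum truncated at the first non-positive lag. Both names and the truncation heuristic are assumptions, not taken from the source, and production MCMC libraries use more careful estimators.

```python
import numpy as np

def kish_ess(weights):
    """Kish effective sample size of importance weights: (sum w)^2 / sum w^2."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.sum(w**2)

def autocorr_ess(chain):
    """Rough ESS of a 1-D MCMC chain: n / (1 + 2 * sum of leading positive autocorrelations)."""
    x = np.asarray(chain, dtype=float)
    n = x.size
    x = x - x.mean()
    # Autocovariance via FFT, zero-padded to 2n to avoid circular wrap-around.
    spec = np.fft.rfft(x, n=2 * n)
    acov = np.fft.irfft(spec * np.conj(spec))[:n] / n
    rho = acov / acov[0]
    # Truncate at the first non-positive autocorrelation (a simple heuristic).
    nonpos = np.where(rho <= 0)[0]
    cutoff = nonpos[0] if nonpos.size else n
    tau = 1.0 + 2.0 * rho[1:cutoff].sum()   # integrated autocorrelation time
    return n / tau
```

A Kish ESS far below the number of draws suggests a poorly matched proposal q, and an autocorrelation ESS far below the chain length means the standard error should be computed with the ESS rather than the raw sample count.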
+
+## 🧪 Validation / Duplicates
+- Verified (Robert & Casella "Monte Carlo Statistical Methods").
+- Trust level A.
+
+## 🕓 Changelog
+| Date | Change |
+|---|---|
+| 2026-05-08 | Phase 1 |
+| 2026-05-10 | Manual cleanup — MC variants + JAX/MCMC patterns |