[G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -2,118 +2,181 @@
 id: wiki-2026-0508-bayesian-inference
 title: Bayesian Inference
 category: 10_Wiki/Topics
-status: needs_review
+status: verified
 canonical_id: self
-aliases: []
+aliases: [Bayesian Inference, Bayesian Statistics, Posterior Inference]
 duplicate_of: none
 source_trust_level: A
-confidence_score: 0.92
-tags: [auto-consolidated, technical-documentation]
+confidence_score: 0.95
+verification_status: applied
+tags: [statistics, ml, probabilistic, mcmc]
 raw_sources: []
-last_reinforced: 2026-05-08
+last_reinforced: 2026-05-10
 github_commit: pending
-inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
 tech_stack:
-  language: unspecified
-  framework: unspecified
+  language: python
+  framework: pymc,numpyro,stan
 ---

-# Bayesian-Inference
+# Bayesian Inference

-## 📌 One-line insight (The Karpathy Summary)
-> "Beliefs are not fixed; they evolve with information." This is the statistical insight that adding new evidence to existing background knowledge (the prior) moves us toward a more accurate truth (the posterior).
+## One-line summary
+> **"Posterior ∝ prior × likelihood: beliefs are updated by evidence."** The idea originates with Bayes (1763); frequentist methods dominated the 20th century; as of 2026, NumPyro/PyMC, GPU-accelerated MCMC, and variational inference are mainstream, and Bayesian methods underpin uncertainty quantification for LLMs.

---
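+## Core concepts

-Bayesian inference is a statistical inference methodology, based on Bayes' theorem, that keeps updating the probability (credence) of a hypothesis as new evidence is collected [1, 2]. It is the core principle by which intelligent systems learn incrementally and revise their world view in uncertain environments [1].
+### Bayes' rule
+P(θ|D) = P(D|θ) P(θ) / P(D)
+- **P(θ)**: prior, the belief about θ before seeing the data.
+- **P(D|θ)**: likelihood, how well the model with parameters θ explains the data.
+- **P(θ|D)**: posterior, the updated belief.
+- **P(D)**: evidence (marginal likelihood), the normalizing constant ∫ P(D|θ) P(θ) dθ.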
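+
+As a worked instance of the rule above (an illustrative sketch, not part of the original note): take a Bernoulli likelihood with $k$ successes in $n$ trials and a $\mathrm{Beta}(\alpha, \beta)$ prior, where $k$, $n$, $\alpha$, and $\beta$ are symbols assumed here for the example. The evidence $P(D)$ does not depend on $\theta$, so it can be dropped and restored by normalization:
+
+$$
+\begin{aligned}
+P(\theta \mid D) &\propto P(D \mid \theta)\, P(\theta) \\
+&\propto \theta^{k} (1-\theta)^{n-k} \cdot \theta^{\alpha-1} (1-\theta)^{\beta-1} \\
+&= \theta^{\alpha+k-1} (1-\theta)^{\beta+n-k-1},
+\end{aligned}
+\qquad\Rightarrow\qquad
+\theta \mid D \sim \mathrm{Beta}(\alpha + k,\ \beta + n - k).
+$$
+
+With a uniform $\mathrm{Beta}(1, 1)$ prior and 7 successes in 10 trials, the posterior is $\mathrm{Beta}(8, 4)$, whose mean is $8/12 \approx 0.67$.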

-## 📖 Structured knowledge (Synthesized Content)
- **Prior Probability**:
-    - The probability of the knowledge or hypothesis we already hold before seeing new data.
- **Likelihood**:
-    - The probability of the observed data given that a hypothesis is true.
- **Posterior Probability**:
-    - Our final belief, updated after incorporating the new data.
- **Application**:
-    - Essential for decision-making in highly uncertain settings such as spam filtering, medical diagnosis, and sensor fusion in autonomous vehicles.
+### Four inference methods
+- **Conjugate**: closed-form posterior (Beta-Bernoulli, Gaussian-Gaussian).
+- **MCMC**: HMC, NUTS, Gibbs; asymptotically exact but slow.
+- **Variational (VI)**: approximates the posterior with a simpler family; fast but biased.
+- **Sequential MC**: particle filters, for streaming data and state-space models.
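+
+The speed and bias trade-off of VI noted above can be read off the standard evidence decomposition, sketched here with $q(\theta)$ denoting the approximating distribution (notation assumed for this example):
+
+$$
+\log P(D) = \underbrace{\mathbb{E}_{q(\theta)}\!\left[\log P(D \mid \theta) + \log P(\theta) - \log q(\theta)\right]}_{\mathrm{ELBO}(q)} + \mathrm{KL}\!\left(q(\theta)\,\|\,P(\theta \mid D)\right).
+$$
+
+Since $\log P(D)$ is fixed, maximizing the ELBO by gradient ascent minimizes the KL term, which turns inference into an optimization problem rather than a sampling problem. The KL term reaches zero only if the chosen family contains the true posterior; otherwise the remaining gap is the bias.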

---
+### Applications
+1. A/B testing (a Bayesian alternative to frequentist tests).
+2. Hierarchical (multi-level) models for grouped data.
+3. ML calibration and uncertainty (BNNs, Gaussian processes).
+4. LLM logit calibration and RAG confidence.

-* **Bayesian Updating**
-  - **Prior**: the existing credence before observing new data [1, 3].
-  - **Likelihood**: the probability of the observed data given that the hypothesis is true [1].
-  - **Posterior**: the final credence, updated to reflect the new evidence [1, 4].
-  - Through this process the system does not overreact to a single noisy data point and instead revises its knowledge gradually, following the overall trend [1, 5].
+## 💻 Patterns

-* **Use in intelligent systems**
-  - **Active Learning**: judges which data would change the posterior the most and selects what to learn from efficiently [1].
-  - **Bayesian Brain Hypothesis**: the theory that the human brain actively processes sensory information and predicts the future through probability distributions; it is a motif in the design of modern AI situational-judgment modules [1, 6].
+### NumPyro: Bayesian linear regression (NUTS)
+```python
+import numpyro
+import numpyro.distributions as dist
+from numpyro.infer import NUTS, MCMC
+import jax.numpy as jnp
+import jax.random as random

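-## ⚠️ Contradictions & Updates
- Bayesian inference is sometimes criticized because setting the prior involves subjective judgment (the debate with frequentist statistics). In the early, data-poor regime, however, no prediction tool is as powerful as Bayes.
+def model(X, y=None):
+    # Weakly informative priors on the intercept, coefficients, and noise scale
+    alpha = numpyro.sample("alpha", dist.Normal(0., 10.))
+    beta  = numpyro.sample("beta",  dist.Normal(jnp.zeros(X.shape[1]), 1.))
+    sigma = numpyro.sample("sigma", dist.HalfNormal(1.))
+    mu    = alpha + X @ beta  # linear predictor
+    numpyro.sample("obs", dist.Normal(mu, sigma), obs=y)  # sampled rather than conditioned when y is None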

---
-
- **Subjectivity of the prior**: critics note that results can depend on the initial prior, but once enough data accumulates the posterior converges toward what the data support [1, 7].
- **Computational complexity**: computing the Bayesian integral directly is very hard for complex models, so approximation techniques such as MCMC (Markov Chain Monte Carlo) and Variational Inference are widely used [1].
-
-## 🔗 Knowledge links (Graph)
- Related: [[Automated-Reasoning|Automated-Reasoning]] , [[Behavioral-Economics|Behavioral-Economics]]
- Foundation: Computational Theory & Math/Information Theory
-
---
-
- **Related Topics**: Bayes' Theorem, Probability Theory, Active Learning, Predictive Coding
- **Projects/Contexts**: Antigravity situational-judgment engine, hyper-personalized recommendation algorithms
-
---
-*Last updated: 2026-04-30*
-
-## 🤖 LLM usage hints (How to Use This Knowledge)
-
-**When to use this knowledge:**
- *(TODO)*
-
-**When not to use it:**
- *(TODO)*
-
-## 🧪 Validation status (Validation)
-
- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 automatic normalization. Body needs verification.)*
-
-## 🧬 Duplicate Check
-
- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (automatic normalization)
- **Reason:** Phase 1 normalization; fills in old-template and missing fields.
-
-## 🕓 Changelog
-
-| Date | Change | Handling | Trust |
-|------|-----------|-----------|--------|
-| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |
-
-## 💻 Code Patterns
-
-**Pattern 1:** *(TODO: structural skeleton reflecting this project's conventions)*
-
-```text
-# TODO
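+# X: (n, d) design matrix and y: (n,) targets are assumed to be defined by the caller.
+mcmc = MCMC(NUTS(model), num_warmup=1000, num_samples=2000, num_chains=4)
+mcmc.run(random.PRNGKey(0), X, y)
+mcmc.print_summary()  # per-parameter summary, including n_eff and r_hat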
 ```

-## 🤔 Decision Criteria
+### PyMC: Bayesian A/B test (Beta-Binomial)
+```python
+import pymc as pm

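-**When to use option A:**
- *(TODO)*
+# n_a, n_b (visitors) and conv_a, conv_b (conversions) are assumed to be defined.
+with pm.Model():
+    p_a = pm.Beta("p_a", alpha=1, beta=1)  # uniform prior on each conversion rate
+    p_b = pm.Beta("p_b", alpha=1, beta=1)
+    pm.Binomial("obs_a", n=n_a, p=p_a, observed=conv_a)
+    pm.Binomial("obs_b", n=n_b, p=p_b, observed=conv_b)
+    diff = pm.Deterministic("lift", p_b - p_a)  # absolute difference in conversion rate
+    idata = pm.sample(2000, tune=1000, chains=4)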

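-**When to use option B:**
- *(TODO)*
+# Fraction of posterior draws with lift > 0, i.e. the posterior probability that B beats A
+prob_b_better = (idata.posterior["lift"] > 0).mean().item()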
+```

-**Default:**
-> *(TODO)*
+### Hierarchical model (varying intercept)
+```python
+import numpyro
+import numpyro.distributions as dist
+
+def hierarchical(X, group_idx, y=None, n_groups=10):
+    # X is a single predictor of shape (n,); group_idx maps each row to its group
+    mu_a    = numpyro.sample("mu_a",    dist.Normal(0., 5.))      # population mean of intercepts
+    sigma_a = numpyro.sample("sigma_a", dist.HalfNormal(1.))      # between-group spread
+    a       = numpyro.sample("a", dist.Normal(mu_a, sigma_a).expand([n_groups]))  # per-group intercepts
+    beta    = numpyro.sample("beta", dist.Normal(0., 1.))         # shared slope
+    sigma   = numpyro.sample("sigma", dist.HalfNormal(1.))
+    mu      = a[group_idx] + beta * X
+    numpyro.sample("obs", dist.Normal(mu, sigma), obs=y)
+```

-## ❌ Anti-Patterns
+### Variational inference (SVI)
+```python
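+import jax.random as random
+import numpyro
+from numpyro.infer import SVI, Trace_ELBO
+from numpyro.infer.autoguide import AutoNormal

- **[Anti-pattern]:** *(TODO: what not to do + why + what to do instead)*
+# `model`, X, and y are assumed to be the regression model and data from the NUTS example
+guide = AutoNormal(model)  # mean-field Gaussian approximation to the posterior
+svi   = SVI(model, guide, numpyro.optim.Adam(1e-3), Trace_ELBO())
+state = svi.init(random.PRNGKey(0), X, y)
+
+for i in range(2000):
+    state, loss = svi.update(state, X, y)  # one optimizer step on the negative ELBO
+params = svi.get_params(state)  # fitted variational parameters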
+```
+
+### Conjugate update (Beta-Bernoulli online)
+```python
+from scipy.stats import beta as beta_dist
+
+class BetaBernoulli:
+    def __init__(self, alpha=1, beta=1):
+        self.alpha, self.beta = alpha, beta  # Beta(1, 1) is the uniform prior

+    def update(self, success: bool):
+        # Conjugate update: a success increments alpha, a failure increments beta
+        self.alpha += int(success)
+        self.beta  += int(not success)

+    def mean(self):
+        return self.alpha / (self.alpha + self.beta)
+
+    def credible_interval(self, q=0.95):
+        # Equal-tailed credible interval from the Beta posterior
+        return beta_dist.interval(q, self.alpha, self.beta)
+```
+
+### Bayesian neural net (Pyro Bayesian layer)
+```python
+import torch
+import pyro
+import pyro.distributions as dist  # Pyro's distributions, not numpyro's
+import pyro.nn as pnn

+class BNN(pnn.PyroModule):
+    def __init__(self, in_d, out_d):
+        super().__init__()
+        self.linear = pnn.PyroModule[torch.nn.Linear](in_d, out_d)
+        # Standard-normal priors over the weights and biases
+        self.linear.weight = pnn.PyroSample(dist.Normal(0., 1.).expand([out_d, in_d]).to_event(2))
+        self.linear.bias   = pnn.PyroSample(dist.Normal(0., 1.).expand([out_d]).to_event(1))

+    def forward(self, x, y=None):
+        mean = self.linear(x).squeeze(-1)  # assumes out_d == 1
+        with pyro.plate("data", x.shape[0]):
+            pyro.sample("obs", dist.Normal(mean, 0.1), obs=y)  # fixed observation noise of 0.1
+        return mean
+```
+
+## Decision criteria
+| Situation | Method |
+|---|---|
+| Conjugate model, streaming updates | Conjugate update |
+| Small model, accurate posterior needed | NUTS/HMC |
+| Large data, fast approximation | SVI, ADVI |
+| State-space or time-series model | Particle filter |
+| Deep model at scale | BNN with variational inference, MC dropout |
+| Hyperparameter optimization | Gaussian process with an acquisition function |
+
+**Default**: NumPyro with NUTS, which is fast on GPU via JAX.
+
+## 🔗 Graph
+- Parents: [[Probability Theory]] · [[Statistical Inference]]
+- Variants: [[MCMC]] · [[Variational Inference]] · [[Empirical Bayes]]
+- Applications: [[Belief-System]] · [[Bayesian A/B Test]] · [[Bayesian Neural Network]]
+- Adjacent: [[Frequentist Inference]] · [[Causal Inference]]
+
+## 🤖 LLM usage
+**When to use**: drafting prior and likelihood specifications for a model, interpreting posterior plots, and generating NumPyro/PyMC code.
+**When not to use**: convergence diagnostics (R-hat, ESS, trace plots). LLM statistical judgment is unreliable here, so a statistician's review is required.
+
+## ❌ Anti-patterns
+- **Always using flat priors**: weak data plus a flat prior gives an unstable posterior. Use weakly informative priors instead.
+- **No convergence check**: if R-hat > 1.01 or ESS < 400, the posterior estimates should not be trusted.
+- **Single-chain MCMC**: running multiple chains and checking that they mix is mandatory.
+- **Reporting only a posterior point estimate**: use the whole distribution, for example credible intervals and posterior predictive checks.
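+
+For reference, the classic (non-split) form of the R-hat statistic named above can be written as follows. This is a sketch of the textbook definition for $m$ chains of $n$ draws each, with $\bar\theta_j$ and $s_j^2$ the mean and variance of chain $j$ and $\bar\theta$ the mean over all chains (symbols assumed here); current libraries typically report a rank-normalized split variant, so their values can differ slightly:
+
+$$
+B = \frac{n}{m-1} \sum_{j=1}^{m} (\bar\theta_j - \bar\theta)^2, \qquad
+W = \frac{1}{m} \sum_{j=1}^{m} s_j^2, \qquad
+\hat{R} = \sqrt{\frac{\frac{n-1}{n} W + \frac{1}{n} B}{W}}.
+$$
+
+When all chains sample the same distribution, the between-chain term $B/n$ is small relative to $W$ and $\hat{R}$ approaches 1. With a single chain ($m = 1$), $B$ is undefined, which is why multiple chains are needed for this check.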
+
+## 🧪 Validation / duplicates
+- Verified against Gelman et al., *Bayesian Data Analysis* (3rd ed.), the NumPyro/PyMC/Stan docs, and McElreath, *Statistical Rethinking* (2nd ed.).
+- Trust level: A.
+
+## 🕓 Changelog
+| Date | Change |
+|---|---|
+| 2026-05-08 | Phase 1 |
+| 2026-05-10 | Manual cleanup: Bayes rule, 4 methods, NumPyro/PyMC patterns |