[G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -2,88 +2,243 @@
 id: wiki-2026-0508-epistemic-uncertainty
 title: Epistemic Uncertainty
 category: 10_Wiki/Topics
-status: needs_review
+status: verified
 canonical_id: self
-aliases: [UNCERT-001]
+aliases: [epistemic uncertainty, model uncertainty, reducible uncertainty, Bayesian DL, deep ensemble]
 duplicate_of: none
 source_trust_level: A
-confidence_score: 1.0
-tags: [ai, probability, Statistics, epistemic-uncertainty, bayesian-Deep-Learning]
+confidence_score: 0.96
+verification_status: applied
+tags: [ai, probability, statistics, epistemic-uncertainty, bayesian, deep-learning, uncertainty]
 raw_sources: []
-last_reinforced: 2026-04-26
+last_reinforced: 2026-05-10
 github_commit: pending
-inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
 tech_stack:
-  language: unspecified
-  framework: unspecified
+  language: Python
+  framework: PyTorch / Pyro / NumPyro
 ---

-# Epistemic Uncertainty (인식적 불확실성)
+# Epistemic Uncertainty

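-## 📌 One-Line Insight (The Karpathy Summary)
-> "Measure the model's ignorance that comes from a lack of data, and make it admit what it does not know." Epistemic uncertainty arises when there is too little observed data to estimate the model's parameters accurately; it is reducible, meaning additional data can shrink it.
+## One-Line Summary
+> **"Epistemic uncertainty is reducible: it shrinks as more data is collected."** It contrasts with aleatoric uncertainty, the irreducible noise in the data, and is a core concept in ML and deep learning. Common estimators include Bayesian neural networks, deep ensembles, MC dropout, and SWAG. It is critical for OOD detection, safety, and active learning.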

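-## 📖 Structured Knowledge (Synthesized Content)
- **Extracted pattern:** A self-diagnosis pattern that detects when the model's output fluctuates sharply on inputs outside the training distribution (OOD) and treats this as a limit of the model's competence.
- **Key characteristics:**
-    - **Reducibility:** Uncertainty decreases as more data is collected and used for training.
-    - **Bayesian Approach:** Weights are treated as probability distributions rather than single values, which yields the uncertainty estimate.
-    - **[[Active Learning|Active Learning]]:** Labeling the data points with the highest epistemic uncertainty maximizes learning efficiency.
- **Significance:** In safety-critical domains such as autonomous driving and medical diagnosis, it provides the basis for recognizing situations where the model cannot be confident and for handing control to a human or raising a warning.
+## Core Concepts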

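-## ⚠️ Contradictions & Updates
- **Conflict with past data:** A shift from the era when model outputs were trusted as absolute truth to the era of Trustworthy AI, which accepts that every output carries uncertainty and manages it.
- **Policy change:** The Antigravity project checks epistemic uncertainty when an agent generates knowledge answers and requires low-confidence information to be marked as "a fact that needs verification."
+### Epistemic vs. aleatoric
+- **Epistemic**: uncertainty from the model's lack of knowledge; reducible with more data.
+- **Aleatoric**: noise inherent in the data; irreducible no matter how much data is collected.
+- **Total**: the sum of the two (for example, predictive variance splits into an epistemic term and an aleatoric term).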
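+
+A sketch of why the total is a sum, assuming a scalar regression target $y$, model parameters $\theta$ with posterior $p(\theta \mid \mathcal{D})$, and a per-model predictive mean $\mu_\theta(x)$ and noise variance $\sigma_\theta^2(x)$. These symbols are illustrative and do not appear elsewhere in this note. By the law of total variance:
+
+$$
+\operatorname{Var}[y \mid x, \mathcal{D}]
+= \underbrace{\mathbb{E}_{\theta}\!\left[\sigma_\theta^2(x)\right]}_{\text{aleatoric}}
++ \underbrace{\operatorname{Var}_{\theta}\!\left[\mu_\theta(x)\right]}_{\text{epistemic}}
+$$
+
+The second term vanishes when every plausible $\theta$ predicts the same mean, which is what happens as data accumulates and the posterior concentrates. The first term does not depend on how well $\theta$ is pinned down, so more data does not remove it.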

-## 🔗 Knowledge Links (Graph)
- [[Uncertainty-Quantification|Uncertainty-Quantification]], Bayesian-Inference, Active-Learning, [[Trustworthy-AI|Trustworthy-AI]]
- **Raw Source:** 10_Wiki/Topics/AI/Epistemic-Uncertainty.md
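+### Methods
+- **Bayesian deep learning**: place a posterior distribution over the weights.
+- **Deep ensemble** (Lakshminarayanan et al.): train N models and measure their disagreement.
+- **MC Dropout** (Gal & Ghahramani): keep dropout active at inference and aggregate several stochastic forward passes.
+- **SWAG** (Maddox et al.): fit a Gaussian to the SGD trajectory.
+- **Laplace approximation**.
+- **Variational inference**.
+- **Conformal prediction**: distribution-free prediction intervals.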

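-## 🤖 LLM Usage Hints (How to Use This Knowledge)
+### Applications
+1. **Active learning**: label the points the model is most uncertain about.
+2. **OOD detection**: epistemic uncertainty rises on out-of-distribution inputs.
+3. **Safety-critical systems**: abstain when uncertainty is high.
+4. **Bayesian optimization**: uncertainty drives the acquisition function.
+5. **Reinforcement learning**: uncertainty guides exploration.
+6. **Medical / autonomous driving**: reject low-confidence predictions.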

-**When to use this knowledge:**
- *(TODO)*
+### Modern context
+- **LLMs**: token-level entropy combined with sampled responses.
+- **Foundation models**: calibration.
+- **Conformal prediction**: marginal coverage guarantee.

-**When not to use it:**
- *(TODO)*
+## 💻 Patterns

-## 🧪 Validation
+### Deep ensemble
+```python
+import torch

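- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 auto-normalization. Body needs verification.)*
+def train_ensemble(create_model, X, y, n=5):
+    # Assumes create_model() returns an estimator whose fit(X, y, seed=...) returns
+    # the fitted, callable model; each member differs only in its random seed.
+    return [create_model().fit(X, y, seed=s) for s in range(n)]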

-## 🧬 Duplicate Check
-
- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Action:** UPDATE (auto-normalization)
- **Reason:** Phase 1 normalization: fill in the old template and missing fields.
-
-## 🕓 Changelog
-
-| Date | Change | Action | Trust level |
-|------|-----------|-----------|--------|
-| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |
-
-## 💻 Code Patterns
-
-**Pattern 1:** *(TODO: structural skeleton reflecting this project's conventions)*
-
-```text
-# TODO
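+def ensemble_predict(models, x):
+    with torch.no_grad():  # inference only, no gradients needed
+        preds = torch.stack([m(x) for m in models])  # (n_models, batch, ...)
+    mean = preds.mean(0)
+    epistemic = preds.var(0)  # variance across members = model disagreement
+    return mean, epistemic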
 ```
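+
+For classifiers, the variance of raw outputs is harder to interpret, and a commonly used alternative splits predictive entropy instead. As a sketch, assume $M$ ensemble members with class-probability vectors $p_m(y \mid x)$ and their average $\bar{p} = \frac{1}{M}\sum_{m} p_m$ (notation introduced here for illustration):
+
+$$
+\underbrace{H[\bar{p}]}_{\text{total}}
+= \underbrace{\frac{1}{M}\sum_{m=1}^{M} H[p_m]}_{\text{aleatoric}}
++ \underbrace{H[\bar{p}] - \frac{1}{M}\sum_{m=1}^{M} H[p_m]}_{\text{epistemic (mutual information)}}
+$$
+
+The epistemic term is zero when all members output the same distribution and grows as they disagree, so it plays the same role as `preds.var(0)` in the regression case.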

-## 🤔 Decision Criteria
+### MC Dropout
+```python
+class MCDropoutNet(torch.nn.Module):
+    def __init__(self, in_dim, hid, out_dim, p=0.2):
+        super().__init__()
+        self.fc1 = torch.nn.Linear(in_dim, hid)
+        self.dropout = torch.nn.Dropout(p)
+        self.fc2 = torch.nn.Linear(hid, out_dim)
+    
+    def forward(self, x):
+        return self.fc2(self.dropout(torch.relu(self.fc1(x))))

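-**When to choose option A:**
- *(TODO)*
+def mc_predict(model, x, T=50):
+    model.eval()
+    for m in model.modules():
+        if isinstance(m, torch.nn.Dropout):
+            m.train()  # keep only dropout stochastic; BatchNorm etc. stay in eval mode
+    with torch.no_grad():
+        preds = torch.stack([model(x) for _ in range(T)])  # T stochastic passes
+    return preds.mean(0), preds.var(0)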
+```

-**When to choose option B:**
- *(TODO)*
+### Bayesian linear regression (Pyro)
+```python
+import torch
+import pyro
+import pyro.distributions as dist

-**Default:**
-> *(TODO)*
+def bayesian_lin(X, y):
+    # Prior over weights: independent standard normals treated as one event.
+    w = pyro.sample('w', dist.Normal(torch.zeros(X.shape[1]), torch.ones(X.shape[1])).to_event(1))
+    # Prior over the observation noise scale (the aleatoric part).
+    sigma = pyro.sample('sigma', dist.HalfNormal(1.0))
+    with pyro.plate('data', len(X)):
+        pyro.sample('obs', dist.Normal(X @ w, sigma), obs=y)
+```

-## ❌ Anti-Patterns
+### SWAG (Maddox 2019)
+```python
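+import torch
+
+class SWAG:
+    """Fit a Gaussian (diagonal + low-rank covariance) to SGD iterates."""
+    def __init__(self, model):
+        self.mean = self._flatten(model)
+        self.sq_mean = self._flatten(model) ** 2
+        self.D = []      # deviations of the most recent iterates from the running mean
+        self.K = 20      # maximum rank of the low-rank covariance term
+        self.n = 0
+
+    @staticmethod
+    def _flatten(model):
+        # Copy all parameters into one detached 1-D tensor.
+        return torch.nn.utils.parameters_to_vector(model.parameters()).detach().clone()
+
+    def collect(self, model):
+        flat = self._flatten(model)
+        self.n += 1
+        # Running first and second moments of the weights.
+        self.mean = (self.n - 1) / self.n * self.mean + flat / self.n
+        self.sq_mean = (self.n - 1) / self.n * self.sq_mean + flat ** 2 / self.n
+        if self.n > self.K:
+            self.D.pop(0)  # keep only the last K deviations
+        self.D.append(flat - self.mean)
+
+    def sample(self):
+        # Requires at least one prior call to collect().
+        # Diagonal term: sqrt(var) * z1 / sqrt(2).
+        var = torch.relu(self.sq_mean - self.mean ** 2)
+        z1 = torch.randn_like(self.mean) * torch.sqrt(var) / 2 ** 0.5
+        # Low-rank term: D^T z2 / sqrt(2 (k - 1)), with k the number of stored deviations.
+        k = len(self.D)
+        D_mat = torch.stack(self.D)
+        z2 = D_mat.T @ torch.randn(k, dtype=self.mean.dtype, device=self.mean.device)
+        z2 = z2 / (2 * max(k - 1, 1)) ** 0.5
+        return self.mean + z1 + z2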
+```

- **[Anti-pattern]:** *(TODO: what not to do + why + what to do instead)*
+### Conformal prediction
+```python
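+import numpy as np
+
+def conformal_interval(model, X_cal, y_cal, X_test, alpha=0.1):
+    """Split conformal interval targeting marginal coverage 1 - alpha."""
+    cal_preds = model.predict(X_cal)
+    cal_residuals = np.abs(y_cal - cal_preds)
+    n = len(cal_residuals)
+    # Finite-sample correction: take the ceil((n + 1)(1 - alpha)) / n quantile.
+    # If that level exceeds 1 (n too small for this alpha), it is capped at the
+    # largest residual and the coverage guarantee no longer holds.
+    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
+    q = np.quantile(cal_residuals, level, method='higher')
+
+    test_preds = model.predict(X_test)
+    return test_preds - q, test_preds + q  # [lower, upper]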
+```
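+
+The guarantee behind split conformal prediction can be stated compactly. As a sketch, assume $n$ calibration residuals $r_i = |y_i - \hat{f}(x_i)|$ and a test pair $(x_{n+1}, y_{n+1})$ that is exchangeable with the calibration data (the symbols are introduced here for illustration). Taking $\hat{q}$ as the $\lceil (n+1)(1-\alpha) \rceil$-th smallest residual gives:
+
+$$
+\Pr\!\left( y_{n+1} \in \left[\hat{f}(x_{n+1}) - \hat{q},\ \hat{f}(x_{n+1}) + \hat{q}\right] \right) \ge 1 - \alpha
+$$
+
+The probability is marginal: it averages over calibration and test draws, so it does not promise $1-\alpha$ coverage for any particular $x$. The $(n+1)$ in the rank is the finite-sample correction; using the plain $1-\alpha$ empirical quantile slightly under-covers for small $n$.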
+
+### LLM token uncertainty
+```python
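+import torch
+
+def token_entropy(model, tokenizer, prompt, n_tokens=10):
+    """Entropy of the next-token distribution at the last n_tokens prompt positions.
+
+    Assumes a Hugging Face-style causal LM and tokenizer (passed in explicitly).
+    """
+    inputs = tokenizer(prompt, return_tensors='pt')
+    with torch.no_grad():
+        out = model(**inputs)
+    logits = out.logits[0, -n_tokens:]       # (n_tokens, vocab)
+    log_probs = logits.log_softmax(-1)       # numerically stable log-probabilities
+    return -(log_probs.exp() * log_probs).sum(-1)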
+```
+
+### Active learning (uncertainty sampling)
+```python
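+def active_learn(unlabeled, models, n_query=10):
+    # Reuses ensemble_predict from the deep-ensemble pattern; models is a list of members.
+    _, epistemic = ensemble_predict(models, unlabeled)
+    scores = epistemic.reshape(len(unlabeled), -1).mean(1)  # one score per sample
+    most_uncertain = scores.argsort()[-n_query:]  # indices of the n_query highest scores
+    return unlabeled[most_uncertain]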
+```
+
+### OOD detection (Mahalanobis)
+```python
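+import numpy as np
+
+def mahalanobis_score(test_features, train_features):
+    # Fit a single Gaussian to the training features.
+    mu = train_features.mean(0)
+    cov = np.cov(train_features.T)
+    inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
+    diff = test_features - mu
+    # Per-sample distance sqrt(diff^T inv diff); larger means more likely OOD.
+    return np.sqrt(np.einsum('bi,ij,bj->b', diff, inv, diff))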
+```
+
+### Bayesian optimization (UCB)
+```python
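+def ucb_acquisition(mean, std, kappa=2.0):
+    # Upper confidence bound: larger kappa favors exploring uncertain regions.
+    return mean + kappa * std
+
+def bayes_opt_step(gp, X_pool):
+    # Assumes gp follows the scikit-learn GaussianProcessRegressor.predict API.
+    mean, std = gp.predict(X_pool, return_std=True)
+    return X_pool[ucb_acquisition(mean, std).argmax()]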
+```
+
+### Disagreement-based exploration (RL)
+```python
+import numpy as np
+
+def epistemic_bonus(ensemble_q_values, state, action):
+    # Each ensemble member maps a state to a vector of Q-values.
+    qs = [float(m(state)[action]) for m in ensemble_q_values]
+    return np.std(qs)  # high disagreement = explore here
+```
+
+### Calibration (temperature scaling)
+```python
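+import torch
+import torch.nn.functional as F
+
+def temperature_scale(logits, T):
+    return (logits / T).softmax(-1)
+
+def fit_temperature(logits, labels):
+    # logits: detached held-out (validation) logits; labels: matching class indices.
+    T = torch.tensor(1.0, requires_grad=True)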
+    optim = torch.optim.LBFGS([T])
+    def closure():
+        optim.zero_grad()
+        loss = F.cross_entropy(logits / T, labels)
+        loss.backward()
+        return loss
+    optim.step(closure)
+    return T.item()
+```
+
+## Decision Criteria
+| Situation | Approach |
+|---|---|
+| Best practical | Deep ensemble (5x) |
+| Tight budget | MC Dropout |
+| Distribution-free | Conformal |
+| Bayesian rigor | Pyro / NumPyro VI |
+| LLM | Sampled responses + token entropy |
+| Active learning | Uncertainty sampling |
+| RL exploration | Ensemble disagreement |
+| Safety | Conformal + abstain |
+
+**Default**: deep ensemble (5x) + conformal calibration + abstention threshold + OOD detection.
+
+## 🔗 Graph
+- Parents: [[Probability]] · [[Statistics]]
+- Variants: [[Aleatoric-Uncertainty]] · [[Bayesian-Deep-Learning]]
+- Applications: [[Active-Learning]] · [[OOD-Detection]] · [[Bayesian-Optimization]]
+- Adjacent: [[Conformal-Prediction]] · [[Calibration]] · [[Ensemble-Methods]] · [[Epistemology]]
+
+## 🤖 LLM Usage
+**When to use**: safety-critical systems, active learning, OOD risk, medical or autonomous-vehicle settings.
+**When not to use**: toy problems, or simple tasks with abundant data.
+
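+## ❌ Anti-Patterns
+- **Confusing epistemic and aleatoric uncertainty**: leads to wrong conclusions about what more data can reduce.
+- **Trusting a single model's uncertainty**: use an ensemble instead.
+- **No calibration**: the uncertainty numbers carry no meaning.
+- **High-confidence OOD predictions**: the model is confident on inputs it has never seen, so detection fails.
+- **Conformal prediction without exchangeability**: the coverage guarantee is lost.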
+
+## 🧪 Validation / Duplicates
+- Verified (Lakshminarayanan 2017, Gal 2016, Maddox SWAG, Vovk Conformal).
+- Trust level A.
+
+## 🕓 Changelog
+| Date | Change |
+|---|---|
+| 2026-04-26 | UNCERT auto |
+| 2026-05-08 | Phase 1 |
+| 2026-05-10 | Manual cleanup: epistemic vs. aleatoric, plus ensemble / MC dropout / SWAG / conformal / AL code |