---
id: wiki-2026-0508-bayesian-updating
title: Bayesian Updating
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Bayesian Inference, Posterior Update, Belief Updating]
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: [statistics, inference, probability, ml, decision-theory]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: PyMC / NumPyro
---
# Bayesian Updating
## One-liner
> **"Posterior ∝ Likelihood × Prior: beliefs are refined incrementally as each piece of evidence arrives."** The idea originates in Bayes's posthumously published essay (1763); with the 2026 stack (PyMC 5, NumPyro 0.15, Stan 2.34), NUTS / HMC sampling of posteriors with millions of parameters is routine.
## Core
### Formulas
- **Bayes' rule**: `P(H|E) = P(E|H) × P(H) / P(E)`
- **Sequential update**: `posterior_t ∝ likelihood_t × posterior_{t-1}` (renormalize after each step)
- **Log-form** (numerical stability): `log P(H|E) = log P(E|H) + log P(H) - log P(E)`
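
The formulas above can be sketched for a finite set of hypotheses. This is a minimal illustration, not a reference implementation: the coin example, the function names `log_normalize` and `sequential_update`, and the data are all assumptions made for the sketch. It applies the sequential update in log space and renormalizes after every observation.

```python
import numpy as np

def log_normalize(log_weights):
    """Subtract the log-sum-exp so the weights sum to 1 in probability space."""
    m = np.max(log_weights)
    return log_weights - (m + np.log(np.sum(np.exp(log_weights - m))))

def sequential_update(log_prior, log_likelihoods):
    """Apply posterior_t ∝ likelihood_t × posterior_{t-1}, one observation at a time.

    log_likelihoods has shape (T, H): row t holds log P(E_t | H) per hypothesis.
    """
    log_post = log_normalize(np.asarray(log_prior, dtype=float))
    for ll in np.asarray(log_likelihoods, dtype=float):
        log_post = log_normalize(log_post + ll)
    return log_post

# Hypothetical example: is a coin fair (p = 0.5) or biased towards heads (p = 0.8)?
p_heads = np.array([0.5, 0.8])
flips = np.array([1, 1, 0, 1, 1, 1])  # 1 = heads, 0 = tails
# Row t is the log-likelihood of flip t under each hypothesis.
log_liks = np.where(flips[:, None] == 1, np.log(p_heads), np.log1p(-p_heads))

log_posterior = sequential_update(np.log([0.5, 0.5]), log_liks)
posterior = np.exp(log_posterior)
print(dict(zip(["fair", "biased"], posterior)))
```

Because each step only rescales the weights, the final result matches a single batch application of Bayes' rule to all six flips.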
### Conjugate priors
- Beta-Binomial (CTR, conversion rate)
- Gamma-Poisson (event counts, arrival rate)
- Normal-Normal (sensor fusion, A/B continuous metric)
- Dirichlet-Multinomial (categorical preferences)
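
For the Gamma-Poisson, Normal-Normal, and Dirichlet-Multinomial pairs listed above, the posterior has a closed form. The helpers below are a small sketch under stated assumptions: the function names are hypothetical, the Gamma prior uses the shape/rate parameterization, and the Normal-Normal case assumes the observation variance is known.

```python
def gamma_poisson_update(shape, rate, counts):
    """Gamma(shape, rate) prior on a Poisson rate; counts are per-interval event counts."""
    return shape + sum(counts), rate + len(counts)

def normal_normal_update(prior_mean, prior_var, obs, obs_var):
    """Normal prior on an unknown mean; obs_var is the known observation variance."""
    post_precision = 1.0 / prior_var + len(obs) / obs_var
    post_mean = (prior_mean / prior_var + sum(obs) / obs_var) / post_precision
    return post_mean, 1.0 / post_precision

def dirichlet_multinomial_update(alphas, category_counts):
    """Dirichlet(alphas) prior on category probabilities; add the observed counts."""
    return [a + c for a, c in zip(alphas, category_counts)]

# Illustrative (made-up) data.
g_shape, g_rate = gamma_poisson_update(2.0, 1.0, [3, 5, 4])
n_mean, n_var = normal_normal_update(0.0, 1.0, [1.0, 1.0], 1.0)
d_alphas = dirichlet_multinomial_update([1.0, 1.0, 1.0], [2, 0, 5])
print(g_shape / g_rate, n_mean, n_var, d_alphas)
```

In each case the update only adds sufficient statistics to the prior parameters, which is what makes these pairs suitable for streaming use.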
### Applications
1. A/B testing: early stopping and robust handling of peeking.
2. Spam filter: Naive Bayes with an incremental update per email.
3. Robot localization: a particle filter fuses the prior with the sensor likelihood.
4. LLM uncertainty: calibration of token-level posteriors (2026 Anthropic constitutional classifiers).
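
As one illustration of the spam-filter item, an incremental Naive Bayes classifier can be sketched with plain counters. Everything here is an assumption made for the example: the class name `OnlineNaiveBayes`, the two labels, the add-one smoothing, and the toy emails. It is a sketch of the idea, not a production filter.

```python
import math
from collections import Counter

class OnlineNaiveBayes:
    """Multinomial Naive Bayes whose counts are updated one email at a time."""

    def __init__(self, labels=("spam", "ham"), smoothing=1.0):
        self.smoothing = smoothing
        self.class_counts = Counter({label: 0 for label in labels})
        self.word_counts = {label: Counter() for label in labels}
        self.vocab = set()

    def update(self, words, label):
        # Incremental update: only the counts for the observed label change.
        self.class_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocab.update(words)

    def log_posterior(self, words):
        total_docs = sum(self.class_counts.values())
        n_labels = len(self.class_counts)
        scores = {}
        for label, counts in self.word_counts.items():
            # Smoothed log prior plus the sum of smoothed log likelihoods.
            score = math.log((self.class_counts[label] + 1) / (total_docs + n_labels))
            # One extra vocabulary slot is reserved for unseen words.
            denom = sum(counts.values()) + self.smoothing * (len(self.vocab) + 1)
            for w in words:
                score += math.log((counts[w] + self.smoothing) / denom)
            scores[label] = score
        top = max(scores.values())
        log_norm = top + math.log(sum(math.exp(s - top) for s in scores.values()))
        return {label: s - log_norm for label, s in scores.items()}

nb = OnlineNaiveBayes()
nb.update(["win", "cash", "now"], "spam")
nb.update(["meeting", "at", "noon"], "ham")
nb.update(["cash", "prize"], "spam")
post = nb.log_posterior(["cash", "now"])
print({label: math.exp(lp) for label, lp in post.items()})
```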
## 💻 Patterns
### Beta-Binomial conjugate (CTR)
```python
from scipy import stats
import numpy as np
# Prior: Beta(1, 1) = uniform
alpha, beta = 1.0, 1.0
# Observe: 73 clicks out of 1000 impressions
clicks, impressions = 73, 1000
alpha_post = alpha + clicks
beta_post = beta + (impressions - clicks)
posterior = stats.beta(alpha_post, beta_post)
print(f"Posterior mean CTR: {posterior.mean():.4f}")
print(f"95% credible interval: {posterior.interval(0.95)}")
```
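
For an A/B comparison, the same conjugate update gives one Beta posterior per variant, and the probability that one variant beats the other can be estimated by sampling. The sketch below assumes independent Beta(1, 1) priors; the function name `prob_b_beats_a`, the draw count, and the seed are illustrative choices.

```python
import numpy as np

def prob_b_beats_a(clicks_a, n_a, clicks_b, n_b, prior=(1.0, 1.0), draws=200_000, seed=0):
    """Monte Carlo estimate of P(theta_B > theta_A) under independent Beta posteriors."""
    rng = np.random.default_rng(seed)
    a0, b0 = prior
    theta_a = rng.beta(a0 + clicks_a, b0 + n_a - clicks_a, size=draws)
    theta_b = rng.beta(a0 + clicks_b, b0 + n_b - clicks_b, size=draws)
    return float(np.mean(theta_b > theta_a))

# Hypothetical counts: variant A 73/1000, variant B 91/1010.
p = prob_b_beats_a(73, 1000, 91, 1010)
print(f"P(B > A) ~ {p:.3f}")
```

The estimate is a posterior probability, not a p-value, so any decision threshold applied to it is a design choice to be set and justified separately.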
### Sequential update (online)
```python
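from scipy import stats

def online_beta_update(alpha, beta, click: bool):
    # A click increments alpha; a non-click increments beta.
    return (alpha + click, beta + (1 - click))

a, b = 1.0, 1.0
# stream_of_clicks() and decide() are placeholders for the caller's
# event source and downstream action; they are not defined here.
for event in stream_of_clicks():
    a, b = online_beta_update(a, b, event)
    if a + b > 100:  # confident enough (prior pseudo-counts + observations)
        decide(stats.beta(a, b).mean())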
```
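
A self-contained variant of the loop above, run on a simulated stream, might look like this. The stopping rule (95% credible interval narrower than `max_width`) is an assumption swapped in for the pseudo-count threshold, and the simulated 7% click rate, the seed, and the function name `run_until_confident` are made up for the sketch.

```python
import numpy as np
from scipy import stats

def run_until_confident(events, max_width=0.05, prior=(1.0, 1.0)):
    """Update a Beta posterior per event; stop once the 95% interval is narrow enough."""
    a, b = prior
    n = 0
    for click in events:
        n += 1
        a, b = a + click, b + (1 - click)
        lo, hi = stats.beta(a, b).interval(0.95)
        if hi - lo < max_width:
            break
    return n, a, b

rng = np.random.default_rng(0)
events = (rng.random(5000) < 0.07).astype(int)  # simulated clicks, true CTR 7%
n, a, b = run_until_confident(events)
print(f"stopped after {n} events, posterior mean {a / (a + b):.4f}")
```

Stopping on interval width ties the decision to posterior precision rather than to a fixed sample count, at the cost of one quantile computation per event.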
### PyMC 5 hierarchical
```python
import arviz as az
import numpy as np
import pymc as pm

variants = ["A", "B", "C"]
clicks = np.array([73, 91, 82])
impressions = np.array([1000, 1010, 990])

with pm.Model() as model:
    # Population-level mean CTR and concentration shared by all variants.
    mu = pm.Beta("mu", 1, 1)
    kappa = pm.HalfNormal("kappa", 10)
    # Per-variant CTRs are partially pooled towards mu.
    theta = pm.Beta("theta", mu * kappa, (1 - mu) * kappa, shape=len(variants))
    pm.Binomial("y", n=impressions, p=theta, observed=clicks)
    idata = pm.sample(2000, tune=1000, target_accept=0.95)

print(az.summary(idata, var_names=["theta"]))
```
### NumPyro NUTS (GPU-accelerated, JAX)
```python
import jax
import jax.numpy as jnp
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS

def model(impressions, clicks=None):
    # Uniform Beta(1, 1) prior on the click-through rate.
    p = numpyro.sample("p", dist.Beta(1.0, 1.0))
    numpyro.sample("obs", dist.Binomial(impressions, p), obs=clicks)

mcmc = MCMC(NUTS(model), num_warmup=500, num_samples=2000)
mcmc.run(jax.random.PRNGKey(0), impressions=jnp.array(1000), clicks=jnp.array(73))
mcmc.print_summary()
```
### Bayesian online change-point detection
```python
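import numpy as np

def bocpd_step(observation, run_length_probs, hazard=1/250):
    """One run-length update from Adams & MacKay (2007)."""
    # compute_predictive_prob is a placeholder (not defined here): it should return
    # P(observation | run length r) for every r, shaped like run_length_probs.
    pred = compute_predictive_prob(observation, run_length_probs)
    growth = run_length_probs * pred * (1 - hazard)  # run continues: r -> r + 1
    cp = (run_length_probs * pred * hazard).sum()  # change point: r -> 0
    new = np.concatenate([[cp], growth])
    return new / new.sum()  # normalize over run lengths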
```
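
To make the step above runnable end to end, the predictive probability needs a concrete observation model. The sketch below assumes binary data with a Beta-Bernoulli model, chosen here only for illustration; the function name `bocpd_bernoulli`, the priors, and the synthetic data are not from the source.

```python
import numpy as np

def bocpd_bernoulli(xs, hazard=1/250, alpha0=1.0, beta0=1.0):
    """Run-length posterior for binary data with one Beta posterior per run length."""
    run_probs = np.array([1.0])  # P(run length = 0) = 1 before any data
    alpha = np.array([alpha0])   # Beta parameters indexed by run length
    beta = np.array([beta0])
    map_run_lengths = []
    for x in xs:
        # Posterior predictive of x under each run length's Beta posterior.
        p_one = alpha / (alpha + beta)
        pred = p_one if x == 1 else 1.0 - p_one
        growth = run_probs * pred * (1 - hazard)   # run continues
        cp = (run_probs * pred * hazard).sum()     # run resets to length 0
        run_probs = np.concatenate([[cp], growth])
        run_probs /= run_probs.sum()
        # A new run starts from the prior; existing runs absorb the observation.
        alpha = np.concatenate([[alpha0], alpha + x])
        beta = np.concatenate([[beta0], beta + (1 - x)])
        map_run_lengths.append(int(np.argmax(run_probs)))
    return run_probs, map_run_lengths

# Synthetic stream with an abrupt change after 100 observations.
xs = [0] * 100 + [1] * 100
run_probs, map_run_lengths = bocpd_bernoulli(xs)
print(map_run_lengths[99], map_run_lengths[-1])
```

The most probable run length grows while the data stay consistent and drops shortly after the regime changes, which is the signal a downstream alert would key on.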
## Decision criteria
| Situation | Approach |
|---|---|
| Small N and a conjugate prior fits | Closed-form (Beta-Binomial) |
| Hierarchical, ~10k params | PyMC NUTS (CPU) |
| Large model and a GPU is available | NumPyro (JAX) |
| Streaming / sub-ms latency | Online conjugate update |
| Discrete latents dominate | Particle filter / variational |

**Default**: for A/B tests, a Beta-Binomial conjugate update with a 95% credible interval.
## 🔗 Graph
- Parent: [[Bayes-Theorem]]
- Variants: [[Belief-Revision]] · [[Inference-Coupled Persistence]]
- Applications: [[Item-Item-Collaborative-Filtering]] · [[Statistical-Analysis]]
- Adjacent: [[몬테카를로 시뮬레이션|Monte Carlo Simulation]] · [[Multi-agent-System]]
## 🤖 LLM usage
**When to use**: A/B early-stopping decisions, sensor fusion, and explicit propagation of parameter uncertainty.
**When not to use**: when data is abundant and the likelihood dominates the prior, frequentist MLE is sufficient.
## ❌ Anti-patterns
- **Using an improper prior**: the posterior may fail to normalize; verify that the posterior is proper, or use a proper prior.
- **Sneaking strong assumptions in through the prior**: a sensitivity analysis is required for subjective priors.
- **Misinterpreting peeking**: a Bayesian posterior is not a frequentist p-value; error-rate guarantees need separate calibration.
- **Ignoring MCMC convergence**: reject a run immediately if R-hat > 1.01 or ESS < 400.
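
The R-hat threshold in the last item can be checked with a few lines of NumPy. This is a sketch of the basic split R-hat described in *Bayesian Data Analysis*, not the rank-normalized variant that ArviZ computes by default, and it does not cover ESS; in practice `az.rhat` and `az.ess` are the safer choice. The function name `split_rhat` and the synthetic chains are assumptions for the example.

```python
import numpy as np

def split_rhat(draws):
    """Basic split R-hat for one scalar parameter; draws has shape (chains, draws)."""
    draws = np.asarray(draws, dtype=float)
    half = draws.shape[1] // 2
    # Split each chain in two so that within-chain drift also inflates R-hat.
    halves = np.concatenate([draws[:, :half], draws[:, half:2 * half]], axis=0)
    n = halves.shape[1]
    within = halves.var(axis=1, ddof=1).mean()       # W: mean within-chain variance
    between = n * halves.mean(axis=1).var(ddof=1)    # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n    # pooled variance estimate
    return float(np.sqrt(var_plus / within))

rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1000))                       # chains agree
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])   # two chains sit elsewhere
rhat_mixed, rhat_stuck = split_rhat(mixed), split_rhat(stuck)
print(f"well-mixed: {rhat_mixed:.3f}, disagreeing chains: {rhat_stuck:.3f}")
```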
## 🧪 Verification / duplicates
- Verified against Gelman et al., *Bayesian Data Analysis* (3rd ed.) and McElreath, *Statistical Rethinking* (2nd ed.).
- Trust level: A.
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — full Bayesian updating with PyMC 5, NumPyro, online BOCPD |