| id | title | category | status | canonical_id | aliases | duplicate_of | source_trust_level | confidence_score | verification_status | tags | raw_sources | last_reinforced | github_commit | tech_stack |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| wiki-2026-0508-bayesian-statistics | Bayesian Statistics | 10_Wiki/Topics | verified | self | | none | A | 0.95 | applied | | | 2026-05-10 | pending | |
Bayesian Statistics
📌 One-line insight
"Probability is a degree of belief", not a long-run frequency: prior + data → posterior, updated as evidence arrives. The approach is strongest when data are scarce and prior knowledge is available. The result is a full distribution, not a point estimate. Modern compute (MCMC / VI) has made it mainstream.
📖 Core
Bayes' theorem
$$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}$$
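- P(θ): prior, the belief about θ before seeing the data.
- P(D | θ): likelihood, the model of how the data are generated given θ.
- P(θ | D): posterior, the updated belief after seeing the data.
- P(D): evidence, the normalizing constant obtained by integrating the numerator over θ.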
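The four terms above can be evaluated numerically on a grid. The sketch below is a minimal illustration, assuming the coin-flip setup used in the patterns on this page (a Beta(2, 2) prior and 8 heads, 2 tails); the grid size and variable names are illustrative choices, not part of the original note.

```python
import numpy as np

# Grid over the parameter theta = P(heads)
theta = np.linspace(0.0, 1.0, 1001)
dtheta = theta[1] - theta[0]

prior = theta * (1 - theta)                 # Beta(2, 2) density up to a constant
likelihood = theta**8 * (1 - theta)**2      # 8 heads, 2 tails
unnormalized = likelihood * prior           # numerator of Bayes' theorem

evidence = unnormalized.sum() * dtheta      # P(D) up to the same constant
posterior = unnormalized / evidence         # integrates to 1 on the grid

posterior_mean = (theta * posterior).sum() * dtheta
print(f"posterior mean = {posterior_mean:.4f}")
```

The exact posterior here is Beta(10, 4), whose mean is 10/14, so the grid result can be checked against the closed form.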
vs Frequentist
| Aspect | Frequentist | Bayesian |
|---|---|---|
| Probability | Long-run frequency | Degree of belief |
| Parameter | Fixed unknown | Random variable |
| Result | Point estimate + CI | Posterior distribution |
| Small data | Fragile | Stabilized by the prior |
| Compute | Cheap | Expensive (made practical by MCMC / VI) |
| Interpretation | "95% of such intervals contain θ" | "P(θ ∈ [a,b]) = 0.95" |
Conjugate priors (analytical)
| Likelihood | Prior | Posterior |
|---|---|---|
| Binomial | Beta | Beta |
| Poisson | Gamma | Gamma |
| Normal (known σ) | Normal | Normal |
| Normal (unknown μ,σ) | Normal-Gamma | Normal-Gamma |
| Multinomial | Dirichlet | Dirichlet |
→ The posterior is available in closed form, but only for a limited set of likelihood/prior pairs.
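As a small illustration of what "closed form" means for two rows of the table, the sketch below applies the standard Beta-Binomial and Gamma-Poisson update rules. The function names are hypothetical helpers introduced here, and the Gamma is assumed to be in the shape/rate parameterization.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Beta(alpha, beta) prior + Binomial data -> Beta posterior parameters."""
    return alpha + successes, beta + failures


def gamma_poisson_update(shape, rate, counts):
    """Gamma(shape, rate) prior + Poisson counts -> Gamma posterior parameters."""
    return shape + sum(counts), rate + len(counts)


# Coin flip: Beta(2, 2) prior, 8 heads and 2 tails
a_post, b_post = beta_binomial_update(2, 2, successes=8, failures=2)
posterior_mean = a_post / (a_post + b_post)
print(a_post, b_post, round(posterior_mean, 3))
```

No sampler is needed: the update is pure arithmetic on the prior parameters, which is why conjugate models are attractive when they fit the problem.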
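Inference (modern)
MCMC (Markov Chain Monte Carlo)
- Metropolis-Hastings: random-walk proposals with an accept/reject step.
- Hamiltonian MC (HMC): uses gradients of the log posterior to make long, efficient moves.
- NUTS (No-U-Turn Sampler): HMC with automatic tuning of the trajectory length.
- ✅ Asymptotically exact. ❌ Slow.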
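The random-walk accept/reject idea in the Metropolis-Hastings bullet can be sketched in a few lines of NumPy. This is a minimal illustration, not a production sampler: the target is assumed to be the coin-flip posterior (Beta(2, 2) prior, 8 heads, 2 tails), and the step size, sample count and burn-in length are arbitrary choices.

```python
import numpy as np


def log_post(p, heads=8, tails=2, a=2.0, b=2.0):
    """Unnormalized log posterior for a Beta(a, b) prior and Bernoulli data."""
    if p <= 0.0 or p >= 1.0:
        return -np.inf  # outside the support: always rejected
    return (heads + a - 1) * np.log(p) + (tails + b - 1) * np.log(1 - p)


def metropolis(n_samples=20000, step=0.2, seed=0):
    rng = np.random.default_rng(seed)
    p = 0.5
    lp = log_post(p)
    samples = np.empty(n_samples)
    for i in range(n_samples):
        proposal = p + step * rng.normal()       # symmetric random-walk proposal
        lp_proposal = log_post(proposal)
        # Accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < lp_proposal - lp:
            p, lp = proposal, lp_proposal
        samples[i] = p
    return samples


samples = metropolis()
kept = samples[2000:]                            # discard burn-in
print(f"posterior mean ≈ {kept.mean():.3f}")
```

Because the proposal is symmetric, the acceptance ratio reduces to the ratio of posterior densities, so the normalizer P(D) never has to be computed.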
Variational Inference (VI)
- Fit an approximating distribution q(θ) to the posterior.
- Minimize the KL divergence between q(θ) and the posterior (equivalently, maximize the ELBO).
- ✅ Fast and scalable. ❌ Approximate.
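To make "fit q(θ) by minimizing KL" concrete, the sketch below fits a Gaussian q to the coin-flip posterior Beta(10, 4) by brute-force search over (mu, sigma) on a grid. Real VI libraries optimize the ELBO with stochastic gradients instead; the grid search, the Gaussian family and all names here are simplifying assumptions for illustration only.

```python
import numpy as np

theta = np.linspace(1e-4, 1 - 1e-4, 2000)
dtheta = theta[1] - theta[0]

# Target: Beta(10, 4) posterior, normalized numerically on the grid
log_p = 9 * np.log(theta) + 3 * np.log1p(-theta)
p = np.exp(log_p - log_p.max())
p /= p.sum() * dtheta


def kl_q_p(mu, sigma):
    """KL(q || p) for a Gaussian q truncated and renormalized on the grid."""
    q = np.exp(-0.5 * ((theta - mu) / sigma) ** 2)
    q /= q.sum() * dtheta
    mask = q > 0
    return np.sum(q[mask] * (np.log(q[mask]) - np.log(p[mask]))) * dtheta


candidates = [(m, s)
              for m in np.linspace(0.5, 0.9, 41)
              for s in np.linspace(0.03, 0.30, 28)]
best_mu, best_sigma = min(candidates, key=lambda ms: kl_q_p(*ms))
print(f"q = Normal(mu={best_mu:.2f}, sigma={best_sigma:.2f})")
```

The fitted Gaussian is symmetric while the Beta posterior is skewed, which is a small-scale view of the "approximate" trade-off noted in the list.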
Sequential Monte Carlo
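- Particle filters: a weighted set of samples is propagated and resampled at each step.
- Works with streaming data, since the posterior is updated one observation at a time.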
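A bootstrap particle filter shows how the streaming update works in practice. The sketch below assumes a hypothetical 1D random-walk state with Gaussian observation noise; the noise scales, particle count and function name are illustrative assumptions, not taken from the original note.

```python
import numpy as np


def bootstrap_filter(ys, n_particles=1000, q=0.1, r=0.5, seed=0):
    """Track x_t = x_{t-1} + N(0, q^2) from observations y_t = x_t + N(0, r^2)."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, n_particles)   # initial belief
    means = []
    for y in ys:                                    # one observation at a time
        particles = particles + rng.normal(0.0, q, n_particles)  # propagate
        log_w = -0.5 * ((y - particles) / r) ** 2   # Gaussian log-likelihood
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        means.append(np.sum(w * particles))         # filtered posterior mean
        idx = rng.choice(n_particles, size=n_particles, p=w)     # resample
        particles = particles[idx]
    return np.array(means)


# Simulated data under the same assumed model
rng = np.random.default_rng(1)
x = np.cumsum(rng.normal(0.0, 0.1, 100))
ys = x + rng.normal(0.0, 0.5, 100)

est = bootstrap_filter(ys)
rmse_filter = np.sqrt(np.mean((est - x) ** 2))
rmse_raw = np.sqrt(np.mean((ys - x) ** 2))
print(f"filter RMSE = {rmse_filter:.3f}, raw observation RMSE = {rmse_raw:.3f}")
```

Each loop iteration only needs the current particles and the newest observation, which is what makes the method suitable for streams.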
Applications
- A/B testing: easier to interpret than the frequentist equivalent.
- Hyperparameter tuning (Bayesian Optimization): Gaussian process surrogate + acquisition function.
- Hierarchical models: group-level priors.
- Time series (state-space): Kalman filter, particle filter.
- Causal inference (Bayesian network): DAG.
- Drug discovery / clinical: small N + strong prior.
- Robotics (SLAM): joint estimation of pose and map.
- Topic modeling (LDA): Dirichlet prior.
Modern stack
- Stan: NUTS, mature.
- PyMC (3 → 4 → 5): Python; the backend moved from Theano (v3) to Aesara (v4) to PyTensor (v5).
- NumPyro: JAX-based, fast.
- Pyro: PyTorch + VI.
- TFP: TensorFlow Probability.
- Edward2 / blackjax: modular.
💻 Patterns
Coin flip (PyMC)
import arviz as az
import numpy as np
import pymc as pm
# Data: 8 heads, 2 tails
data = np.array([1] * 8 + [0] * 2)
with pm.Model() as model:
    p = pm.Beta('p', alpha=2, beta=2)  # prior
    obs = pm.Bernoulli('obs', p=p, observed=data)  # likelihood
    trace = pm.sample(2000, return_inferencedata=True)
# Posterior summary
az.plot_posterior(trace)
print(az.summary(trace))
# The exact posterior is Beta(10, 4): mean of p ≈ 0.71, 94% HDI roughly 0.5 to 0.9
Hierarchical (group-level)
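# Assumes n_groups (int), group_idx (group index per observation) and data (observations) are already defined
with pm.Model() as h:
    # Hyperpriors
    mu = pm.Normal('mu', 0, 10)
    sigma = pm.HalfNormal('sigma', 5)
    # Group-level parameters
    theta = pm.Normal('theta', mu, sigma, shape=n_groups)
    # Likelihood (observation sd fixed at 1)
    y = pm.Normal('y', theta[group_idx], 1, observed=data)
    trace = pm.sample(2000)
→ Partial pooling: groups with small N borrow strength from the other groups.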
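The "borrow strength" effect has a closed form when the hyperparameters are treated as known. The sketch below assumes a Normal-Normal model with fixed population mean `mu`, between-group sd `tau` and observation sd `sigma`; the full model estimates the hyperparameters as well, so this is only the conditional shrinkage step, and `partial_pool` is a hypothetical helper.

```python
import numpy as np


def partial_pool(group_means, group_sizes, mu, tau, sigma):
    """Posterior mean of each group effect: precision-weighted average
    of the group's own sample mean and the population mean mu."""
    group_means = np.asarray(group_means, dtype=float)
    group_sizes = np.asarray(group_sizes, dtype=float)
    data_precision = group_sizes / sigma**2     # grows with group size
    prior_precision = 1.0 / tau**2              # same for every group
    return ((data_precision * group_means + prior_precision * mu)
            / (data_precision + prior_precision))


# Two groups with the same sample mean but very different sizes
pooled = partial_pool([2.0, 2.0], [2, 200], mu=0.0, tau=1.0, sigma=1.0)
print(pooled)
```

The small group is pulled strongly toward the population mean, while the large group keeps an estimate close to its own sample mean.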
Bayesian A/B test
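# Assumes n_a, n_b (trials) and conv_a, conv_b (conversions) are already defined
with pm.Model() as ab:
    p_a = pm.Beta('p_a', 1, 1)  # uniform priors on both conversion rates
    p_b = pm.Beta('p_b', 1, 1)
    obs_a = pm.Binomial('obs_a', n=n_a, p=p_a, observed=conv_a)
    obs_b = pm.Binomial('obs_b', n=n_b, p=p_b, observed=conv_b)
    diff = pm.Deterministic('diff', p_b - p_a)
    trace = pm.sample(2000)
# P(B > A): share of posterior draws where the difference is positive
prob_b_better = (trace.posterior['diff'] > 0).mean().item()
print(f'P(B > A) = {prob_b_better:.3f}')
→ More actionable than a frequentist p-value: the output is a direct probability that B beats A.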
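Because the Beta prior is conjugate to the Binomial likelihood, the same quantity can also be estimated without MCMC by drawing directly from the two Beta posteriors. The counts below are hypothetical numbers chosen only to make the sketch runnable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical experiment results (not from the original note)
n_a, conv_a = 1000, 120
n_b, conv_b = 1000, 150

# Beta(1, 1) prior + Binomial data -> Beta(1 + conversions, 1 + non-conversions)
draws_a = rng.beta(1 + conv_a, 1 + n_a - conv_a, size=100_000)
draws_b = rng.beta(1 + conv_b, 1 + n_b - conv_b, size=100_000)

prob_b_better = float(np.mean(draws_b > draws_a))
expected_lift = float(np.mean(draws_b - draws_a))
print(f"P(B > A) = {prob_b_better:.3f}, expected lift = {expected_lift:.4f}")
```

This variant is handy for dashboards where a full sampler is overkill; the PyMC model remains the better starting point once covariates or hierarchy are added.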
Variational inference (faster)
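import jax
import jax.numpy as jnp
import numpyro
import numpyro.distributions as dist
from numpyro import optim
from numpyro.infer import SVI, Trace_ELBO
from numpyro.infer.autoguide import AutoNormal
data = jnp.array([1] * 8 + [0] * 2)  # same coin-flip data: 8 heads, 2 tails
def model(data):
    p = numpyro.sample('p', dist.Beta(2, 2))  # prior
    numpyro.sample('obs', dist.Bernoulli(p), obs=data)  # likelihood
guide = AutoNormal(model)  # mean-field Gaussian in unconstrained space
svi = SVI(model, guide, optim.Adam(0.01), Trace_ELBO())
state = svi.init(jax.random.PRNGKey(0), data)
for step in range(2000):
    state, loss = svi.update(state, data)  # one ELBO gradient step
params = svi.get_params(state)  # fitted variational parameters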
Bayesian Optimization (hyperparameter)
from skopt import gp_minimize
from skopt.space import Integer, Real
def objective(params):
    lr, depth = params
    # train_and_eval is a user-supplied function that returns the loss to minimize
    return train_and_eval(lr, depth)
result = gp_minimize(
    objective,
    [Real(1e-5, 1e-1, prior='log-uniform', name='lr'),
     Integer(1, 10, name='depth')],
    n_calls=50,
)
print(result.x, result.fun)  # best parameters and best loss found
Posterior predictive check
# `model` and `trace` come from the coin-flip example above
with model:
    pm.sample_posterior_predictive(trace, extend_inferencedata=True)
# Compare simulated data with the observed data: a visual check of model fit
az.plot_ppc(trace)
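The same check can be reduced to a single number, a posterior predictive p-value. The sketch below is a NumPy-only illustration for the coin-flip model, assuming the analytical Beta(10, 4) posterior; the choice of test statistic (number of heads) is an assumption made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
observed_heads, n_flips = 8, 10

# Draw parameters from the posterior, then replicate the experiment for each draw
p_draws = rng.beta(10, 4, size=20_000)
replicated_heads = rng.binomial(n_flips, p_draws)

# Share of replicated datasets at least as extreme as the observed one
ppp = float(np.mean(replicated_heads >= observed_heads))
print(f"posterior predictive p-value = {ppp:.3f}")
```

Values close to 0 or 1 would suggest the model rarely reproduces data like the observed set; a value in the middle of the range is consistent with an adequate fit for this statistic.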
🤔 Decision criteria
| Situation | Method |
|---|---|
| Small data + prior | Conjugate (analytical) |
| Complex model + accuracy | NUTS (PyMC / Stan) |
| Large data + speed | VI (Pyro / NumPyro) |
| Streaming | Particle filter |
| Hyperparameter tune | BO (skopt / Optuna) |
| A/B test | Beta-Binomial + Bayes |
| Topic modeling | LDA |
| Causal | Bayesian network |
Default: PyMC + NUTS as the baseline. Move to NumPyro / VI when scale demands it.
🔗 Graph
- Parent: Statistics · Probability Theory
- Variants: MCMC · Variational-Inference · Bayesian-Network
- Applications: Bayesian-Optimization · LDA · SLAM
- Tool: PyMC · Stan
- Adjacent: Bayes-Theorem · Bayesian-Updating
🤖 LLM usage
When to use: small data with prior knowledge; uncertainty quantification; hierarchical structure; hyperparameter tuning. When not to use: large data where speed matters more than accuracy; cases where a simple frequentist analysis is sufficient.
❌ Anti-patterns
- Improper prior: the posterior may itself be improper (invalid).
- No posterior predictive check: model fit is unknown.
- MCMC with a single chain: convergence problems cannot be detected.
- Ignoring burn-in (warm-up): biased estimates.
- Forcing conjugacy: leads to choosing the wrong likelihood.
- Over-confident VI (mean-field): underestimates uncertainty.
- Ignoring R-hat: non-convergence goes unnoticed.
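The single-chain and R-hat items can be illustrated with the classic Gelman-Rubin statistic, which compares between-chain and within-chain variance. The sketch below is a simplified version for illustration; ArviZ reports a more robust rank-normalized split R-hat, and the simulated chains are assumptions made for the example.

```python
import numpy as np


def r_hat(chains):
    """Classic Gelman-Rubin R-hat for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()        # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)     # B: scaled variance of chain means
    var_hat = (n - 1) / n * within + between / n      # pooled variance estimate
    return float(np.sqrt(var_hat / within))


rng = np.random.default_rng(0)
mixed = rng.normal(0.0, 1.0, size=(4, 1000))                    # chains agree
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])          # two chains sit in another region

print(f"well-mixed: {r_hat(mixed):.3f}, stuck: {r_hat(stuck):.3f}")
```

With only one chain the between-chain term cannot be computed at all, which is why running several chains is a precondition for this diagnostic.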
🧪 Verification / duplicates
- Verified (Gelman BDA, McElreath Statistical Rethinking, Stan/PyMC docs).
- Trust level: A.
- Related: Bayes-Theorem · MCMC · Bayesian-Optimization · Variational-Inference.
🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: Bayes formula + MCMC / VI + PyMC / NumPyro / skopt code |