2nd/10_Wiki/Topics/AI_and_ML/Bayesian Statistics.md
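koriweb d8a80f6272 chore(wiki): canonical normalization of dangling links (768 files / 1,200 instances)
Replaced [[wikilinks]] that differed only in spelling with the canonical title of the target document,
reconnecting 1,200 broken links. Only normalized title/filename matches were applied; alias matching was
excluded because of the risk of over-merging (ambiguity guard). Originals are backed up in _link_reconcile_backup/.
Tool: Datacollect/scripts/link_reconcile_apply.mjs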

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 12:24:15 +09:00


id: wiki-2026-0508-bayesian-statistics
title: Bayesian Statistics
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: 베이지안 통계, Bayes' theorem, posterior, prior, MCMC, variational inference, PyMC, Stan
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: bayesian, statistics, mcmc, variational-inference, pymc, stan, probabilistic-programming, uncertainty
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: Python; framework: PyMC / Stan / NumPyro / Pyro

Bayesian Statistics

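📌 One-line insight

"Probability is a degree of belief", not a long-run frequency. A prior combined with data is updated into a posterior. The approach is strongest when data are scarce and prior knowledge is available. The result is a distribution, not a point estimate. Modern computation (MCMC / VI) has made it mainstream.

📖 Core concepts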

Bayes' theorem

P(\theta | D) = \frac{P(D | \theta) \cdot P(\theta)}{P(D)}
  • P(θ): prior, the belief about θ before seeing the data.
  • P(D | θ): likelihood, the model of how the data arise given θ.
  • P(θ | D): posterior, the updated belief.
  • P(D): evidence (normalizing constant).
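
To make the four terms concrete, the following is a minimal sketch that applies Bayes' theorem on a grid of θ values, using only the standard library. It assumes the same setup as the coin-flip pattern later in this note (Beta(2, 2) prior, 8 heads and 2 tails); the grid resolution of 199 points is an arbitrary choice for illustration.

```python
from math import comb

# Grid of candidate values for theta (probability of heads): 0.005 ... 0.995
grid = [i / 200 for i in range(1, 200)]

heads, tails = 8, 2  # assumed data, matching the coin-flip pattern in this note

def prior(theta):
    # Unnormalized Beta(2, 2) density: theta * (1 - theta)
    return theta * (1 - theta)

def likelihood(theta):
    # Binomial likelihood P(D | theta)
    return comb(heads + tails, heads) * theta**heads * (1 - theta)**tails

unnormalized = [likelihood(t) * prior(t) for t in grid]
evidence = sum(unnormalized)  # plays the role of P(D) on this grid
posterior = [u / evidence for u in unnormalized]

posterior_mean = sum(t * p for t, p in zip(grid, posterior))
# Should be close to the exact Beta(10, 4) mean, 10/14 ≈ 0.714
print(f"posterior mean ≈ {posterior_mean:.3f}")
```

Grid approximation only scales to one or two parameters, which is why the sampling and variational methods below exist.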

vs Frequentist

| Aspect | Frequentist | Bayesian |
|---|---|---|
| Probability | Long-run frequency | Degree of belief |
| Parameter | Fixed but unknown | Random variable |
| Result | Point estimate + CI | Posterior distribution |
| Small data | Fragile | Stabilized by the prior |
| Compute | Cheap | Expensive (impractical for most models before MCMC) |
| Interpretation | "95% of intervals contain θ" | "P(θ ∈ [a,b]) = 0.95" |

Conjugate priors (analytical)

| Likelihood | Prior | Posterior |
|---|---|---|
| Binomial | Beta | Beta |
| Poisson | Gamma | Gamma |
| Normal (known σ) | Normal | Normal |
| Normal (unknown μ, σ) | Normal-Gamma | Normal-Gamma |
| Multinomial | Dirichlet | Dirichlet |

→ The posterior has a closed form, but only for a limited set of models.
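
As a sketch of what "closed form" means for the first two rows of the table, the update reduces to adding counts to the prior parameters. The helper names below are illustrative, not taken from any library, and the Gamma prior is assumed to use the (shape, rate) parameterization.

```python
def beta_binomial_update(alpha, beta, heads, tails):
    """Beta(alpha, beta) prior + binomial data -> Beta posterior parameters."""
    return alpha + heads, beta + tails

def gamma_poisson_update(shape, rate, counts):
    """Gamma(shape, rate) prior + Poisson counts -> Gamma posterior parameters."""
    return shape + sum(counts), rate + len(counts)

def beta_mean(alpha, beta):
    return alpha / (alpha + beta)

def beta_variance(alpha, beta):
    total = alpha + beta
    return alpha * beta / (total**2 * (total + 1))

# Beta(2, 2) prior with 8 heads and 2 tails, as in the coin-flip pattern.
a_post, b_post = beta_binomial_update(2, 2, heads=8, tails=2)
print(a_post, b_post)                       # 10 4
print(round(beta_mean(a_post, b_post), 3))  # 0.714

# Hypothetical Poisson counts with a Gamma(1, 1) prior.
print(gamma_poisson_update(1.0, 1.0, [3, 5, 4]))  # (13.0, 4.0)
```

No sampling is needed here, which is why conjugate models are a useful check for MCMC or VI output on simple problems.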

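Inference (modern)

MCMC (Markov Chain Monte Carlo)

  • Metropolis-Hastings: random-walk proposals with an accept/reject step.
  • Hamiltonian MC (HMC): uses gradients of the log posterior to propose distant moves.
  • NUTS (No-U-Turn Sampler): HMC with automatic tuning of the trajectory length.
  • Asymptotically exact, but slow.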
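
The accept/reject idea behind Metropolis-Hastings can be sketched in a few lines of standard-library Python. This is a toy random-walk sampler under assumed settings (Beta(2, 2) prior, 8 heads and 2 tails, proposal step 0.2, 2,000 warm-up draws), not how PyMC or Stan sample in practice.

```python
import math
import random

def log_posterior(theta, heads=8, tails=2, alpha=2.0, beta=2.0):
    """Unnormalized log posterior for a Beta(alpha, beta) prior and coin-flip data."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero density outside the unit interval
    return ((heads + alpha - 1) * math.log(theta)
            + (tails + beta - 1) * math.log(1.0 - theta))

def metropolis_hastings(n_samples, step=0.2, start=0.5, seed=0):
    rng = random.Random(seed)
    theta, logp = start, log_posterior(start)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)  # symmetric random-walk proposal
        logp_new = log_posterior(proposal)
        # Accept with probability min(1, p(proposal) / p(theta)).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            theta, logp = proposal, logp_new
        samples.append(theta)  # a rejected proposal repeats the current value
    return samples

draws = metropolis_hastings(20_000)
kept = draws[2_000:]  # drop warm-up draws
print(f"posterior mean ≈ {sum(kept) / len(kept):.3f}")  # exact value is 10/14
```

Because the proposal is symmetric, the Hastings correction cancels and only the posterior ratio appears in the acceptance step.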

Variational Inference (VI)

  • Fits an approximating distribution q(θ) to the posterior.
  • Minimizes the KL divergence between q(θ) and the posterior.
  • Fast and scalable, but only approximate.
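
The KL divergence to the posterior cannot be computed directly because it involves the unknown P(D). The standard workaround, written here in the notation of Bayes' theorem above, is the evidence lower bound (ELBO):

```latex
\log P(D) = \mathrm{ELBO}(q) + \mathrm{KL}\big(q(\theta) \,\|\, P(\theta | D)\big)

\mathrm{ELBO}(q) = \mathbb{E}_{q(\theta)}\big[\log P(D | \theta) + \log P(\theta) - \log q(\theta)\big]
```

Since log P(D) does not depend on q, maximizing the ELBO is equivalent to minimizing the KL divergence, and the ELBO only needs the likelihood, the prior, and q itself.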

Sequential Monte Carlo

  • Particle filters are the standard example.
  • Works on streaming data, since the posterior is updated one observation at a time.
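
A minimal sketch of a bootstrap particle filter, assuming a hypothetical 1-D model that is not from the original note: a Gaussian random-walk state (sd 0.1) observed with Gaussian noise (sd 1.0). Each loop iteration consumes one observation, which is what makes the method suitable for streams.

```python
import math
import random

def bootstrap_particle_filter(observations, n_particles=500,
                              process_sd=0.1, obs_sd=1.0, seed=0):
    """Filtered state means for a 1-D Gaussian random walk observed with noise."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # N(0, 1) prior
    filtered_means = []
    for y in observations:
        # 1. Predict: move each particle through the random-walk transition.
        particles = [x + rng.gauss(0.0, process_sd) for x in particles]
        # 2. Update: weight particles by the Gaussian likelihood of y.
        log_w = [-0.5 * ((y - x) / obs_sd) ** 2 for x in particles]
        top = max(log_w)  # subtract the max before exponentiating for stability
        weights = [math.exp(lw - top) for lw in log_w]
        total = sum(weights)
        filtered_means.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # 3. Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return filtered_means

def rmse(estimates, truth):
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth))

# Simulate a stream from the same assumed model, then filter it.
sim = random.Random(1)
state, truth, observations = 0.0, [], []
for _ in range(100):
    state += sim.gauss(0.0, 0.1)
    truth.append(state)
    observations.append(state + sim.gauss(0.0, 1.0))

filtered = bootstrap_particle_filter(observations)
print(f"filter RMSE: {rmse(filtered, truth):.2f}, "
      f"raw observation RMSE: {rmse(observations, truth):.2f}")
```

Resampling at every step is the simplest choice; practical implementations often resample only when the effective sample size drops.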

Applications

  1. A/B testing: results are easier to interpret than frequentist tests.
  2. Hyperparameter tuning (Bayesian optimization): Gaussian process surrogate + acquisition function.
  3. Hierarchical models: group-level priors.
  4. Time series (state-space models): Kalman filter, particle filter.
  5. Causal inference (Bayesian networks): DAGs.
  6. Drug discovery / clinical trials: small N + strong priors.
  7. Robotics (SLAM): joint estimation of pose and map.
  8. Topic modeling (LDA): Dirichlet priors.

Modern stack

  • Stan: NUTS, mature.
  • PyMC (3 → 4 → 5): Python; the backend moved from Theano (v3) to Aesara (v4) to PyTensor (v5).
  • NumPyro: JAX-based, fast.
  • Pyro: PyTorch + VI.
  • TFP: TensorFlow Probability.
  • Edward2 / blackjax: modular.

💻 Patterns

Coin flip (PyMC)

import arviz as az
import numpy as np
import pymc as pm

# Data: 8 heads, 2 tails
data = np.array([1] * 8 + [0] * 2)

with pm.Model() as model:
    p = pm.Beta('p', alpha=2, beta=2)  # prior on the probability of heads
    obs = pm.Bernoulli('obs', p=p, observed=data)  # likelihood
    trace = pm.sample(2000, return_inferencedata=True)  # NUTS by default

# Posterior plot and summary
az.plot_posterior(trace)
print(az.summary(trace))
# The exact posterior is Beta(10, 4), so the mean of p should be close to 10/14 ≈ 0.71.
# The hdi_3% and hdi_97% columns bound the 94% highest-density interval.

Hierarchical (group-level)

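import pymc as pm

# Assumed to be defined already: n_groups (int), group_idx (integer array mapping
# each observation to its group), and data (array of observations).
with pm.Model() as h:
    # Hyperpriors shared by all groups
    mu = pm.Normal('mu', 0, 10)
    sigma = pm.HalfNormal('sigma', 5)

    # Group-level effects drawn from the shared distribution
    theta = pm.Normal('theta', mu, sigma, shape=n_groups)

    # Likelihood (observation sd fixed at 1)
    y = pm.Normal('y', theta[group_idx], 1, observed=data)

    trace = pm.sample(2000)

→ Partial pooling: groups with small N borrow strength from the other groups through the shared prior.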
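
The "borrowing strength" effect can be seen in closed form if the hyperparameters are treated as known, which is a simplifying assumption (the PyMC model above infers them). The posterior mean of a group effect is then a precision-weighted average of the group's sample mean and the population mean. The function name and the numbers below are hypothetical.

```python
def partial_pooling_mean(group_mean, n, mu, tau, sigma=1.0):
    """Posterior mean of a group effect when mu, tau, and sigma are treated as known.

    group_mean: sample mean of the group's observations
    n:          number of observations in the group
    mu, tau:    mean and sd of the group-level distribution
    sigma:      observation sd (1.0, as in the hierarchical model above)
    """
    data_precision = n / sigma**2
    prior_precision = 1.0 / tau**2
    weight = data_precision / (data_precision + prior_precision)
    return weight * group_mean + (1.0 - weight) * mu

# Two groups with the same sample mean (2.0) but very different sizes.
small = partial_pooling_mean(group_mean=2.0, n=2, mu=0.0, tau=1.0)
large = partial_pooling_mean(group_mean=2.0, n=200, mu=0.0, tau=1.0)
print(f"n=2:   {small:.3f}")   # 1.333, pulled strongly toward mu
print(f"n=200: {large:.3f}")   # 1.990, dominated by its own data
```

The small group is shrunk toward the population mean far more than the large one, which is the behavior the hierarchical prior is meant to produce.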

Bayesian A/B test

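import pymc as pm

# Assumed to be defined already: n_a, n_b (visitors) and conv_a, conv_b (conversions).
with pm.Model() as ab:
    p_a = pm.Beta('p_a', 1, 1)  # uniform priors on the conversion rates
    p_b = pm.Beta('p_b', 1, 1)

    obs_a = pm.Binomial('obs_a', n=n_a, p=p_a, observed=conv_a)
    obs_b = pm.Binomial('obs_b', n=n_b, p=p_b, observed=conv_b)

    diff = pm.Deterministic('diff', p_b - p_a)  # tracked so it appears in the trace

    trace = pm.sample(2000)

# P(B > A): fraction of posterior draws in which B's rate exceeds A's
prob_b_better = (trace.posterior['diff'] > 0).mean().item()
print(f'P(B > A) = {prob_b_better:.3f}')

→ P(B > A) is a direct, decision-ready quantity, which is often easier to act on than a p-value.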
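
Because a Beta(1, 1) prior is conjugate to the binomial likelihood, each posterior here is Beta(1 + conversions, 1 + non-conversions), so P(B > A) can also be estimated by sampling the two posteriors directly, without MCMC. The sketch below uses only the standard library; the function name and the conversion counts are hypothetical.

```python
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=50_000, seed=0):
    """Monte Carlo estimate of P(p_b > p_a) under independent Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)  # posterior draw for A
        p_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)  # posterior draw for B
        wins += p_b > p_a
    return wins / draws

# Hypothetical experiment: 120/1000 conversions for A, 150/1000 for B.
p_better = prob_b_beats_a(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"P(B > A) ≈ {p_better:.3f}")
```

This shortcut only applies while the model stays conjugate; covariates or hierarchical structure bring back the need for MCMC or VI.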

Variational inference (faster)

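import jax
import jax.numpy as jnp
import numpyro
import numpyro.distributions as dist
from numpyro import optim
from numpyro.infer import SVI, Trace_ELBO
from numpyro.infer.autoguide import AutoNormal

# Same coin-flip data as in the PyMC example: 8 heads, 2 tails
data = jnp.array([1] * 8 + [0] * 2)

def model(data):
    p = numpyro.sample('p', dist.Beta(2, 2))
    numpyro.sample('obs', dist.Bernoulli(p), obs=data)

guide = AutoNormal(model)  # mean-field Gaussian in unconstrained space
svi = SVI(model, guide, optim.Adam(0.01), Trace_ELBO())
state = svi.init(jax.random.PRNGKey(0), data)
for step in range(2000):
    state, loss = svi.update(state, data)  # one optimizer step on the negative ELBO

params = svi.get_params(state)  # fitted variational parameters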

Bayesian Optimization (hyperparameter)

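from skopt import gp_minimize
from skopt.space import Integer, Real

def objective(params):
    lr, depth = params
    # train_and_eval is a user-supplied function (not defined here) that trains
    # a model and returns the validation loss to minimize.
    return train_and_eval(lr, depth)

result = gp_minimize(
    objective,
    [Real(1e-5, 1e-1, prior='log-uniform', name='lr'),
     Integer(1, 10, name='depth')],
    n_calls=50,
)
print(result.x, result.fun)  # best parameters found and their loss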

Posterior predictive check

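import arviz as az
import pymc as pm

# model and trace come from the coin-flip example.
with model:
    # Adds a posterior_predictive group to the existing InferenceData.
    pm.sample_posterior_predictive(trace, extend_inferencedata=True)

# Compare simulated data with the observed data as a visual check of model fit.
az.plot_ppc(trace)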
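
What a posterior predictive check does can be sketched without PyMC for the coin-flip model, under the assumption that the posterior is the exact Beta(10, 4) implied by the Beta(2, 2) prior and 8 heads / 2 tails. The test statistic (number of heads) and the number of replications are arbitrary choices for illustration.

```python
import random

observed = [1] * 8 + [0] * 2
rng = random.Random(0)

def replicate_heads(n_flips):
    """Simulate one replicated data set and return its number of heads."""
    p = rng.betavariate(10, 4)  # one draw from the posterior Beta(2 + 8, 2 + 2)
    return sum(rng.random() < p for _ in range(n_flips))

replicated = [replicate_heads(len(observed)) for _ in range(5_000)]

# Posterior predictive p-value: how often replicated data are at least as
# extreme as the observed count of heads.
p_value = sum(r >= sum(observed) for r in replicated) / len(replicated)
print(f"P(heads_rep >= {sum(observed)}) ≈ {p_value:.2f}")
```

A value near 0 or 1 would suggest the model cannot reproduce the observed statistic; a value in the middle is consistent with an adequate fit for that statistic.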

🤔 Decision criteria

| Situation | Method |
|---|---|
| Small data + prior | Conjugate (analytical) |
| Complex model + accuracy | NUTS (PyMC / Stan) |
| Large data + speed | VI (Pyro / NumPyro) |
| Streaming | Particle filter |
| Hyperparameter tuning | BO (skopt / Optuna) |
| A/B test | Beta-Binomial + Bayes |
| Topic modeling | LDA |
| Causal | Bayesian network |

Default: PyMC + NUTS as the baseline. Move to NumPyro / VI when scale demands it.

🔗 Graph

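🤖 LLM usage

When to use: small data with prior knowledge; uncertainty quantification; hierarchical structure; hyperparameter tuning.
When not to use: large data where speed matters more than accuracy; cases where a simple frequentist analysis is sufficient.

Anti-patterns

  • Improper prior: the posterior may not be a valid (normalizable) distribution.
  • No PPC: no way to know whether the model fits.
  • Running a single MCMC chain: convergence problems cannot be detected.
  • Ignoring burn-in (warm-up): biased estimates.
  • Forcing conjugacy: leads to choosing the wrong likelihood.
  • Over-confident VI (mean-field): underestimates uncertainty.
  • Ignoring R-hat: non-convergence goes unnoticed.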
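
To illustrate why several chains are needed and what R-hat measures, here is a sketch of the classic Gelman-Rubin statistic using only the standard library. This is the original non-split formula, shown as a simplification; ArviZ's `az.rhat` uses a more robust rank-normalized split variant. The synthetic chains are hypothetical stand-ins for sampler output.

```python
import random
import statistics

def gelman_rubin(chains):
    """Classic (non-split) R-hat for equal-length chains of one scalar parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n  # pooled estimate of posterior variance
    return (pooled / within) ** 0.5

rng = random.Random(0)

# Four chains exploring the same distribution: R-hat should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]

# Four chains stuck in two different regions: R-hat should be well above 1.
stuck = [[rng.gauss(center, 1.0) for _ in range(1000)]
         for center in (0.0, 0.0, 3.0, 3.0)]

print(f"mixed chains: R-hat = {gelman_rubin(mixed):.3f}")
print(f"stuck chains: R-hat = {gelman_rubin(stuck):.3f}")
```

With a single chain the between-chain variance is undefined, so the statistic cannot be computed at all, which is the point of the single-chain anti-pattern above.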

🧪 Verification / duplicates

🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: Bayes formula, MCMC / VI, and PyMC / NumPyro / skopt code |