---
id: wiki-2026-0508-bayesian-inference
title: Bayesian Inference
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Bayesian Inference, Bayesian Statistics, Posterior Inference]
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: [statistics, ml, probabilistic, mcmc]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: pymc,numpyro,stan
---

# Bayesian Inference

## One-line summary

> **"prior × likelihood ∝ posterior: a belief updated by evidence"**. Originated with Bayes (1763); frequentist methods dominated the 20th century; as of 2026, NumPyro/PyMC, GPU MCMC, and variational inference are mainstream, and the approach underpins LLM uncertainty quantification.

## Core concepts

### Bayes rule

P(θ|D) = P(D|θ) P(θ) / P(D)

- **P(θ)**: prior, the belief held before seeing the data.
- **P(D|θ)**: likelihood, how well the model with parameters θ explains the data.
- **P(θ|D)**: posterior, the updated belief.
- **P(D)**: evidence (marginal likelihood), the normalizing constant.

### Four inference methods

- **Conjugate**: closed-form update (Beta-Bernoulli · Gaussian-Gaussian).
- **MCMC**: HMC · NUTS · Gibbs. Asymptotically exact · slow.
- **Variational (VI)**: approximates the posterior with a simpler family. Fast · biased.
- **Sequential MC**: particle filter, for streaming data · state-space models.

### Applications

1. A/B testing (a Bayesian alternative to frequentist tests).
2. Hierarchical models (multi-level grouping).
3. ML calibration · uncertainty (BNN · Gaussian process).
4. LLM logit calibration · RAG confidence.
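
### Worked example: Bayes rule on a grid

To make the Bayes rule above concrete, here is a minimal sketch that evaluates prior × likelihood on a grid of θ values and normalizes numerically, which plays the role of dividing by P(D). The function name `grid_posterior`, the 7-successes-in-10-trials data, and the grid size are illustrative assumptions, not part of any library. For a Beta(1, 1) prior the result can be compared against the conjugate closed form Beta(1 + 7, 1 + 3).

```python
import numpy as np

def grid_posterior(successes, trials, prior_a=1.0, prior_b=1.0, n_grid=1000):
    """Grid approximation of P(theta | D) for a Bernoulli success rate theta."""
    # Midpoints of n_grid equal cells in (0, 1); avoids log(0) at the endpoints.
    theta = (np.arange(n_grid) + 0.5) / n_grid
    # log P(theta): Beta(prior_a, prior_b) prior, up to an additive constant
    log_prior = (prior_a - 1) * np.log(theta) + (prior_b - 1) * np.log1p(-theta)
    # log P(D | theta): Bernoulli likelihood of the observed counts
    log_lik = successes * np.log(theta) + (trials - successes) * np.log1p(-theta)
    log_post = log_prior + log_lik
    # Subtract the max for numerical stability, then normalize (the P(D) step)
    weights = np.exp(log_post - log_post.max())
    weights /= weights.sum()
    return theta, weights

theta, w = grid_posterior(successes=7, trials=10)
post_mean = float(np.sum(theta * w))

# Conjugate closed form for comparison: mean of Beta(1 + 7, 1 + 3)
closed_form_mean = (1 + 7) / (2 + 10)
print(post_mean, closed_form_mean)
```

The grid approach only scales to one or two parameters, which is why the methods listed above (conjugate updates, MCMC, VI, SMC) exist.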
## 💻 Patterns

### NumPyro: Bayesian linear regression (NUTS)

```python
import jax.numpy as jnp
import jax.random as random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS

# X: (n, p) design matrix, y: (n,) targets; both assumed to be defined already.
def model(X, y=None):
    alpha = numpyro.sample("alpha", dist.Normal(0., 10.))                  # intercept
    beta = numpyro.sample("beta", dist.Normal(jnp.zeros(X.shape[1]), 1.))  # one slope per feature
    sigma = numpyro.sample("sigma", dist.HalfNormal(1.))                   # noise scale
    mu = alpha + X @ beta
    numpyro.sample("obs", dist.Normal(mu, sigma), obs=y)

mcmc = MCMC(NUTS(model), num_warmup=1000, num_samples=2000, num_chains=4)
mcmc.run(random.PRNGKey(0), X, y)
mcmc.print_summary()  # includes n_eff and r_hat per parameter
```

### PyMC: Bayesian A/B test (Beta-Binomial)

```python
import pymc as pm

# n_a, n_b: trial counts; conv_a, conv_b: conversion counts (assumed to be defined already).
with pm.Model():
    p_a = pm.Beta("p_a", alpha=1, beta=1)  # uniform prior on each conversion rate
    p_b = pm.Beta("p_b", alpha=1, beta=1)
    pm.Binomial("obs_a", n=n_a, p=p_a, observed=conv_a)
    pm.Binomial("obs_b", n=n_b, p=p_b, observed=conv_b)
    diff = pm.Deterministic("lift", p_b - p_a)
    idata = pm.sample(2000, tune=1000, chains=4)

# Posterior probability that B converts better than A
prob_b_better = (idata.posterior["lift"] > 0).mean().item()
```

### Hierarchical model (varying intercept)

```python
# Reuses the numpyro / dist imports from the linear-regression pattern.
# X: (n,) single predictor; group_idx: (n,) integer group label per row.
def hierarchical(X, group_idx, y=None, n_groups=10):
    mu_a = numpyro.sample("mu_a", dist.Normal(0., 5.))        # population-level mean intercept
    sigma_a = numpyro.sample("sigma_a", dist.HalfNormal(1.))  # between-group spread
    a = numpyro.sample("a", dist.Normal(mu_a, sigma_a).expand([n_groups]))  # per-group intercepts
    beta = numpyro.sample("beta", dist.Normal(0., 1.))
    sigma = numpyro.sample("sigma", dist.HalfNormal(1.))
    mu = a[group_idx] + beta * X
    numpyro.sample("obs", dist.Normal(mu, sigma), obs=y)
```

### Variational inference (SVI)

```python
import jax.random as random
import numpyro
from numpyro.infer import SVI, Trace_ELBO
from numpyro.infer.autoguide import AutoNormal

# `model`, X, y are the ones from the linear-regression pattern.
guide = AutoNormal(model)  # mean-field Gaussian approximation
svi = SVI(model, guide, numpyro.optim.Adam(1e-3), Trace_ELBO())
svi_result = svi.run(random.PRNGKey(0), 2000, X, y)  # 2000 optimization steps
params = svi_result.params  # variational parameters of the guide
```

### Conjugate update (Beta-Bernoulli online)

```python
from scipy import stats

class BetaBernoulli:
    def __init__(self, alpha=1, beta=1):
        self.alpha, self.beta = alpha, beta  # Beta(1, 1) is the uniform prior

    def update(self, success: bool):
        # Each observation adds one pseudo-count to the matching parameter
        self.alpha += int(success)
        self.beta += int(not success)

    def mean(self):
        return self.alpha / (self.alpha + self.beta)

    def credible_interval(self, q=0.95):
        # Equal-tailed interval containing probability mass q
        return stats.beta.interval(q, self.alpha, self.beta)
```

### Bayesian neural net (Pyro Bayesian layer)

```python
import torch
import pyro
import pyro.distributions as pyro_dist  # Pyro's distributions, not NumPyro's `dist`
import pyro.nn as pnn

class BNN(pnn.PyroModule):
    def __init__(self, in_d, out_d):
        super().__init__()
        self.linear = pnn.PyroModule[torch.nn.Linear](in_d, out_d)
        # Replace the point-estimate weights and bias with Normal(0, 1) priors
        self.linear.weight = pnn.PyroSample(pyro_dist.Normal(0., 1.).expand([out_d, in_d]).to_event(2))
        self.linear.bias = pnn.PyroSample(pyro_dist.Normal(0., 1.).expand([out_d]).to_event(1))

    def forward(self, x, y=None):
        mean = self.linear(x).squeeze(-1)  # assumes out_d == 1
        with pyro.plate("data", x.shape[0]):
            pyro.sample("obs", pyro_dist.Normal(mean, 0.1), obs=y)  # fixed observation noise
        return mean
```

## Decision criteria

| Situation | Method |
|---|---|
| Conjugate model · streaming | Conjugate update |
| Small model · accurate posterior | NUTS/HMC |
| Large data · fast approximation | SVI · ADVI |
| State-space · time series | Particle filter |
| Deep model · scale | BNN + variational · MC dropout |
| Hyperparameter optimization | Gaussian process + acquisition function |

**Default**: NumPyro + NUTS, which is fast on GPU via JAX.

## 🔗 Graph

- Parents: [[Probability Theory]] · [[Statistical Inference]]
- Variants: [[MCMC]] · [[Variational Inference]] · [[Empirical Bayes]]
- Applications: [[Belief-System]] · [[Bayesian A/B Test]] · [[Bayesian Neural Network]]
- Adjacent: [[Frequentist Inference]] · [[Causal Inference]]

## 🤖 LLM usage

**When to use**: drafting prior and likelihood specifications, interpreting posterior plots, generating NumPyro/PyMC code.

**When not to use**: convergence diagnostics (R-hat · ESS · trace plots). LLM statistical judgment is unreliable here, so a statistician's review is required.

## ❌ Anti-patterns

- **Always using a flat prior**: weak data + flat prior → unstable posterior. Use a weakly informative prior.
- **No convergence check**: R-hat > 1.01 or ESS < 400 → the posterior samples cannot be trusted.
- **Single-chain MCMC**: run multiple chains; checking that they mix is mandatory.
- **Posterior point estimate only**: use the whole distribution (credible intervals · posterior predictive).

## 🧪 Verification / Duplicates

- Verified (Gelman *Bayesian Data Analysis* 3rd, NumPyro/PyMC/Stan docs, McElreath *Statistical Rethinking* 2nd).
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: Bayes rule + 4 methods + NumPyro/PyMC patterns |