id: wiki-2026-0508-bayesian-inference
title: Bayesian Inference
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Bayesian Inference, Bayesian Statistics, Posterior Inference
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: statistics, ml, probabilistic, mcmc
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language = python; framework = pymc, numpyro, stan

Bayesian Inference

One-line summary

"Posterior ∝ prior × likelihood: a belief updated by evidence." Bayesian inference originated with Bayes (1763) and was overshadowed by frequentist methods for most of the 20th century. By 2026 it is mainstream through NumPyro/PyMC, GPU-accelerated MCMC, and variational inference, and it underpins LLM uncertainty quantification.

Core concepts

Bayes' rule

P(θ|D) = P(D|θ) P(θ) / P(D)

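  • P(θ): prior, the belief about θ before seeing the data.
  • P(D|θ): likelihood, how well the model with parameters θ explains the data.
  • P(θ|D): posterior, the updated belief after seeing the data.
  • P(D): evidence (marginal likelihood), the normalizing constant ∫ P(D|θ) P(θ) dθ.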
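
As a worked illustration of the rule above, the posterior for a coin's bias θ can be approximated on a grid. This is a minimal sketch: the function name `grid_posterior`, the flat prior, and the 7-successes / 3-failures data are all assumptions chosen for the example, not part of the original note.

```python
def grid_posterior(successes, failures, n_grid=101):
    """Grid approximation of P(theta | D) for a Bernoulli likelihood and a flat prior."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]            # candidate theta values in [0, 1]
    prior = [1.0] * n_grid                                      # P(theta): flat (assumption)
    likelihood = [t ** successes * (1 - t) ** failures for t in grid]  # P(D | theta)
    unnormalized = [p * l for p, l in zip(prior, likelihood)]   # numerator of Bayes' rule
    normalizer = sum(unnormalized)                              # grid stand-in for the evidence P(D)
    posterior = [u / normalizer for u in unnormalized]          # P(theta | D), sums to 1 on the grid
    return grid, posterior


# Hypothetical data: 7 successes and 3 failures.
grid, post = grid_posterior(successes=7, failures=3)
post_mean = sum(t * p for t, p in zip(grid, post))
map_theta = grid[max(range(len(post)), key=post.__getitem__)]
```

With a flat prior the exact posterior is Beta(8, 4), so the grid mean should land close to 8/12 and the grid mode at 0.7. Grid approximation only scales to a handful of parameters, which is why the methods in the next section exist.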

Four inference methods

  • Conjugate: closed-form posterior (Beta-Bernoulli, Gaussian-Gaussian).
  • MCMC: HMC, NUTS, Gibbs. Asymptotically exact, but slow.
  • Variational (VI): approximates the posterior with a simpler family. Fast, but biased.
  • Sequential MC: particle filters, for streaming data and state-space models.
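
To make the conjugate-versus-MCMC contrast concrete, the sketch below samples a Beta-Bernoulli posterior with a random-walk Metropolis sampler and compares it to the closed-form answer. Random-walk Metropolis is swapped in here because it fits in a few lines; it is a simpler relative of the HMC/NUTS samplers named above, not a substitute for them. The function names, step size, and data are assumptions for illustration.

```python
import math
import random


def log_posterior(theta, successes, failures):
    """Unnormalized log posterior: Bernoulli likelihood with a flat prior on (0, 1)."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior mass outside the support
    return successes * math.log(theta) + failures * math.log(1.0 - theta)


def metropolis(successes, failures, n_samples=20000, step=0.2, seed=0):
    """Random-walk Metropolis with a Gaussian proposal of standard deviation `step`."""
    rng = random.Random(seed)
    theta = 0.5
    logp = log_posterior(theta, successes, failures)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        logp_new = log_posterior(proposal, successes, failures)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            theta, logp = proposal, logp_new
        samples.append(theta)
    return samples


# Hypothetical data: 7 successes, 3 failures.
draws = metropolis(7, 3)[2000:]          # drop the first 2000 draws as warm-up
mcmc_mean = sum(draws) / len(draws)
exact_mean = (1 + 7) / (1 + 7 + 1 + 3)   # mean of the conjugate posterior Beta(8, 4)
```

The two means should agree to roughly two decimal places. The conjugate result costs one line of arithmetic, while the sampler needs thousands of likelihood evaluations, which is the "closed-form" versus "asymptotically exact, but slow" trade-off from the list.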

Applications

  1. A/B testing (a Bayesian alternative to frequentist tests).
  2. Hierarchical models (multi-level grouping).
  3. ML calibration and uncertainty (BNN, Gaussian process).
  4. LLM logit calibration and RAG confidence.

💻 Patterns

NumPyro: Bayesian linear regression (NUTS)

import numpyro
import numpyro.distributions as dist
from numpyro.infer import NUTS, MCMC
import jax.numpy as jnp
import jax.random as random

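# X: (n, d) feature matrix, y: (n,) targets. Both are assumed to be defined by the caller.
def model(X, y=None):
    alpha = numpyro.sample("alpha", dist.Normal(0., 10.))                    # intercept
    beta  = numpyro.sample("beta",  dist.Normal(jnp.zeros(X.shape[1]), 1.))  # one weight per feature
    sigma = numpyro.sample("sigma", dist.HalfNormal(1.))                     # observation noise scale
    mu    = alpha + X @ beta
    numpyro.sample("obs", dist.Normal(mu, sigma), obs=y)

# With a single device, NumPyro runs the 4 chains sequentially unless
# numpyro.set_host_device_count(4) is called before JAX is initialized.
mcmc = MCMC(NUTS(model), num_warmup=1000, num_samples=2000, num_chains=4)
mcmc.run(random.PRNGKey(0), X, y)
mcmc.print_summary()  # reports n_eff and r_hat per parameter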

PyMC: Bayesian A/B test (Beta-Bernoulli)

import pymc as pm

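# n_a, n_b: trial counts; conv_a, conv_b: conversion counts (assumed defined by the caller).
with pm.Model():
    p_a = pm.Beta("p_a", alpha=1, beta=1)  # uniform prior on each conversion rate
    p_b = pm.Beta("p_b", alpha=1, beta=1)
    pm.Binomial("obs_a", n=n_a, p=p_a, observed=conv_a)
    pm.Binomial("obs_b", n=n_b, p=p_b, observed=conv_b)
    pm.Deterministic("lift", p_b - p_a)    # absolute lift of B over A
    idata = pm.sample(2000, tune=1000, chains=4)

# Posterior probability that B converts better than A.
prob_b_better = (idata.posterior["lift"] > 0).mean().item()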
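
Because the Beta prior is conjugate to the binomial likelihood, the same comparison can also be sketched without MCMC by drawing directly from the two Beta posteriors. This is an illustrative alternative using only the standard library; the function name `prob_b_beats_a` and the conversion counts are hypothetical.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, prior=(1.0, 1.0), seed=0):
    """Monte Carlo estimate of P(p_b > p_a | data) under independent Beta priors."""
    rng = random.Random(seed)
    a0, b0 = prior
    wins = 0
    for _ in range(draws):
        # Conjugate posterior for each arm: Beta(a0 + conversions, b0 + non-conversions).
        p_a = rng.betavariate(a0 + conv_a, b0 + n_a - conv_a)
        p_b = rng.betavariate(a0 + conv_b, b0 + n_b - conv_b)
        wins += p_b > p_a
    return wins / draws


# Hypothetical experiment: 120/1000 conversions for A, 150/1000 for B.
p_better = prob_b_beats_a(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
# Sanity check: identical data for both arms should give a probability near 0.5.
p_equal = prob_b_beats_a(conv_a=100, n_a=1000, conv_b=100, n_b=1000)
```

For a plain two-arm test this avoids sampler tuning and convergence checks entirely. The PyMC model becomes the better fit once the model stops being conjugate, for example with covariates or a hierarchical prior across arms.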

Hierarchical model (varying intercept)

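# Reuses the numpyro and dist imports from the regression example.
# group_idx: integer array mapping each observation to its group (0 .. n_groups - 1).
def hierarchical(X, group_idx, y=None, n_groups=10):
    mu_a    = numpyro.sample("mu_a",    dist.Normal(0., 5.))      # population-level mean intercept
    sigma_a = numpyro.sample("sigma_a", dist.HalfNormal(1.))      # spread of intercepts across groups
    a       = numpyro.sample("a", dist.Normal(mu_a, sigma_a).expand([n_groups]))  # per-group intercepts
    beta    = numpyro.sample("beta", dist.Normal(0., 1.))         # slope shared by all groups
    sigma   = numpyro.sample("sigma", dist.HalfNormal(1.))
    mu      = a[group_idx] + beta * X
    numpyro.sample("obs", dist.Normal(mu, sigma), obs=y)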

Variational inference (SVI)

from numpyro.infer import SVI, Trace_ELBO
from numpyro.infer.autoguide import AutoNormal

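guide = AutoNormal(model)  # mean-field Gaussian approximation to the posterior
svi   = SVI(model, guide, numpyro.optim.Adam(1e-3), Trace_ELBO())
state = svi.init(random.PRNGKey(0), X, y)

for i in range(2000):
    state, loss = svi.update(state, X, y)  # one optimizer step; loss is the negative ELBO
params = svi.get_params(state)  # fitted variational parameters (location and scale per site)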

Conjugate update (Beta-Bernoulli online)

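from scipy.stats import beta as beta_dist

class BetaBernoulli:
    """Online conjugate update: Beta(alpha, beta) prior with Bernoulli observations."""
    def __init__(self, alpha=1, beta=1):
        self.alpha, self.beta = alpha, beta

    def update(self, success: bool):
        # A success increments alpha; a failure increments beta.
        self.alpha += int(success)
        self.beta  += int(not success)

    def mean(self):
        return self.alpha / (self.alpha + self.beta)

    def credible_interval(self, q=0.95):
        # Equal-tailed interval containing posterior mass q.
        return beta_dist.interval(q, self.alpha, self.beta)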

Bayesian neural net (Pyro Bayesian layer)

import torch
import pyro
import pyro.distributions as dist  # Pyro's distributions, not numpyro.distributions
import pyro.nn as pnn

class BNN(pnn.PyroModule):
    def __init__(self, in_d, out_d):
        super().__init__()
        self.linear = pnn.PyroModule[torch.nn.Linear](in_d, out_d)
        # Replace the point-estimate weights and bias with standard normal priors.
        self.linear.weight = pnn.PyroSample(dist.Normal(0., 1.).expand([out_d, in_d]).to_event(2))
        self.linear.bias   = pnn.PyroSample(dist.Normal(0., 1.).expand([out_d]).to_event(1))

    def forward(self, x, y=None):
        mean = self.linear(x).squeeze(-1)  # assumes out_d == 1 (scalar regression target)
        with pyro.plate("data", x.shape[0]):
            pyro.sample("obs", dist.Normal(mean, 0.1), obs=y)  # fixed observation noise of 0.1
        return mean

Decision criteria

| Situation | Method |
| --- | --- |
| Conjugate model, streaming updates | Conjugate update |
| Small model, accurate posterior needed | NUTS/HMC |
| Large data, fast approximation | SVI, ADVI |
| State-space, time series | Particle filter |
| Deep model, scale | BNN + variational, MC dropout |
| Hyperparameter optimization | Gaussian process + acquisition function |

Default: NumPyro + NUTS, which runs fast on GPU through JAX.
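
The particle filter recommended for state-space problems is the one method without a pattern in this note, so here is a minimal bootstrap particle filter for a 1-D random walk observed with Gaussian noise. It is a sketch under stated assumptions: the model, the noise levels, the particle count, and the function names are all chosen for illustration.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles=500,
                              process_sd=0.1, obs_sd=1.0, seed=0):
    """Filtered means for x_t = x_{t-1} + N(0, process_sd^2), y_t = x_t + N(0, obs_sd^2)."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # draws from the prior on x_0
    means = []
    for y in observations:
        # 1. Propagate each particle through the state transition.
        particles = [x + rng.gauss(0.0, process_sd) for x in particles]
        # 2. Weight each particle by the likelihood of the new observation.
        weights = [math.exp(-0.5 * ((y - x) / obs_sd) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:  # every weight underflowed: fall back to uniform weights
            weights, total = [1.0] * n_particles, float(n_particles)
        weights = [w / total for w in weights]
        means.append(sum(w * x for w, x in zip(weights, particles)))
        # 3. Resample in proportion to the weights (multinomial resampling).
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means


def rmse(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


# Simulate a hypothetical latent random walk and its noisy observations.
data_rng = random.Random(1)
truth, obs, x = [], [], 0.0
for _ in range(200):
    x += data_rng.gauss(0.0, 0.1)
    truth.append(x)
    obs.append(x + data_rng.gauss(0.0, 1.0))

estimates = bootstrap_particle_filter(obs)
```

Comparing `rmse(estimates, truth)` with `rmse(obs, truth)` shows how much the filter denoises the stream. Each observation is processed once and then discarded, which is what makes the method suit streaming data.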

🔗 Graph

🤖 Using LLMs

When to use: drafting prior and likelihood specifications, interpreting posterior plots, and generating NumPyro/PyMC code. When not to use: convergence diagnostics (R-hat, ESS, trace plots). LLM statistical judgment is unreliable here, so a statistician's review is required.

Anti-patterns

  • Always using a flat prior: weak data plus a flat prior gives an unstable posterior. Use a weakly informative prior instead.
  • No convergence check: with R-hat > 1.01 or ESS < 400, the posterior estimates should not be trusted.
  • Single-chain MCMC: running multiple chains and checking that they mix is mandatory.
  • Reporting only a posterior point estimate: use the whole distribution (credible intervals, posterior predictive).
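
The convergence check in the list above can be sketched with the classic Gelman-Rubin R-hat, which compares between-chain and within-chain variance. This is a simplified version: current diagnostics such as ArviZ's default add chain splitting and rank normalization. The function name and the synthetic chains are assumptions for illustration.

```python
import math
import random
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Classic (non-split) R-hat for a list of equal-length chains of one scalar parameter."""
    n = len(chains[0])
    within = mean([variance(c) for c in chains])           # W: average within-chain variance
    between_over_n = variance([mean(c) for c in chains])   # B / n: variance of the chain means
    var_plus = (n - 1) / n * within + between_over_n       # pooled estimate of the posterior variance
    return math.sqrt(var_plus / within)


rng = random.Random(0)
# Four synthetic chains that all target the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Four synthetic chains where two are stuck around a different mode (not mixed).
stuck = [[rng.gauss(shift, 1.0) for _ in range(1000)] for shift in (0.0, 0.0, 3.0, 3.0)]

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
```

The well-mixed chains should give a value very close to 1, while the stuck chains give a value far above the 1.01 threshold. The statistic needs at least two chains, which is one reason single-chain MCMC appears in the list.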

🧪 Verification / duplicates

  • Verified against Gelman et al., Bayesian Data Analysis (3rd ed.); the NumPyro, PyMC, and Stan docs; and McElreath, Statistical Rethinking (2nd ed.).
  • Trust level: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: Bayes rule + 4 methods + NumPyro/PyMC patterns |