---
id: wiki-2026-0508-statistical-analysis
title: Statistical Analysis
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Statistics, Inferential Statistics, Data Analysis]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [statistics, hypothesis-testing, regression, bayesian]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python / R
  framework: scipy / statsmodels / pymc / R-tidyverse
---

# Statistical Analysis

## One-line summary
> **"Quantify the uncertainty in data."** From the Fisher–Neyman frequentist framework to Gelman's 2020s Bayesian workflow, the standard practice as of 2026 is to build reproducible inference on a statsmodels + PyMC 5.x + ArviZ pipeline.

## Core concepts

### Two paradigms
- **Frequentist**: parameters are fixed and the data are random. p-values, confidence intervals, MLE.
- **Bayesian**: parameters are random too; prior + likelihood → posterior. Credible intervals, posterior predictive distributions.
- **2026 consensus**: both are tools. Prefer Bayesian for small n or a strong prior, and frequentist for large n or regulated settings.
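
To make the contrast concrete, here is a minimal sketch that computes both kinds of interval for the same binomial proportion. The function name `proportion_intervals`, the flat `Beta(1, 1)` default prior, and the sample counts are illustrative assumptions, not something prescribed by this note.

```python
import numpy as np
from scipy import stats

def proportion_intervals(successes, n, prior_a=1.0, prior_b=1.0, level=0.95):
    """Return (Wilson confidence interval, Beta-posterior credible interval)."""
    alpha = 1.0 - level
    z = stats.norm.ppf(1.0 - alpha / 2.0)
    p_hat = successes / n

    # Frequentist: Wilson score interval (parameter fixed, data random).
    denom = 1.0 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * np.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    wilson = (center - half, center + half)

    # Bayesian: Beta prior + binomial likelihood -> Beta posterior.
    post = stats.beta(prior_a + successes, prior_b + n - successes)
    credible = (post.ppf(alpha / 2.0), post.ppf(1.0 - alpha / 2.0))
    return wilson, credible

# Hypothetical data: 7 successes out of 20 trials.
wilson, credible = proportion_intervals(successes=7, n=20)
print("Wilson CI:        ", wilson)
print("Credible interval:", credible)

# A stronger prior centered on 0.5 pulls the credible interval toward 0.5.
_, credible_strong = proportion_intervals(7, 20, prior_a=50, prior_b=50)
print("Strong-prior CrI: ", credible_strong)
```

With a flat prior and moderate n the two intervals are numerically close; they differ in interpretation, and they diverge once the prior carries real information.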

### Core procedure
- **EDA**: distributions, missing values, outliers, correlation matrix.
- **Hypothesis test**: t-test, χ², Mann-Whitney, permutation. Always report the effect size and CI alongside the p-value.
- **Regression**: OLS → GLM → mixed-effects → hierarchical Bayesian.
- **Model checking**: residual diagnostics, posterior predictive checks, k-fold CV.
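
Of the tests listed above, the permutation test is the easiest to write from scratch, so it makes a useful illustration of what a p-value measures. The sketch below is a plain NumPy version for a difference in means; the function name, the number of permutations, and the toy data are assumptions made for illustration.

```python
import numpy as np

def permutation_test_mean_diff(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    rng = np.random.default_rng(seed)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        # Under the null hypothesis the group labels are exchangeable.
        perm = rng.permutation(pooled)
        diff = perm[:len(a)].mean() - perm[len(a):].mean()
        if abs(diff) >= abs(observed):
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value

# Toy data: two clearly separated groups.
group_a = np.arange(20)
group_b = np.arange(20) + 15
diff, p = permutation_test_mean_diff(group_a, group_b, n_perm=2000)
print(f"mean difference = {diff:.1f}, permutation p = {p:.4f}")
```

The mean difference doubles as an unstandardized effect size here; for a report it would still need a confidence interval next to it.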

### Applications
1. A/B test analysis (web, ML model rollout).
2. Clinical trial efficacy.
3. Causal inference (DiD, IV, RDD, double ML).
4. Risk modeling (insurance, finance).

## 💻 Patterns

### Welch's t-test + effect size + CI (scipy 1.13+)
```python
import numpy as np
from scipy import stats

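def welch_with_effect(a, b):
    # Accept plain lists as well as NumPy arrays.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    t, p = stats.ttest_ind(a, b, equal_var=False)  # Welch's t-test
    n1, n2 = len(a), len(b)
    s1, s2 = a.var(ddof=1), b.var(ddof=1)  # sample variances
    # Cohen's d with the pooled SD (conventional definition; assumes similar variances).
    pooled = np.sqrt(((n1-1)*s1 + (n2-1)*s2) / (n1+n2-2))
    cohen_d = (a.mean() - b.mean()) / pooled
    # Welch-Satterthwaite degrees of freedom for the CI of the mean difference.
    df = (s1/n1 + s2/n2)**2 / ((s1/n1)**2/(n1-1) + (s2/n2)**2/(n2-1))
    se = np.sqrt(s1/n1 + s2/n2)
    crit = stats.t.ppf(0.975, df)  # two-sided 95% critical value
    diff = a.mean() - b.mean()
    return dict(t=t, p=p, d=cohen_d, ci=(diff - crit*se, diff + crit*se))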
```

### OLS regression with diagnostics (statsmodels)
```python
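import statsmodels.formula.api as smf
from statsmodels.stats.diagnostic import het_breuschpagan

# `df` is assumed to be a pandas DataFrame with columns y, x1, x2, group.
# cov_type="HC3" requests heteroskedasticity-robust standard errors.
model = smf.ols("y ~ x1 + x2 + C(group)", data=df).fit(cov_type="HC3")
print(model.summary())

# Diagnostics: Breusch-Pagan test for heteroskedasticity (index 1 is the LM p-value).
bp = het_breuschpagan(model.resid, model.model.exog)
print("Breusch-Pagan p:", bp[1])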
```

### Hierarchical Bayesian (PyMC 5.x)
```python
import pymc as pm
import arviz as az

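# Assumed inputs: x, y (1-D arrays), group_idx (integer group index per row), n_groups.
with pm.Model() as hier:
    # Hyperpriors shared by all group intercepts (partial pooling).
    mu_a = pm.Normal("mu_a", 0, 5)
    sigma_a = pm.HalfNormal("sigma_a", 1)
    a = pm.Normal("a", mu_a, sigma_a, shape=n_groups)  # varying intercepts
    b = pm.Normal("b", 0, 1)  # slope shared across groups
    sigma = pm.HalfNormal("sigma", 1)  # observation noise
    mu = a[group_idx] + b * x
    pm.Normal("y_obs", mu, sigma, observed=y)
    idata = pm.sample(2000, tune=1000, target_accept=0.95)

az.plot_trace(idata)
print(az.summary(idata, var_names=["mu_a", "sigma_a", "b"]))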
```

### Bootstrap CI
```python
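import numpy as np

def bootstrap_ci(data, stat=np.mean, n=10_000, alpha=0.05, rng=None):
    """Percentile bootstrap CI; `stat` must accept an `axis` argument."""
    data = np.asarray(data)
    if rng is None:
        rng = np.random.default_rng(42)  # fixed seed for reproducibility
    # Draw all n resamples at once: shape (n, len(data)), so memory grows as n * len(data).
    boots = stat(rng.choice(data, size=(n, len(data)), replace=True), axis=1)
    lo, hi = np.quantile(boots, [alpha/2, 1-alpha/2])
    return stat(data), (lo, hi)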
```

### Multiple testing correction
```python
from statsmodels.stats.multitest import multipletests
reject, pvals_corr, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
```
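
To show what `method="fdr_bh"` computes, here is a minimal NumPy sketch of the Benjamini-Hochberg step-up adjustment. It is an illustrative reimplementation under the usual independence (or positive dependence) assumption, not the statsmodels source; the function name and the p-values are made up.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Return (reject mask, BH-adjusted p-values) in the original order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # Raw adjustment: p_(i) * m / i for the i-th smallest p-value.
    adj = ranked * m / np.arange(1, m + 1)
    # Enforce monotonicity by taking the running minimum from the largest rank down.
    adj = np.minimum.accumulate(adj[::-1])[::-1]
    adj = np.clip(adj, 0.0, 1.0)
    # Scatter the adjusted values back to the input order.
    out = np.empty(m)
    out[order] = adj
    return out <= alpha, out

# Hypothetical p-values from four tests.
reject, adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(reject)
print(adjusted)
```

Only the first hypothesis survives at a 5% FDR here, although three of the four raw p-values are below 0.05.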

### Causal inference: double machine learning (EconML)
```python
from econml.dml import LinearDML
from sklearn.ensemble import GradientBoostingRegressor

dml = LinearDML(
    model_y=GradientBoostingRegressor(),
    model_t=GradientBoostingRegressor(),
    discrete_treatment=False,
    cv=5,
)
# Y: outcome, T: treatment, X: effect modifiers, W: additional controls (all assumed defined).
dml.fit(Y, T, X=X, W=W)
print(dml.effect(X), dml.effect_interval(X))
```

## Decision criteria
| Situation | Approach |
|---|---|
| 2-group mean compare, normal-ish | Welch's t-test |
| Non-parametric, small n | Mann-Whitney / permutation |
| Multi-level data | Mixed-effects (lme4 / statsmodels) |
| Strong prior, small n | Bayesian (PyMC) |
| Causal effect from observational | DML / IV / RDD |
| Many comparisons | FDR (BH), not Bonferroni unless ≤10 tests |

**Defaults**: statsmodels for frequentist, PyMC 5 + ArviZ for Bayesian, EconML for causal.

## 🔗 Graph
- Parent: [[Probability Theory]]
- Variants: [[Bayesian_Inference|Bayesian Inference]] · [[Causal Inference]]
- Adjacent: [[Machine Learning]] · [[Power Analysis]]

## 🤖 LLM usage
**When to use**: pipeline scaffolding, EDA narrative, model spec translation, generating plot code.
**When not to use**: computing numerical p-values directly; use a library for that. Do not rely on a statistic an LLM may have hallucinated.

## ❌ Anti-patterns
- **p-hacking**: running multiple tests and cherry-picking the significant ones. Pre-registration and multiple-testing correction are required.
- **Confusing CI with PI**: a confidence interval is not a prediction interval. Keep the two clearly separate.
- **HARKing**: hypothesizing after the results are known. Separate exploratory from confirmatory analysis.
- **Naive default prior**: avoid PyMC `Normal(0, 100)`; use a domain-informed, weakly informative prior instead.
- **n=30 rule**: a myth. Decide based on the shape of the distribution.
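
The CI versus PI distinction is easy to see numerically. The sketch below computes both for a sample mean, assuming i.i.d. normal data; the function name and the simulated sample are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def mean_ci_and_pi(sample, level=0.95):
    """Return (confidence interval for the mean, prediction interval for one new value)."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    mean, sd = x.mean(), x.std(ddof=1)
    crit = stats.t.ppf(1 - (1 - level) / 2, df=n - 1)
    ci_half = crit * sd / np.sqrt(n)          # uncertainty about the mean
    pi_half = crit * sd * np.sqrt(1 + 1 / n)  # uncertainty about one new observation
    return (mean - ci_half, mean + ci_half), (mean - pi_half, mean + pi_half)

# Simulated sample: 100 draws from Normal(50, 10).
rng = np.random.default_rng(0)
sample = rng.normal(50, 10, size=100)
ci, pi = mean_ci_and_pi(sample)
print("95% CI for the mean:   ", ci)
print("95% PI for a new value:", pi)
```

The CI shrinks toward zero width as n grows, while the PI never gets narrower than the spread of the data itself, which is why reporting one in place of the other is misleading.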

## 🧪 Verification / duplicates
- Verified (Wasserman "All of Statistics", Gelman BDA3, statsmodels docs 0.14+, PyMC 5.x docs).
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — frequentist + Bayesian + causal patterns |