---
id: wiki-2026-0508-statistics-data-analysis
title: "Statistics & Data Analysis"
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [stats, data analysis, applied statistics]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [statistics, data-analysis, ab-testing, ml, observability]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: numpy-scipy-statsmodels-pymc
---

# Statistics & Data Analysis

## One-liner
> **"Data lies; statistics catches it."** Statistics is the discipline of quantifying uncertainty and separating patterns from noise. The 2026 production standard: Bayesian methods (PyMC, Stan), causal inference (DoWhy, EconML), and CUPED for A/B test variance reduction.

## Core concepts

### Core dichotomy
- **Frequentist**: p-values, confidence intervals; probability as long-run frequency.
- **Bayesian**: posteriors, credible intervals; probability as a belief that is updated with data.
- **2026 trend**: Bayesian methods dominate production analytics (interpretable, safe under sequential looks).
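The two views can be compared on the same data. The sketch below uses hypothetical numbers (48 conversions out of 400 visitors, not from any real experiment) and computes a frequentist Wald confidence interval next to a Bayesian credible interval from a Beta(1, 1) prior. Both choices are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical data: 48 conversions out of 400 visitors.
k, n = 48, 400
p_hat = k / n

# Frequentist: 95% Wald confidence interval (normal approximation).
se = np.sqrt(p_hat * (1 - p_hat) / n)
z = stats.norm.ppf(0.975)
wald_ci = (p_hat - z * se, p_hat + z * se)

# Bayesian: Beta(1, 1) prior + binomial likelihood
# gives a Beta(1 + k, 1 + n - k) posterior.
posterior = stats.beta(1 + k, 1 + n - k)
cred_int = posterior.interval(0.95)  # equal-tailed 95% credible interval

print(f"p_hat = {p_hat:.3f}")
print(f"95% confidence interval: ({wald_ci[0]:.3f}, {wald_ci[1]:.3f})")
print(f"95% credible interval:   ({cred_int[0]:.3f}, {cred_int[1]:.3f})")
```

With a flat prior and this much data the two intervals are numerically close; the difference is in what they claim. The confidence interval describes the long-run coverage of the procedure, the credible interval describes the posterior probability that the rate lies in the range.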

### Must-know toolkit
- **Hypothesis tests**: t-test, Mann-Whitney, χ², Fisher exact.
- **Regression**: OLS, GLM (logistic, Poisson), mixed-effects.
- **Causal**: difference-in-differences, IV, RDD, synthetic control.
- **A/B**: CUPED, sequential testing (mSPRT), multi-armed bandits.
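As an illustration of the bandit entry in the list above, here is a minimal Beta-Bernoulli Thompson sampling sketch. The function name `thompson_sampling`, the conversion rates, and the round count are hypothetical and chosen only for the demo.

```python
import numpy as np

def thompson_sampling(true_rates, n_rounds=5000, seed=0):
    """Beta-Bernoulli Thompson sampling; returns the number of pulls per arm."""
    rng = np.random.default_rng(seed)
    n_arms = len(true_rates)
    successes = np.zeros(n_arms)
    failures = np.zeros(n_arms)
    for _ in range(n_rounds):
        # One draw per arm from its Beta posterior (Beta(1, 1) prior).
        samples = rng.beta(successes + 1, failures + 1)
        arm = int(np.argmax(samples))
        # Simulated Bernoulli reward from the (normally unknown) true rate.
        reward = float(rng.random() < true_rates[arm])
        successes[arm] += reward
        failures[arm] += 1.0 - reward
    return successes + failures

# Hypothetical conversion rates for three arms.
pulls = thompson_sampling([0.04, 0.05, 0.10])
print(pulls)
```

Traffic concentrates on the best arm as its posterior sharpens, which is the exploration/exploitation trade-off a fixed 50/50 split does not make.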

### Applications
1. Product A/B testing (CUPED + sequential).
2. SRE: anomaly detection on metrics.
3. Risk scoring of SAST/SCA findings (Bayesian prior).

## 💻 Patterns

### Welch t-test (A/B)
```python
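import numpy as np
from scipy import stats
control = np.array([...])    # placeholder: per-user metric values, control arm
treatment = np.array([...])  # placeholder: per-user metric values, treatment arm
# Welch's t-test (unequal variances); argument order makes t follow treatment - control
t, p = stats.ttest_ind(treatment, control, equal_var=False)
diff = treatment.mean() - control.mean()
# Welch standard error and Welch-Satterthwaite degrees of freedom
v_t = treatment.var(ddof=1) / len(treatment)
v_c = control.var(ddof=1) / len(control)
se = np.sqrt(v_t + v_c)
df = (v_t + v_c)**2 / (v_t**2 / (len(treatment) - 1) + v_c**2 / (len(control) - 1))
ci = stats.t.interval(0.95, df, loc=diff, scale=se)
print(f"Δ={diff:.4f}, p={p:.4f}, 95%CI={ci}")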
```

### CUPED variance reduction
```python
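import numpy as np
def cuped_adjust(y_pre, y_post, theta, pre_mean):
    # Remove the part of the outcome explained by the pre-experiment covariate
    return y_post - theta * (y_pre - pre_mean)
# Estimate theta and the centering mean on both arms pooled, so the same
# adjustment is applied to control and treatment (per-arm centering would
# leave each arm's mean unchanged and bias the comparison)
pre_all = np.concatenate([pre_c, pre_t])
post_all = np.concatenate([post_c, post_t])
theta = np.cov(pre_all, post_all)[0, 1] / np.var(pre_all, ddof=1)
y_adj_c = cuped_adjust(pre_c, post_c, theta, pre_all.mean())
y_adj_t = cuped_adjust(pre_t, post_t, theta, pre_all.mean())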
```
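To see how much variance the adjustment can remove, the following self-contained sketch runs CUPED on synthetic data. The data-generating process (a pre-period metric correlated with the outcome, a true lift of 0.2) and the helper `diff_and_se` are assumptions made for the demo, not taken from a real experiment.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000                      # users per arm (hypothetical)
true_lift = 0.2                 # hypothetical treatment effect

# Pre-period metric, correlated with the post-period outcome.
pre = rng.normal(10.0, 3.0, size=2 * n)
treated = np.repeat([0, 1], n)  # first n users control, last n treatment
post = 0.8 * pre + rng.normal(0.0, 2.0, size=2 * n) + true_lift * treated

# CUPED: theta and the centering mean are estimated on pooled data.
theta = np.cov(pre, post)[0, 1] / np.var(pre, ddof=1)
post_adj = post - theta * (pre - pre.mean())

def diff_and_se(y):
    """Difference in means (treatment - control) and its Welch standard error."""
    y_t, y_c = y[treated == 1], y[treated == 0]
    diff = y_t.mean() - y_c.mean()
    se = np.sqrt(y_t.var(ddof=1) / len(y_t) + y_c.var(ddof=1) / len(y_c))
    return diff, se

raw_diff, raw_se = diff_and_se(post)
adj_diff, adj_se = diff_and_se(post_adj)
print(f"raw:   diff={raw_diff:.3f}, se={raw_se:.4f}")
print(f"CUPED: diff={adj_diff:.3f}, se={adj_se:.4f}")
```

The standard error shrinks roughly by a factor of sqrt(1 - ρ²), where ρ is the correlation between the pre-period covariate and the outcome, so the gain depends entirely on how predictive the covariate is.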

### Bayesian A/B (PyMC)
```python
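import pymc as pm
# n_a, n_b: trials per arm; k_a, k_b: conversions per arm (defined elsewhere)
with pm.Model() as m:
    # Uniform Beta(1, 1) priors on each arm's conversion rate
    p_a = pm.Beta('p_a', 1, 1)
    p_b = pm.Beta('p_b', 1, 1)
    pm.Binomial('obs_a', n=n_a, p=p_a, observed=k_a)
    pm.Binomial('obs_b', n=n_b, p=p_b, observed=k_b)
    # Relative lift of B over A
    pm.Deterministic('lift', (p_b - p_a) / p_a)
    idata = pm.sample(2000, tune=1000)
# Share of posterior draws in which B beats A
print(f"P(B>A) = {(idata.posterior['lift']>0).mean().item():.3f}")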
```

### Sequential testing (mSPRT)
```python
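import numpy as np
def msprt(x, y, sigma2_tau=0.01, alpha=0.05):
    """Mixture SPRT with a N(0, sigma2_tau) mixing prior on the mean difference."""
    n = min(len(x), len(y))
    delta = y[:n] - x[:n]          # paired differences, first n observations per arm
    s2 = delta.var(ddof=1)         # sample variance of the differences
    t = delta.mean() * np.sqrt(n)  # scaled mean difference, so t**2 = n * mean**2
    # Mixture likelihood ratio against H0: mean difference = 0
    lr = np.sqrt(s2/(s2+n*sigma2_tau)) * np.exp(
        n*sigma2_tau*t**2 / (2*s2*(s2+n*sigma2_tau)))
    return lr > 1/alpha            # reject H0 once the ratio exceeds 1/alpha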
```

### Causal — difference-in-differences (statsmodels)
```python
import statsmodels.formula.api as smf
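# treated and post are assumed to be 0/1 integer columns. Their main effects are
# absorbed by the unit and time fixed effects, so only the interaction enters.
m = smf.ols('y ~ treated:post + C(unit) + C(time)', data=df).fit(
    cov_type='cluster', cov_kwds={'groups': df['unit']})  # SEs clustered by unit
print(m.params['treated:post'])  # DiD estimate of the treatment effect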
```
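A quick way to sanity-check a two-way fixed-effects DiD specification is to run it on a synthetic panel where the true effect is known. The sketch below assumes a hypothetical panel of 40 units over 6 periods with a true effect of 2.0; all column names and parameters are illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_units, n_periods, true_effect = 40, 6, 2.0  # hypothetical panel

df = pd.DataFrame(
    [(u, t) for u in range(n_units) for t in range(n_periods)],
    columns=["unit", "time"],
)
df["treated"] = (df["unit"] < n_units // 2).astype(int)  # first half of units treated
df["post"] = (df["time"] >= n_periods // 2).astype(int)  # treatment starts at period 3

# Outcome = unit effect + common time trend + treatment effect + noise.
unit_effect = rng.normal(0.0, 1.0, n_units)[df["unit"].to_numpy()]
df["y"] = (unit_effect
           + 0.5 * df["time"]
           + true_effect * df["treated"] * df["post"]
           + rng.normal(0.0, 0.3, len(df)))

# Two-way fixed effects DiD with standard errors clustered by unit.
m = smf.ols("y ~ treated:post + C(unit) + C(time)", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["unit"]})
est = m.params["treated:post"]
print(f"DiD estimate = {est:.3f} (true effect = {true_effect})")
```

The estimate is only as good as the parallel-trends assumption, which holds here by construction; on real data it needs to be checked, for example by plotting pre-period trends for both groups.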

### Anomaly — robust z (MAD)
```python
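import numpy as np
def mad_z(x):
    # Robust z-score: median and MAD replace mean and standard deviation
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    # 0.6745 rescales MAD to match the standard deviation under normality;
    # 1e-9 guards against division by zero when MAD is 0
    return 0.6745 * (x - med) / (mad + 1e-9)
# 3.5 is a commonly used cutoff for the modified z-score
anomalies = np.abs(mad_z(latency_p99)) > 3.5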
```

## Decision criteria
| Situation | Method |
|---|---|
| 2-arm online experiment, fixed N | Welch t-test + CUPED |
| sequential / peeking risk | mSPRT or Bayesian |
| many arms, exploration value | Thompson sampling bandit |
| observational, treatment effect | DiD / IV / synthetic control |
| heavy-tailed (revenue) | Mann-Whitney + bootstrap CI |

**Defaults**: Welch + CUPED for online A/B; Bayesian for small-N or peeking; bootstrap for non-Gaussian.
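For the heavy-tailed case, a percentile bootstrap interval for the difference in means can be sketched as follows. The function name `bootstrap_diff_ci` and the lognormal "revenue" data are hypothetical, and the percentile method is one of several bootstrap variants.

```python
import numpy as np

def bootstrap_diff_ci(control, treatment, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for mean(treatment) - mean(control)."""
    rng = np.random.default_rng(seed)
    control = np.asarray(control, dtype=float)
    treatment = np.asarray(treatment, dtype=float)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        # Resample each arm independently, with replacement.
        c = rng.choice(control, size=control.size, replace=True)
        t = rng.choice(treatment, size=treatment.size, replace=True)
        diffs[b] = t.mean() - c.mean()
    tail = (1 - level) / 2
    lo, hi = np.quantile(diffs, [tail, 1 - tail])
    return lo, hi

# Hypothetical heavy-tailed revenue data.
rng = np.random.default_rng(1)
control = rng.lognormal(mean=3.0, sigma=1.0, size=2000)
treatment = rng.lognormal(mean=3.1, sigma=1.0, size=2000)
lo, hi = bootstrap_diff_ci(control, treatment)
print(f"95% bootstrap CI for the difference in means: ({lo:.2f}, {hi:.2f})")
```

Mann-Whitney tests for a shift in distribution rather than a difference in means, so the bootstrap interval is what gives the effect size on the revenue scale.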

## 🔗 Graph
- Parent: [[Probability Theory]]
- Variants: [[Bayesian Statistics]] · [[Causal Inference]]
- Applications: [[Anomaly Detection]] · [[ML Evaluation]]
- Adjacent: [[PyMC]]

## 🤖 LLM usage
**When**: experiment design review, interpreting p-values, choosing a test for a given distribution shape, generating PyMC models from descriptions.
**When not**: trusting LLM-computed p-values without verification; arithmetic mistakes are common.

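## ❌ Anti-patterns
- **Peeking**: checking a fixed-N test daily and stopping at the first significant result; the false positive rate can rise from the nominal 5% to 30% or more.
- **HARKing**: hypothesizing after the results are known.
- **p<0.05 worship**: ignoring effect size.
- **Ignoring multiple testing**: with 20 metrics at α = 0.05, about 1 false positive is expected even when nothing changed.
- **Using a post-treatment covariate in CUPED**: invalidates the adjustment; the covariate must be measured before exposure.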
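The peeking problem can be reproduced with a small A/A simulation, where no true difference exists and every "significant" result is a false positive. The function name, the look schedule (20 looks of 100 users each), and the simulation count are assumptions for the demo; the exact rates vary with the seed and the number of looks.

```python
import numpy as np
from scipy import stats

def false_positive_rates(n_sims=500, n_looks=20, n_per_look=100, alpha=0.05, seed=0):
    """Compare stop-at-first-significant-look with a single final look on A/A data."""
    rng = np.random.default_rng(seed)
    n_total = n_looks * n_per_look
    peeking_hits = 0
    fixed_hits = 0
    for _ in range(n_sims):
        # A/A test: both arms come from the same distribution.
        a = rng.normal(size=n_total)
        b = rng.normal(size=n_total)
        p_values = [
            stats.ttest_ind(a[:k], b[:k], equal_var=False).pvalue
            for k in range(n_per_look, n_total + 1, n_per_look)
        ]
        peeking_hits += int(min(p_values) < alpha)  # any look crossed the threshold
        fixed_hits += int(p_values[-1] < alpha)     # only the final look counts
    return peeking_hits / n_sims, fixed_hits / n_sims

peek_fpr, fixed_fpr = false_positive_rates()
print(f"false positive rate with peeking: {peek_fpr:.3f}")
print(f"false positive rate, fixed N:     {fixed_fpr:.3f}")
```

The fixed-N rate stays near the nominal α, while the peeking rate grows with the number of looks, which is the motivation for sequential procedures or Bayesian monitoring.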

## 🧪 Verification / Duplicates
- Verified (Microsoft CUPED paper 2013, Optimizely Stats Engine, Gelman BDA3, Wasserman All of Stats).
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — A/B + Bayesian + causal patterns |