---
id: wiki-2026-0508-inferential-statistics
title: Inferential Statistics
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Statistical Inference, Hypothesis Testing, Confidence Intervals]
duplicate_of: none
source_trust_level: A
confidence_score: 0.92
verification_status: applied
tags: [statistics, inference, hypothesis-testing, ab-testing, sre]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: scipy
---
# Inferential Statistics
## In one line
> **"Estimate population parameters from a sample and quantify the uncertainty."** The frequentist framework comes from Fisher, Neyman, and Pearson in the early 1900s. In 2026 it is the backbone of A/B testing, SRE alerting, and ML evaluation, and a modern hybrid of Bayesian and bootstrap methods is the default.
## Core concepts
### Frequentist vs Bayesian
- **Frequentist**: parameter fixed, data random. p-value, CI.
- **Bayesian**: parameter random (prior), data fixed. Posterior, credible interval.
- **Bootstrap**: distribution-free; resamples the observed data with replacement many times to simulate the sampling distribution.
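The contrast between a confidence interval and a credible interval can be sketched on a single proportion. This is a minimal illustration, not a recommended procedure: the counts (58 conversions out of 1,000) are hypothetical, the frequentist side uses the simple Wald approximation, and the Bayesian side assumes a flat Beta(1, 1) prior.

```python
import numpy as np
from scipy import stats

# Hypothetical data: 58 conversions out of 1,000 visitors.
successes, trials = 58, 1000
p_hat = successes / trials

# Frequentist: Wald 95% confidence interval (normal approximation).
z = stats.norm.ppf(0.975)
se = np.sqrt(p_hat * (1 - p_hat) / trials)
wald_ci = (p_hat - z * se, p_hat + z * se)

# Bayesian: a Beta(1, 1) prior updated with the data gives a
# Beta(1 + successes, 1 + failures) posterior.
posterior = stats.beta(1 + successes, 1 + trials - successes)
credible = posterior.ppf([0.025, 0.975])

print(f"Wald 95% CI:           [{wald_ci[0]:.4f}, {wald_ci[1]:.4f}]")
print(f"95% credible interval: [{credible[0]:.4f}, {credible[1]:.4f}]")
```

With a flat prior and this much data the two intervals are numerically close; the difference is in what each one is a statement about (the procedure versus the parameter).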
### Test types
- **Parametric**: t-test, ANOVA, Z-test (assumes normal).
- **Non-parametric**: Mann-Whitney U, Kruskal-Wallis, permutation.
- **Sequential**: Always Valid Inference, mSPRT (peek-safe).
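A small sketch of why the parametric versus non-parametric split matters. The latency values below are hypothetical, with one large outlier per group. The outliers inflate the variance that Welch's t-test relies on, while the rank-based Mann-Whitney U test is largely unaffected by them.

```python
from scipy import stats

# Hypothetical skewed response times (ms); one outlier in each group.
a = [110, 115, 120, 118, 112, 117, 900]
b = [150, 155, 149, 160, 158, 152, 1200]

# Parametric: Welch's t-test compares means and is sensitive to outliers.
t_res = stats.ttest_ind(a, b, equal_var=False)

# Non-parametric: Mann-Whitney U compares ranks.
u_res = stats.mannwhitneyu(a, b, alternative="two-sided")

print(f"Welch t-test   p = {t_res.pvalue:.3f}")
print(f"Mann-Whitney U = {u_res.statistic:.0f}, p = {u_res.pvalue:.3f}")
```

Here the t-test does not reach significance at 0.05 while the rank test does, because every value in `a` except the outlier sits below every value in `b`.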
### Applications
1. A/B test: measure conversion lift.
2. SRE: statistical significance of SLO breaches.
3. ML: holdout comparison of model A vs B.
## 💻 Patterns
### Two-sample t-test
```python
import scipy.stats as st
control = [12, 14, 11, 13, 12, 15, 13]
treat = [16, 18, 15, 17, 19, 16, 18]
res = st.ttest_ind(control, treat, equal_var=False)
print(f"t={res.statistic:.3f} p={res.pvalue:.4f}")
ci = res.confidence_interval(0.95)
print(f"95% CI: [{ci.low:.2f}, {ci.high:.2f}]")
```
### Bootstrap CI
```python
import numpy as np
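def bootstrap_mean_ci(x, n=10_000, alpha=0.05):
    """Percentile bootstrap CI for the mean of x."""
    rng = np.random.default_rng(42)
    # Draw n resamples of len(x) with replacement and take each resample's mean.
    boots = rng.choice(x, size=(n, len(x)), replace=True).mean(axis=1)
    return np.quantile(boots, [alpha/2, 1-alpha/2])

# `control` is the list defined in the two-sample t-test example above.
ci = bootstrap_mean_ci(np.array(control))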
print(f"Bootstrap 95% CI: {ci}")
```
### Sample size calculation (power)
```python
import numpy as np
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Solve for the per-group sample size needed to detect effect_size (Cohen's d).
n = analysis.solve_power(effect_size=0.3, power=0.8, alpha=0.05)
print(f"n per group = {int(np.ceil(n))}")
```
### Sequential test (mSPRT, peek-safe)
```python
import numpy as np
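def msprt_log_likelihood(x, mu0=0, sigma=1, theta=0.1):
    """Log mixture likelihood ratio for a normal mean with known variance."""
    n = len(x)
    xbar = np.mean(x)
    v = sigma**2      # known observation variance
    tau2 = theta**2   # variance of the normal mixing prior centered on mu0
    log_bf = 0.5*np.log(v/(v+n*tau2)) + (n**2 * (xbar-mu0)**2 * tau2) / (2*v*(v+n*tau2))
    return log_bf  # reject H0 when log_bf > log(1/alpha)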
```
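A sketch of how such a statistic might be used for continuous monitoring: recompute it after each new observation and stop the first time it crosses `log(1/alpha)`. The helper `first_rejection`, the function name `msprt_log_lr`, and the simulated stream (true mean 0.5, unit variance) are illustrative assumptions; the statistic itself is the same normal-mixture formula as in the block above.

```python
import numpy as np

def msprt_log_lr(x, mu0=0.0, sigma=1.0, tau=0.1):
    """Log mixture likelihood ratio for a normal mean with known variance."""
    n = len(x)
    xbar = np.mean(x)
    v, tau2 = sigma**2, tau**2
    return (0.5 * np.log(v / (v + n * tau2))
            + (n**2 * (xbar - mu0)**2 * tau2) / (2 * v * (v + n * tau2)))

def first_rejection(stream, alpha=0.05, **kwargs):
    """Hypothetical helper: index of the first observation at which H0 is rejected."""
    threshold = np.log(1 / alpha)
    for n in range(1, len(stream) + 1):
        if msprt_log_lr(stream[:n], **kwargs) > threshold:
            return n
    return None  # never crossed the threshold

# Simulated stream with a real shift away from mu0 = 0 (assumed for illustration).
rng = np.random.default_rng(7)
shifted = rng.normal(0.5, 1.0, size=2000)
stop = first_rejection(shifted)
print(f"Rejected H0 after {stop} observations")
```

Because the threshold is valid at every look, checking after each observation does not inflate the type I error the way repeated looks at a fixed-N test do.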
### Bayesian A/B (PyMC)
```python
import pymc as pm
with pm.Model() as m:
    # Flat Beta(1, 1) priors on each arm's conversion rate.
    p_a = pm.Beta("p_a", 1, 1)
    p_b = pm.Beta("p_b", 1, 1)
    # Observed conversions out of 10,000 visitors per arm.
    pm.Binomial("y_a", n=10_000, p=p_a, observed=520)
    pm.Binomial("y_b", n=10_000, p=p_b, observed=580)
    diff = pm.Deterministic("diff", p_b - p_a)
    idata = pm.sample(2000, chains=4, random_seed=42)
print(f"P(B > A) = {(idata.posterior['diff'] > 0).mean().item():.3f}")
```
### Permutation test
```python
import numpy as np

def permutation_test(a, b, n=10_000):
    """Two-sided permutation p-value for the difference in means."""
    diff_obs = np.mean(a) - np.mean(b)
    pool = np.concatenate([a, b])
    rng = np.random.default_rng(0)
    diffs = []
    for _ in range(n):
        rng.shuffle(pool)  # relabel the groups at random
        diffs.append(np.mean(pool[:len(a)]) - np.mean(pool[len(a):]))
    return np.mean(np.abs(diffs) >= abs(diff_obs))
```
### SRE: Welch's test on latency p99
```python
# Compare latency p99 before and after a deploy
import numpy as np
from scipy.stats import ttest_ind

before_p99 = np.array([124, 130, 128, 132, 125])  # ms
after_p99 = np.array([142, 138, 145, 140, 144])   # ms
t, p = ttest_ind(before_p99, after_p99, equal_var=False)
if p < 0.01:
    print("regression detected: rollback")
```
## Decision criteria
| Situation | Approach |
|---|---|
| Fixed-N A/B | t-test or chi-squared |
| Continuous monitoring | mSPRT or always-valid CI |
| Small N, non-normal | Bootstrap or permutation |
| Multi-arm + prior | Bayesian (Beta-Binomial) |
**Default**: bootstrap CI plus a sequential test for production A/B tests.
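The table lists chi-squared for fixed-N A/B tests. A minimal sketch of that row, reusing the conversion counts from the Bayesian A/B example (520 and 580 conversions out of 10,000 each); note that `chi2_contingency` applies Yates' continuity correction to 2x2 tables by default.

```python
from scipy.stats import chi2_contingency

# Rows: arm A, arm B. Columns: converted, not converted.
table = [
    [520, 10_000 - 520],
    [580, 10_000 - 580],
]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p:.4f}")
```

On these counts the two-sided p-value falls between 0.05 and 0.10, so a fixed-N test at α = 0.05 would not declare a difference.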
## 🔗 Graph
- Parent: [[Statistics & Data Analysis]] · [[Probability Theory]]
- Variants: [[Bayesian_Inference|Bayesian Inference]]
- Applications: [[SRE]] · [[Anomaly-Detection]]
- Adjacent: [[Type 1 vs Type 2 Errors]] · [[Power Analysis]]
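## 🤖 LLM usage
**When to use**: advice on test selection (data shape → test type) and on interpreting results.
**When not to use**: do not automate multiple-comparison correction; it requires domain knowledge.
## ❌ Anti-patterns
- **p-hacking**: running many tests, then cherry-picking the significant ones.
- **Peeking**: checking a fixed-N test every day inflates α.
- **Single point**: reporting only the mean without a CI.
- **Significance ≠ effect size**: with a very large N even a trivial difference becomes significant, so report Cohen's d as well.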
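The last point can be illustrated with simulated data. This is a sketch under assumed numbers: two groups of 500,000 draws whose true means differ by 0.3 on a standard deviation of 15. The helper `cohens_d` is written here for illustration using the pooled standard deviation.

```python
import numpy as np
from scipy import stats

def cohens_d(a, b):
    """Standardized mean difference (b minus a) using the pooled standard deviation."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (b.mean() - a.mean()) / np.sqrt(pooled_var)

# Simulated groups: tiny true difference, very large sample size.
rng = np.random.default_rng(0)
a = rng.normal(100.0, 15.0, size=500_000)
b = rng.normal(100.3, 15.0, size=500_000)

p_value = stats.ttest_ind(a, b, equal_var=False).pvalue
d = cohens_d(a, b)
print(f"p = {p_value:.2e}, Cohen's d = {d:.3f}")
```

The p-value is far below any usual threshold, yet the standardized effect is around 0.02, well under the 0.2 that is conventionally called small.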
## 🧪 Verification / Duplicates
- Verified (Casella & Berger "Statistical Inference", scipy/statsmodels docs).
- Trust level: A.
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — frequentist + Bayesian + sequential pattern |