"The probability of detecting an effect when one truly exists = 1 − β". Power is the true positive rate, a function of sample size, effect size, α, and variance. It originated with Neyman-Pearson (1928) and, as of 2026, remains central to the design of A/B tests, clinical trials, and every hypothesis test.
Core ideas
Four quantities (fix three, solve for the fourth)
n — sample size.
α — Type I error rate (false positive), 보통 0.05.
β — Type II error rate (false negative); power = 1 − β, 보통 0.8.
effect size — Cohen's d (means), Cohen's h (props), r (correlation), f² (regression).
Effect-size conventions (Cohen)
d = 0.2 (small), 0.5 (medium), 0.8 (large) — for two-sample t-test.
Define a domain-specific minimum detectable effect (MDE) first; do not fall back on Cohen's defaults.
Sample-size formula (two-sample t-test)
n_per_group ≈ 2·(z_{1-α/2} + z_{1-β})² · σ² / Δ²
With α=0.05, power=0.8, Δ=0.5σ (medium d), the formula gives n ≈ 63 per group; the exact t-based calculation rounds up to n ≈ 64 per group.
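As a quick check of the formula above, the following sketch evaluates it with Python's standard library only. The function name `n_per_group` and its parameters are illustrative and not part of any library. It uses the normal approximation, so it should land slightly below an exact t-based answer.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.8):
    """Normal-approximation sample size per group (two-sided, two-sample test)."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # z_{1-α/2}
    z_beta = z.inv_cdf(power)           # z_{1-β}
    return 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


n = n_per_group(delta=0.5, sigma=1.0)
print(round(n, 2), math.ceil(n))  # 62.79 63
```

Because n scales with σ²/Δ², halving the detectable difference roughly quadruples the required sample.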
Applications
A/B test design — pre-experiment sample-size calc.
Sequential testing — α-spending for early stopping.
💻 Patterns
1. statsmodels — t-test power
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    n = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8, ratio=1.0)
    # n ≈ 63.77 per group
2. Proportion (A/B test) sample size
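    from statsmodels.stats.proportion import proportion_effectsize
    from statsmodels.stats.power import NormalIndPower, TTestIndPower

    # target 12% vs. baseline 10% (MDE = 2 pp); this argument order keeps Cohen's h positive
    es = proportion_effectsize(0.12, 0.10)
    n = NormalIndPower().solve_power(effect_size=es, alpha=0.05, power=0.8)
    # a 2 pp conversion uplift needs ~3800 per arm

    # Leave effect_size unset to solve for it instead (t-test power object, as in pattern 1)
    analysis = TTestIndPower()
    mde = analysis.solve_power(nobs1=1000, alpha=0.05, power=0.8)
    # minimum detectable d at n = 1000 per group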
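Solving for the minimum detectable effect at a fixed n is the sample-size formula rearranged for Δ/σ. The sketch below does that rearrangement with the standard library; `mde_d` is a hypothetical helper name, and the normal approximation is assumed, so the result may differ slightly from a t-based solver.

```python
import math
from statistics import NormalDist


def mde_d(n_per_group, alpha=0.05, power=0.8):
    """Smallest standardized difference d detectable at the given n (normal approx.)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return z_sum * math.sqrt(2 / n_per_group)


d_min = mde_d(1000)
print(round(d_min, 3))  # 0.125
```

The MDE shrinks only with the square root of n: quadrupling the sample halves the detectable d.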
6. Bonferroni-corrected power (multiple tests)
    k = 10  # 10 hypotheses
    adjusted_alpha = 0.05 / k
    n_corrected = analysis.solve_power(effect_size=0.5, alpha=adjusted_alpha, power=0.8)
7. Sequential / group-sequential (early stopping)
    # Uses an O'Brien-Fleming spending function:
    # the cumulative α is spent in portions across interim looks → the final n increases slightly
    from statsmodels.stats.proportion import proportions_ztest  # for the fixed-sample test
    # Group-sequential: see `gsDesign` in R or `confseq` in Python (2026)
8. Cluster-randomized (intra-class correlation)
    # design effect = 1 + (m-1)·ρ
    # effective sample = n / DE
    icc = 0.05
    m = 30  # cluster size
    DE = 1 + (m - 1) * icc  # ≈ 2.45
    n_effective = 1000 / DE
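The same design effect can be applied in the other direction, inflating an individually randomized sample size to what a cluster trial would need. This is a minimal sketch; `inflate_for_clustering` is a hypothetical helper, equal cluster sizes are assumed, and the n = 64 input is only an example value.

```python
import math


def inflate_for_clustering(n_individual, cluster_size, icc):
    """Return (design effect, inflated n per arm, clusters per arm)."""
    design_effect = 1 + (cluster_size - 1) * icc
    n_clustered = math.ceil(n_individual * design_effect)
    clusters = math.ceil(n_clustered / cluster_size)  # whole clusters only
    return design_effect, n_clustered, clusters


de, n_arm, k_arm = inflate_for_clustering(n_individual=64, cluster_size=30, icc=0.05)
print(round(de, 2), n_arm, k_arm)  # 2.45 157 6
```

Even a small ICC matters when clusters are large, since the inflation grows linearly with cluster size.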
Decision criteria

| Situation | Approach |
|---|---|
| Mean comparison | TTestIndPower (Cohen's d) |
| Proportion (CTR, conversion) | NormalIndPower |
| Complex design / non-standard test | Simulation |
| Multiple testing | Bonferroni / FDR adjusted α |
| Early-stopping | Group-sequential / α-spending |
| Cluster trial | Inflate n by design effect |
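For the simulation route, power is estimated as the fraction of simulated experiments that reject the null. The sketch below is a deliberately simple instance, assuming normally distributed data with known σ = 1 and a two-sided z-test; `simulated_power` and its parameters are illustrative names. A real non-standard design would swap in its own data generator and test.

```python
import random
from statistics import NormalDist, fmean


def simulated_power(n, d, alpha=0.05, n_sims=2000, seed=0):
    """Monte Carlo power of a two-sided z-test with known sigma = 1."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = (2 / n) ** 0.5  # standard error of the difference in means
    rejections = 0
    for _ in range(n_sims):
        control = fmean(rng.gauss(0.0, 1.0) for _ in range(n))
        treated = fmean(rng.gauss(d, 1.0) for _ in range(n))
        if abs(treated - control) / se > z_crit:
            rejections += 1
    return rejections / n_sims


power = simulated_power(n=64, d=0.5)
print(power)  # roughly 0.8; the exact value depends on the seed and n_sims
```

Running the same function with d = 0 gives an estimate of the Type I error rate, which is a useful sanity check on the simulation itself.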
Default: α=0.05, power=0.8, a domain-driven MDE → statsmodels solve_power.
When to use: explaining tradeoffs (n vs MDE vs power), interpreting a power-analysis result, recommending a design.
When not to use: the numerical solve itself; use statsmodels or G*Power for that.
❌ Anti-patterns
Post-hoc power on the observed effect: it is always ≈ 50% when the result sits near the significance threshold, so it is meaningless. Report the MDE instead.
Underpowered + p-hack: small n + multiple comparisons → the false-positive rate explodes.
Ignoring the variance assumption: a wrong estimate of σ → a wrong n calculation.
Treating power=0.5 as acceptable: 0.8 is the minimum (0.9 in medical settings); 50% power is a coin flip.
One-sided test for convenience: fine when the effect direction is a genuine pre-specified hypothesis, not when it is fishing.
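The post-hoc power point can be checked numerically. The sketch below assumes a two-sided z-test and plugs the observed effect back in as if it were the true effect; `observed_power` is a hypothetical helper name. Under those assumptions the result depends only on the p-value.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def observed_power(p_value, alpha=0.05):
    """Post-hoc power implied by a two-sided p-value (z-test assumption)."""
    z_obs = Z.inv_cdf(1 - p_value / 2)  # |z| implied by the p-value
    z_crit = Z.inv_cdf(1 - alpha / 2)
    return Z.cdf(z_obs - z_crit) + Z.cdf(-z_obs - z_crit)


for p in (0.01, 0.05, 0.20):
    print(p, round(observed_power(p), 2))
# at p = 0.05 the observed power is 0.5
```

Since observed power is just a transformation of p, reporting it adds nothing that the p-value did not already say.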
🧪 Verification / Duplicates
Verified (Cohen 1988 Statistical Power Analysis, statsmodels docs, ICH E9 guideline).
Confidence: A.
🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: power formulas, statsmodels patterns, A/B test design. |