"매 spread = E[(X-μ)²] 의 sqrt". Variance 의 expected squared deviation, SD 의 unit-scale spread. 1894 Pearson 이 명명 — 매 모든 statistics 의 가장 fundamental dispersion measure.
매 핵심
매 정의
Variance: σ² = E[(X − μ)²] = E[X²] − (E[X])².
Standard deviation: σ = √Var(X) — 매 same unit as X.
Sample variance: s² = Σ(xᵢ − x̄)² / (n − 1) — 매 Bessel correction (n−1) 의 unbiased estimator.
Population variance: σ² = Σ(xᵢ − μ)² / N.
매 property
Var(aX + b) = a²·Var(X) — 매 shift invariant, scale 의 square.
# E[(y - ŷ)²] = Bias² + Variance + σ²_noise# 매 ML model selection 의 핵심
7. Robust alternatives — MAD
fromscipy.statsimportmedian_abs_deviationmad=median_abs_deviation(x,scale="normal")# ~1.4826 * raw MAD ≈ σ for Gaussian
8. Two-pass safe variance (numerically)
# Naive E[X²] - E[X]² → 매 catastrophic cancellation 의 위험 (large mean, small var)# 매 2-pass: μ first, then Σ(x-μ)²mean=x.sum()/nvar=((x-mean)**2).sum()/(n-1)
매 결정 기준
상황
Approach
Sample (inference)
ddof=1 (n−1 division)
Population (descriptive)
ddof=0
Streaming / online
Welford
Outlier-heavy data
MAD, IQR (not SD)
Heavy-tail distribution
Quantile-based (P95/P5)
Volatility (finance)
Rolling SD × √(periods/year)
기본값: np.std(x, ddof=1) — 매 sample SD, 매 unbiased point estimator.