id: wiki-2026-0508-ridge-regression
title: Ridge Regression
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: L2 Regularization, Tikhonov Regularization
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: machine-learning, regression, regularization, statistics
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: Python, framework: scikit-learn

Ridge Regression

In one line

"Stabilize OLS with an L2 penalty." Hoerl & Kennard (1970) introduced ridge regression as a fix for multicollinearity, and it is now a standard baseline regularizer in modern ML (sklearn Ridge, RidgeCV).

Key ideas

Loss function

  • OLS: min ||y - Xβ||²
  • Ridge: min ||y - Xβ||² + α||β||²
  • α (alpha) is the regularization strength: α=0 recovers OLS, and β→0 as α→∞.
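
The objective and the two limits of α can be checked numerically. The sketch below is illustrative only: the synthetic data, the helper name `ridge_loss`, and the chosen α values are assumptions, and `fit_intercept=False` is set so that sklearn's objective matches the formula above term for term.

```python
import numpy as np
from sklearn.linear_model import Ridge

def ridge_loss(X, y, beta, alpha):
    """Ridge objective: ||y - X beta||^2 + alpha * ||beta||^2."""
    residual = y - X @ beta
    return residual @ residual + alpha * (beta @ beta)

# Synthetic data (assumed for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)

# OLS reference solution
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Ridge solutions for a tiny, a moderate, and a huge alpha
betas = {
    a: Ridge(alpha=a, fit_intercept=False).fit(X, y).coef_
    for a in (1e-8, 1.0, 1e8)
}

for a, b in betas.items():
    print(f"alpha={a:g}  beta={np.round(b, 4)}  loss={ridge_loss(X, y, b, a):.4f}")
```

With a tiny α the coefficients should be indistinguishable from OLS, and with a huge α they should collapse toward zero.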

Closed form

  • β̂ = (XᵀX + αI)⁻¹ Xᵀy
  • XᵀX + αI is invertible for any α > 0, even when X is multicollinear.
  • In OLS, XᵀX can be singular, so (XᵀX)⁻¹ may not exist; the αI term in ridge fixes this.
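
A small numerical check of the invertibility claim, assuming a made-up design matrix in which one column is an exact multiple of another. The data and variable names are illustrative, not taken from any real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=20)
# Column 1 is exactly 2 * column 0, so X is perfectly multicollinear
X = np.column_stack([x1, 2.0 * x1, rng.normal(size=20)])
y = rng.normal(size=20)

gram = X.T @ X
rank_gram = np.linalg.matrix_rank(gram)  # less than 3: the OLS normal equations are singular

alpha = 1.0
ridge_gram = gram + alpha * np.eye(3)
# X^T X is positive semidefinite, so adding alpha * I lifts every eigenvalue by alpha
eig_min = np.linalg.eigvalsh(ridge_gram).min()

beta = np.linalg.solve(ridge_gram, X.T @ y)
print(rank_gram, eig_min, beta)
```

Because every eigenvalue of XᵀX is non-negative, the smallest eigenvalue of XᵀX + αI is at least α, which is why the solve succeeds here.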

Applications

  1. Multicollinear features (correlated predictors).
  2. p > n (more features than samples), e.g. gene expression, fMRI.
  3. Baseline model in sklearn pipelines.
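
For the p > n case, the sketch below uses random data (an assumption, standing in for something like gene-expression features) to show that ridge still has a unique solution. It also compares the p×p closed form with the algebraically equivalent n×n "dual" form β̂ = Xᵀ(XXᵀ + αI)⁻¹y, which is cheaper to solve when p is much larger than n.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 100                     # more features than samples
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
alpha = 1.0

# rank(X) <= n < p, so X^T X (same rank as X) is singular and OLS is not unique
rank_X = np.linalg.matrix_rank(X)

# Primal form: solve a p x p system
beta_primal = np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

# Dual form: solve an n x n system, then map back through X^T
beta_dual = X.T @ np.linalg.solve(X @ X.T + alpha * np.eye(n), y)

print(rank_X, np.max(np.abs(beta_primal - beta_dual)))
```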

💻 Patterns

Sklearn Ridge

from sklearn.linear_model import Ridge, RidgeCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Always scale before ridge — penalty is scale-dependent
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
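
To see why scaling matters, the following sketch fits the same data twice, once with a feature expressed in units 1000 times smaller. The synthetic data, the helper `fit_predict`, and alpha=100.0 are assumptions chosen to make the effect visible.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.normal(size=100)

# Same information, but feature 0 is multiplied by 1000 (e.g. metres -> millimetres)
X_rescaled = X.copy()
X_rescaled[:, 0] *= 1000.0

def fit_predict(model, X_fit):
    """Fit on (X_fit, y) and return in-sample predictions."""
    return model.fit(X_fit, y).predict(X_fit)

raw_a = fit_predict(Ridge(alpha=100.0), X)
raw_b = fit_predict(Ridge(alpha=100.0), X_rescaled)
scaled_a = fit_predict(make_pipeline(StandardScaler(), Ridge(alpha=100.0)), X)
scaled_b = fit_predict(make_pipeline(StandardScaler(), Ridge(alpha=100.0)), X_rescaled)

raw_gap = np.max(np.abs(raw_a - raw_b))           # predictions change with the units
scaled_gap = np.max(np.abs(scaled_a - scaled_b))  # predictions are unit-independent
print(raw_gap, scaled_gap)
```

Without the scaler, a change of units alters how strongly the feature is shrunk; with the scaler, the predictions should agree up to floating-point rounding.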

Cross-validated alpha

# RidgeCV picks best alpha from grid
ridge = RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0, 100.0], cv=5)
ridge.fit(X_train, y_train)
print(f"Best alpha: {ridge.alpha_}")

Closed-form by hand

import numpy as np

def ridge_fit(X, y, alpha):
    n, p = X.shape
    I = np.eye(p)
    return np.linalg.solve(X.T @ X + alpha * I, X.T @ y)

beta = ridge_fit(X_train, y_train, alpha=1.0)

SVD-based ridge (numerically stable)

def ridge_svd(X, y, alpha):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    d = s / (s**2 + alpha)
    return Vt.T @ (d * (U.T @ y))
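
As a sanity check, the SVD route and the normal-equations route should give the same coefficients on well-conditioned data. The sketch below redefines both functions so it is self-contained; the random data and the name `ridge_normal_equations` are assumptions. The SVD form avoids forming XᵀX, whose condition number is the square of that of X, and a single decomposition can be reused for many α values.

```python
import numpy as np

def ridge_normal_equations(X, y, alpha):
    """Ridge via (X^T X + alpha I) beta = X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

def ridge_svd(X, y, alpha):
    """Ridge via the thin SVD X = U diag(s) V^T."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    d = s / (s**2 + alpha)          # shrunken inverse singular values
    return Vt.T @ (d * (U.T @ y))

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = rng.normal(size=40)

beta_ne = ridge_normal_equations(X, y, alpha=1.0)
beta_svd = ridge_svd(X, y, alpha=1.0)
max_diff = np.max(np.abs(beta_ne - beta_svd))

# Reuse one SVD to get coefficients for several alphas
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Uty = U.T @ y
path = np.array([Vt.T @ (s / (s**2 + a) * Uty) for a in (0.1, 1.0, 10.0)])
print(max_diff, path.shape)
```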

Kernel Ridge

from sklearn.kernel_ridge import KernelRidge

# Non-linear ridge via kernel trick
krr = KernelRidge(alpha=1.0, kernel='rbf', gamma=0.1)
krr.fit(X_train, y_train)

Bayesian view

from sklearn.linear_model import BayesianRidge

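# Ridge is the MAP estimate under a zero-mean Gaussian prior on β (prior variance σ²/α)
br = BayesianRidge()
br.fit(X_train, y_train)
# sklearn naming: alpha_ is the noise precision, lambda_ the weight precision;
# the equivalent ridge penalty is lambda_ / alpha_
print(br.coef_, br.lambda_ / br.alpha_)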
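
The link between the Bayesian and penalized views can be checked directly. In sklearn's `BayesianRidge`, `alpha_` is the estimated noise precision and `lambda_` the estimated weight precision, so the matching ridge penalty is `lambda_ / alpha_`. The sketch below uses synthetic data (an assumption) and compares the two fits.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = X @ np.array([1.5, -2.0, 0.0, 0.7]) + 0.5 * rng.normal(size=100)

br = BayesianRidge().fit(X, y)

# Ratio of weight precision to noise precision = equivalent ridge penalty
effective_alpha = br.lambda_ / br.alpha_
rr = Ridge(alpha=effective_alpha).fit(X, y)
coef_gap = np.max(np.abs(br.coef_ - rr.coef_))

# The Bayesian model additionally returns a predictive standard deviation
y_mean, y_std = br.predict(X[:5], return_std=True)
print(effective_alpha, coef_gap, y_std)
```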

Regularization path

import numpy as np
from sklearn.linear_model import Ridge

alphas = np.logspace(-3, 3, 50)
coefs = []
for a in alphas:
    r = Ridge(alpha=a).fit(X, y)
    coefs.append(r.coef_)
# Plot coefs vs log(alpha) — see shrinkage

Decision criteria

| Situation | Approach |
| --- | --- |
| Multicollinear features | Ridge |
| Need feature selection | Lasso (L1) |
| Mix of sparsity + grouping | Elastic Net |
| Non-linear pattern | Kernel Ridge |
| Bayesian uncertainty | BayesianRidge |

Default: RidgeCV with a log-spaced alpha grid plus StandardScaler.
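
A minimal sketch of that default, assuming synthetic data from `make_regression` and an arbitrary 13-point grid between 1e-3 and 1e3. The grid bounds and the dataset are illustrative choices, not recommendations from the source.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression problem (assumed for illustration)
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Log-spaced grid: equal steps in log10(alpha)
alphas = np.logspace(-3, 3, 13)

# The scaler is fit inside the pipeline, on the training data only
model = make_pipeline(StandardScaler(), RidgeCV(alphas=alphas, cv=5))
model.fit(X_train, y_train)

best_alpha = model.named_steps["ridgecv"].alpha_
test_r2 = model.score(X_test, y_test)
print(f"best alpha: {best_alpha:g}, test R^2: {test_r2:.3f}")
```

If the selected alpha sits at either end of the grid, widening the grid is usually worth a second run.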

🔗 Graph

🤖 LLM usage

When to use: as a strong baseline for tabular regression; with multicollinear features (correlated predictors); in high-dimensional p > n settings; when linear-model interpretability must be kept.
When not to use: when sparse feature selection is needed (use Lasso); with strong non-linearity (use trees or neural networks); when n >> p and the features are uncorrelated, where OLS is sufficient.

Anti-patterns

  • No scaling: the ridge penalty is scale-sensitive, so features on different scales are shrunk unevenly.
  • Manual alpha pick: use RidgeCV instead of a magic number such as alpha=1.0.
  • Ridge for sparsity: L2 does not drive coefficients exactly to zero; use Lasso for that.
  • Ignoring the intercept: sklearn does not regularize the intercept by default, but a custom implementation must handle this explicitly.
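
The intercept point can be made concrete. One common way to leave the intercept unpenalized in a hand-written solver is to center X and y, solve the ridge system on the centered data, and recover the intercept afterwards. The helper name `ridge_with_intercept` and the data below are hypothetical; the comparison against `Ridge` is only a consistency check.

```python
import numpy as np
from sklearn.linear_model import Ridge

def ridge_with_intercept(X, y, alpha):
    """Closed-form ridge with an unpenalized intercept (hypothetical helper)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean            # centering removes the intercept
    p = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(p), Xc.T @ yc)
    intercept = y_mean - x_mean @ beta         # recover it from the means
    return beta, intercept

# Features with a non-zero mean and a large true intercept
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, size=(80, 3))
y = 10.0 + X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=80)

beta, intercept = ridge_with_intercept(X, y, alpha=1.0)
ref = Ridge(alpha=1.0).fit(X, y)               # fit_intercept=True by default
print(beta, intercept, ref.coef_, ref.intercept_)
```

Appending a column of ones to X and penalizing all coefficients equally would instead shrink the intercept toward zero, which is rarely what is wanted.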

🧪 Verification / duplicates

  • Verified (Hoerl & Kennard 1970, ESL Ch.3, sklearn docs).
  • Trust level: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: ridge regression with closed-form, SVD, kernel, and Bayesian variants |