Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization

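Large-scale cleanup of 10_Wiki/Topics:
- Removed 227 error-capture and unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirect)
- Normalized link names: fixed broken links, pointed links directly at redirect targets, and mapped concepts (~2,400 changes)
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections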

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-20 23:52:15 +09:00

Ridge Regression

In one line

"An L2 penalty that stabilizes OLS." Hoerl & Kennard (1970) introduced it as a fix for multicollinearity; it is now a baseline regularizer in modern ML (sklearn Ridge, RidgeCV).

Core ideas

Loss function

OLS: min ||y - Xβ||²
Ridge: min ||y - Xβ||² + α||β||²
α (alpha) → regularization strength. α=0 → OLS. α→∞ → β→0.
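
The two limits stated above (α = 0 recovers OLS, α → ∞ drives β to 0) can be checked numerically. The following is a minimal sketch on synthetic data; the helpers `ridge_loss` and `ridge_solution` and the data values are illustrative assumptions, and the solver uses the closed form given in the next subsection.

```python
import numpy as np

def ridge_loss(X, y, beta, alpha):
    """Squared residual norm plus alpha times the squared L2 norm of beta."""
    residual = y - X @ beta
    return residual @ residual + alpha * (beta @ beta)

def ridge_solution(X, y, alpha):
    """Minimizer of ridge_loss (no intercept), via the normal equations."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

# Synthetic, well-conditioned data (hypothetical example values)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_zero = ridge_solution(X, y, alpha=0.0)   # alpha = 0: same as OLS
beta_one = ridge_solution(X, y, alpha=1.0)    # moderate shrinkage
beta_huge = ridge_solution(X, y, alpha=1e8)   # very large alpha: beta near 0

print(np.linalg.norm(beta_ols), np.linalg.norm(beta_one), np.linalg.norm(beta_huge))
```

The printed norms should decrease from left to right, which is the shrinkage the penalty term is meant to produce.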

Closed form

β̂ = (XᵀX + αI)⁻¹ Xᵀy
XᵀX + αI is invertible for any α > 0, even when X is multicollinear.
In OLS, XᵀX can be singular, so (XᵀX)⁻¹ may not exist; the αI term fixes this.
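
The invertibility claim can be illustrated with an extreme case: two perfectly collinear columns. This is a small sketch on made-up data (the column construction and α = 0.1 are assumptions chosen for illustration), comparing the condition number of XᵀX with that of XᵀX + αI.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=20)
X = np.column_stack([x1, 2.0 * x1])       # second column is an exact multiple of the first
y = 3.0 * x1 + 0.05 * rng.normal(size=20)

gram = X.T @ X
alpha = 0.1
gram_ridge = gram + alpha * np.eye(2)

cond_ols = np.linalg.cond(gram)           # huge (or inf): the Gram matrix is singular
cond_ridge = np.linalg.cond(gram_ridge)   # moderate: adding alpha * I makes it invertible

beta = np.linalg.solve(gram_ridge, X.T @ y)
print(cond_ols, cond_ridge, beta)
```

Ridge cannot tell the two columns apart, so it splits the effect between them; only the combination beta[0] + 2 * beta[1] is determined by the data and should land near 3.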

Applications

Multicollinear features (correlated predictors).
p > n (more features than samples): gene expression, fMRI.
Baseline model in sklearn pipelines.

💻 Patterns

Sklearn Ridge

from sklearn.linear_model import Ridge, RidgeCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Always scale before ridge — penalty is scale-dependent
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model.fit(X_train, y_train)
print(model.score(X_test, y_test))

Cross-validated alpha

# RidgeCV picks best alpha from grid
ridge = RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0, 100.0], cv=5)
ridge.fit(X_train, y_train)
print(f"Best alpha: {ridge.alpha_}")

Closed-form by hand

import numpy as np

def ridge_fit(X, y, alpha):
    n, p = X.shape
    I = np.eye(p)
    return np.linalg.solve(X.T @ X + alpha * I, X.T @ y)

beta = ridge_fit(X_train, y_train, alpha=1.0)

SVD-based ridge (numerically stable)

def ridge_svd(X, y, alpha):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    d = s / (s**2 + alpha)
    return Vt.T @ (d * (U.T @ y))
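
Substituting X = UΣVᵀ into the closed form gives β̂ = V diag(sᵢ / (sᵢ² + α)) Uᵀy, which is what the SVD version computes. As a sanity check, the sketch below restates both functions so it runs on its own and compares them on random data (the data and α = 1.0 are arbitrary assumptions).

```python
import numpy as np

def ridge_fit(X, y, alpha):
    # Normal-equations form: (XᵀX + αI) β = Xᵀy
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

def ridge_svd(X, y, alpha):
    # SVD form: β = V diag(s / (s² + α)) Uᵀ y
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    d = s / (s**2 + alpha)
    return Vt.T @ (d * (U.T @ y))

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 5))
y = rng.normal(size=40)

beta_normal = ridge_fit(X, y, alpha=1.0)
beta_svd = ridge_svd(X, y, alpha=1.0)
max_diff = np.max(np.abs(beta_normal - beta_svd))
print(max_diff)  # expected to be at floating-point rounding level
```

The SVD route avoids forming XᵀX, which squares the condition number of X; that is the usual argument for preferring it when X is badly conditioned.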

Kernel Ridge

from sklearn.kernel_ridge import KernelRidge

# Non-linear ridge via kernel trick
krr = KernelRidge(alpha=1.0, kernel='rbf', gamma=0.1)
krr.fit(X_train, y_train)
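
To see how kernel ridge relates to plain ridge, it may help to note that with a linear kernel the two give the same predictions, provided ridge is fit without an intercept (KernelRidge does not fit one). The sketch below checks this on random p > n data; the shapes and α are illustrative assumptions.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 60))             # p > n
y = rng.normal(size=30)
X_new = rng.normal(size=(5, 60))

alpha = 1.0
# Dual form: predictions are K(x, X) (K + αI)⁻¹ y with K = X Xᵀ
krr_linear = KernelRidge(alpha=alpha, kernel="linear").fit(X, y)
# Primal form: β = (XᵀX + αI)⁻¹ Xᵀ y, no intercept to match KernelRidge
ridge_no_intercept = Ridge(alpha=alpha, fit_intercept=False).fit(X, y)

pred_kernel = krr_linear.predict(X_new)
pred_primal = ridge_no_intercept.predict(X_new)
print(np.max(np.abs(pred_kernel - pred_primal)))
```

The dual form solves an n × n system instead of a p × p one, which is why it is attractive when p is much larger than n.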

Bayesian view

from sklearn.linear_model import BayesianRidge

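# Ridge corresponds to a zero-mean Gaussian prior on β with variance σ²/α (σ² = noise variance)
br = BayesianRidge()
br.fit(X_train, y_train)
# alpha_ is the learned noise precision and lambda_ the learned weight precision;
# their ratio lambda_ / alpha_ plays the role of the ridge penalty α
print(br.coef_, br.lambda_ / br.alpha_)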
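
In BayesianRidge, `alpha_` is the estimated noise precision and `lambda_` the estimated weight precision, so the quantity comparable to the ridge penalty is the ratio `lambda_ / alpha_`. The following sketch, on synthetic data with assumed coefficients, refits a plain Ridge with that ratio and compares the coefficients; it is a consistency check under these assumptions, not a statement about how either estimator is implemented.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge, Ridge

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 4))
y = X @ np.array([1.5, 0.0, -2.0, 0.7]) + 0.5 * rng.normal(size=100)

br = BayesianRidge().fit(X, y)
effective_alpha = br.lambda_ / br.alpha_   # weight precision / noise precision

# Plain ridge with the penalty implied by the learned precisions
ridge = Ridge(alpha=effective_alpha).fit(X, y)
print(effective_alpha, np.max(np.abs(ridge.coef_ - br.coef_)))
```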

Regularization path

import numpy as np
from sklearn.linear_model import Ridge

alphas = np.logspace(-3, 3, 50)
coefs = []
for a in alphas:
    r = Ridge(alpha=a).fit(X, y)
    coefs.append(r.coef_)
# Plot coefs vs log(alpha) — see shrinkage
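
Without a plot, the shrinkage along the path can also be summarized by the L2 norm of the coefficient vector, which should fall as alpha grows. This is a self-contained sketch on synthetic data (the true coefficients are made up for illustration).

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(5)
X = rng.normal(size=(80, 6))
y = X @ np.array([3.0, -2.0, 1.0, 0.0, 0.5, -1.5]) + rng.normal(size=80)

alphas = np.logspace(-3, 3, 50)
# One coefficient vector per alpha, stacked into shape (50, 6)
coefs = np.array([Ridge(alpha=a).fit(X, y).coef_ for a in alphas])
norms = np.linalg.norm(coefs, axis=1)     # L2 norm of the coefficients at each alpha

print(norms[0], norms[-1])                # smallest alpha vs largest alpha
```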

Decision criteria

Situation	Approach
Multicollinear features	Ridge
Need feature selection	Lasso (L1)
Mix of sparsity + grouping	Elastic Net
Non-linear pattern	Kernel Ridge
Bayesian uncertainty	BayesianRidge

Default: RidgeCV with a log-spaced alpha grid plus StandardScaler.
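
One way the default recipe could look end to end is sketched below. The dataset comes from `make_regression` purely for illustration, and the grid bounds (1e-3 to 1e3, 13 points) are an assumption rather than a recommendation from the source.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

alphas = np.logspace(-3, 3, 13)           # log-spaced grid from 1e-3 to 1e3
model = make_pipeline(StandardScaler(), RidgeCV(alphas=alphas))
model.fit(X_train, y_train)               # scaler is fit on the training split only

best_alpha = model.named_steps["ridgecv"].alpha_
test_r2 = model.score(X_test, y_test)
print(best_alpha, test_r2)
```

Keeping the scaler inside the pipeline means the scaling statistics come from the training split only, so nothing from the test split leaks into the fit.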

🔗 Graph

Parents: Linear-Regression · L1-and-L2-Regularization
Variants: Elastic-Net
Applications: Feature Engineering · Bias vs Variance
Adjacent: Singular-Value-Decomposition

🤖 LLM usage

When to use: As a strong baseline for tabular regression. With multicollinear features (correlated predictors). In high-dimensional p > n settings. When linear-model interpretability should be kept.
When not to use: When sparse feature selection is needed (use Lasso). With strong non-linearity (use trees or neural networks). When n >> p and the features are uncorrelated, plain OLS is enough.

❌ Anti-patterns

No scaling: The ridge penalty is scale-sensitive, so features on different scales are shrunk unevenly.
Manual alpha pick: Use RidgeCV rather than a magic number such as alpha=1.0.
Ridge for sparsity: L2 does not drive coefficients exactly to zero; use Lasso.
Ignoring intercept: sklearn does not regularize the intercept by default, but a custom implementation must handle this itself.
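
For the intercept point, a custom implementation can keep the intercept unpenalized by centering X and y before solving and recovering the intercept afterwards. The helper `ridge_fit_with_intercept` below is a hypothetical sketch of that approach, compared against sklearn's Ridge on synthetic data with a deliberately large offset.

```python
import numpy as np
from sklearn.linear_model import Ridge

def ridge_fit_with_intercept(X, y, alpha):
    """Hypothetical helper: ridge with an unpenalized intercept via centering."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc, yc = X - x_mean, y - y_mean       # centering removes the intercept from the problem
    p = X.shape[1]
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(p), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef    # recovered afterwards, never penalized
    return coef, intercept

rng = np.random.default_rng(6)
X = rng.normal(loc=5.0, size=(60, 3))     # features with a non-zero mean
y = 10.0 + X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=60)

coef, intercept = ridge_fit_with_intercept(X, y, alpha=1.0)
reference = Ridge(alpha=1.0).fit(X, y)    # sklearn leaves the intercept unpenalized
print(coef, intercept)
```

Appending a column of ones to X and penalizing every coefficient would instead shrink the intercept toward zero, which biases predictions whenever y has a non-zero mean.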

🧪 Verification / Duplicates

Verified (Hoerl & Kennard 1970, ESL Ch.3, sklearn docs).
Confidence: A.

🕓 Changelog

Date	Change
2026-05-08	Phase 1
2026-05-10	Manual cleanup — Ridge regression with closed-form, SVD, kernel, Bayesian variants
