Principal Component Analysis

One-line summary

"Orthogonal axes of maximum variance: the eigendecomposition of the covariance matrix, equivalent to the SVD of the centered data." Pearson (1901) and Hotelling (1933) laid the statistical foundation. As of 2026, PCA remains the default linear dimensionality-reduction baseline, even though t-SNE and UMAP are usually preferred for visualization. Note: the correct spelling is "Principal" (not "Principle"); the misspelling is kept as an alias for findability.

Core concepts

Mathematical definition

Center data: X_c = X - mean(X).
Covariance: C = X_c^T X_c / (n-1).
Eigendecompose C = V Λ V^T; columns of V are principal axes.
Project: Z = X_c V_k (top k components).
Equivalent: the SVD X_c = U Σ V^T yields the same V (up to sign); the singular values satisfy σ_i = sqrt((n-1) λ_i), i.e. λ_i = σ_i^2 / (n-1).
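
The equivalence of the two routes above can be checked numerically. The following is a minimal NumPy sketch on synthetic data (the random matrix, seed, and variable names are illustrative assumptions, not part of the original note); it compares the eigenvalues of the covariance with the squared singular values and checks that the axes agree up to sign.

```python
import numpy as np

rng = np.random.default_rng(0)                      # synthetic data, illustration only
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
n = X.shape[0]

Xc = X - X.mean(axis=0)                             # center: X_c = X - mean(X)
C = Xc.T @ Xc / (n - 1)                             # covariance

# Route 1: eigendecomposition (eigh returns ascending eigenvalues, so reverse)
eigvals, V = np.linalg.eigh(C)
eigvals, V = eigvals[::-1], V[:, ::-1]

# Route 2: SVD of the centered data
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
lam_from_svd = s**2 / (n - 1)                       # lambda_i = sigma_i^2 / (n - 1)

# Axes agree up to sign: |V^T V_svd| should be the identity
axes_match = np.allclose(np.abs(V.T @ Vt.T), np.eye(5), atol=1e-6)

# Projection onto the top k components
k = 2
Z = Xc @ V[:, :k]
```

In practice the SVD route is usually preferred because it avoids forming X_c^T X_c explicitly.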

Properties

Orthogonal: principal axes are orthonormal, and the projected scores are uncorrelated.
Variance-ordered: first PC explains most variance.
Linear: cannot capture curved manifolds (use kernel PCA / UMAP).
Rotation-equivariant: rotating the input axes rotates the components with them; the explained variances do not change.
Scale-sensitive: standardize features first if scales differ.
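
Scale sensitivity is the property that most often causes trouble in practice. The sketch below uses two synthetic, independent features, one of them in much larger units (the data and the helper name `first_axis` are hypothetical), and compares the first principal axis before and after standardization.

```python
import numpy as np

rng = np.random.default_rng(1)
# Two independent features; the second is measured in much larger units.
X = np.column_stack([rng.normal(size=500), 1000.0 * rng.normal(size=500)])

def first_axis(X):
    """Return the first principal axis (unit vector) of X."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]

raw_axis = first_axis(X)                                  # dominated by the large-scale feature
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)      # z-score each feature
std_axis = first_axis(X_std)                              # both features weighted equally

print("raw:", np.round(np.abs(raw_axis), 3))
print("standardized:", np.round(np.abs(std_axis), 3))
```

Without standardization the first axis is almost entirely the large-scale feature; after standardization the two features contribute equally.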

Variants

Kernel PCA: nonlinear via kernel trick (RBF, polynomial).
Sparse PCA: L1-regularized loadings for interpretability.
Robust PCA: low-rank + sparse decomposition for outliers.
Probabilistic PCA: latent Gaussian model, which gives a maximum-likelihood objective.
Incremental / online PCA: streaming data.
Randomized SVD: roughly O(n d k) for the top k components, versus O(n d^2) for a full SVD (n ≥ d).
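
Of the variants above, probabilistic PCA has a closed-form maximum-likelihood solution (Tipping and Bishop): the noise variance is the mean of the discarded eigenvalues, and the loading matrix is the top-k eigenvectors scaled by sqrt(λ_i - σ²). The following is a minimal sketch on synthetic data; the function name `ppca_mle` is hypothetical, and it uses the 1/n covariance of the maximum-likelihood derivation rather than the 1/(n-1) convention used elsewhere in this note.

```python
import numpy as np

def ppca_mle(X, k):
    """Closed-form ML estimates (W, sigma2) for probabilistic PCA, requires k < d."""
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                               # ML covariance (1/n, not 1/(n-1))
    eigvals, V = np.linalg.eigh(S)
    eigvals, V = eigvals[::-1], V[:, ::-1]          # sort descending
    sigma2 = eigvals[k:].mean()                     # noise variance: mean discarded eigenvalue
    W = V[:, :k] * np.sqrt(eigvals[:k] - sigma2)    # loadings, up to an arbitrary rotation
    return W, sigma2

rng = np.random.default_rng(2)                      # synthetic data, illustration only
X = rng.normal(size=(300, 6)) @ rng.normal(size=(6, 6))
W, sigma2 = ppca_mle(X, k=2)

# Model covariance implied by the fit: W W^T + sigma2 * I
model_cov = W @ W.T + sigma2 * np.eye(6)
```

The implied model covariance keeps the top k sample eigenvalues exactly and replaces the rest with their average, which is what makes PPCA a proper density model rather than only a projection.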

Modern usage (2026)

Embeddings analysis: PCA on Claude / GPT-5 hidden states for mechanistic interpretability.
Whitening: preconditioning step before clustering, ICA, or neural net training.
Compression: still used in image / signal pipelines.
Data viz: PCA → 50D, then UMAP/t-SNE → 2D (the standard combo).

💻 Patterns

scikit-learn PCA

from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
import numpy as np

X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=0.95)  # keep 95% variance
Z = pca.fit_transform(X_std)
print(f"#components for 95% var: {pca.n_components_}")
print(f"explained variance ratio: {pca.explained_variance_ratio_}")

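Manual PCA via SVD (numerically preferable to forming the covariance)

def pca_svd(X, k):
    """Return scores, principal axes (rows), and explained variances for the top k components."""
    Xc = X - X.mean(axis=0)                        # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                            # rows are the principal axes
    explained_var = (s[:k] ** 2) / (X.shape[0] - 1)
    Z = Xc @ components.T                          # project onto the top-k axes
    return Z, components, explained_var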

Randomized SVD (fast for huge matrices)

from sklearn.utils.extmath import randomized_svd
# X_centered: data with the column means already subtracted
U, s, Vt = randomized_svd(X_centered, n_components=50, random_state=42)
# approximate result; can be far faster than a full SVD when k << min(n, d)

Kernel PCA (nonlinear)

from sklearn.decomposition import KernelPCA
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=0.1)
Z = kpca.fit_transform(X)

Incremental PCA (streaming)

from sklearn.decomposition import IncrementalPCA
ipca = IncrementalPCA(n_components=50, batch_size=1024)
for batch in stream:
    ipca.partial_fit(batch)
Z = ipca.transform(X_test)

Whitening before downstream model

pca = PCA(whiten=True).fit(X_train)
X_train_w = pca.transform(X_train)
X_test_w  = pca.transform(X_test)
# on the training set, features now have unit variance and zero correlation
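
What the whitening pattern above does can be written out by hand: project onto the principal axes, then divide each score by its standard deviation. This is a minimal NumPy sketch on synthetic data (the data, seed, and the helper name `whiten` are illustrative assumptions); it fits the transform on training data only and checks that the whitened covariance is the identity.

```python
import numpy as np

rng = np.random.default_rng(3)                       # synthetic data, illustration only
X_train = rng.normal(size=(400, 4)) @ rng.normal(size=(4, 4))

# Fit: mean, principal axes, and per-component standard deviations from training data
mean = X_train.mean(axis=0)
U, s, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
scale = s / np.sqrt(X_train.shape[0] - 1)            # std dev of each component's scores

def whiten(X):
    """Apply the training-set whitening transform to any data with the same features."""
    return (X - mean) @ Vt.T / scale

cov_w = np.cov(whiten(X_train), rowvar=False)        # should be close to the identity
```

The same `whiten` function is then applied unchanged to held-out data, where the covariance will be only approximately the identity.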

PCA for interpreting transformer hidden states

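import torch
from sklearn.decomposition import PCA
# model and prompts are placeholders; encode() is assumed to return a (B, D) tensor
hidden = model.encode(prompts)  # (B, D=4096)
pca = PCA(n_components=8)
# detach and cast to float32 so the tensor converts cleanly to NumPy
Z = pca.fit_transform(hidden.detach().float().cpu().numpy())
# Top component often correlates with sentiment / topic / refusal.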

Reconstruction error (anomaly detection)

pca = PCA(n_components=10).fit(X_train)
recon = pca.inverse_transform(pca.transform(X))
err = ((X - recon) ** 2).sum(axis=1)  # squared reconstruction error per sample
anomalies = err > np.percentile(err, 99)  # flags the top 1% of errors by construction

Choosing k via cumulative explained variance (elbow)

import matplotlib.pyplot as plt
pca_full = PCA().fit(X_std)
plt.plot(np.cumsum(pca_full.explained_variance_ratio_))
plt.axhline(0.95, ls="--")  # 95% variance threshold
plt.xlabel("# components")
plt.ylabel("cumulative explained variance")
plt.show()
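
The same choice can be made without a plot by finding the smallest k whose cumulative explained-variance ratio reaches the threshold. The sketch below is a plain NumPy version; the function name `components_for_variance` and the toy data are hypothetical, and its tie-breaking at the exact threshold may differ slightly from scikit-learn's.

```python
import numpy as np

def components_for_variance(X, threshold=0.95):
    """Smallest k whose cumulative explained-variance ratio is >= threshold."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)          # singular values only
    ratio = s**2 / np.sum(s**2)                      # explained-variance ratio per component
    cumulative = np.cumsum(ratio)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(s)), cumulative                # guard against round-off at threshold=1.0

# Toy data: variance is split 90% / 10% between two orthogonal directions
X_toy = np.array([[3.0, 0.0], [-3.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
k95, cum = components_for_variance(X_toy, 0.95)
k80, _ = components_for_variance(X_toy, 0.80)
```

With the toy data, one component is enough for an 80% target but both are needed for 95%.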

Decision criteria

Situation	Approach
Linear dim-reduction baseline	PCA
Visualization to 2D	PCA→50D → UMAP→2D
Nonlinear manifold	Kernel PCA / UMAP / autoencoder
Streaming / huge data	IncrementalPCA / randomized SVD
Need interpretable loadings	Sparse PCA
Outliers in data	Robust PCA
Probabilistic / missing data	Probabilistic PCA / EM-PCA

Default: StandardScaler → PCA(n_components=0.95) → downstream model.
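
The default chain can be expressed as a single scikit-learn pipeline so that the scaler and PCA are fitted on training data only. This is a minimal sketch under stated assumptions: the synthetic dataset from `make_classification` and the choice of `LogisticRegression` as the downstream model are illustrative, not prescribed by this note.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for real data
X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = make_pipeline(
    StandardScaler(),                    # put features on a common scale
    PCA(n_components=0.95),              # keep enough components for 95% of the variance
    LogisticRegression(max_iter=1000),   # placeholder downstream model
)
pipe.fit(X_train, y_train)               # scaler and PCA see the training split only

k = pipe.named_steps["pca"].n_components_
accuracy = pipe.score(X_test, y_test)
print(f"kept {k} components, test accuracy {accuracy:.3f}")
```

Wrapping the steps in a pipeline also keeps cross-validation honest, since each fold refits the scaler and PCA on its own training portion.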

🔗 Graph

Parent: Linear-Algebra-Foundations · Dimensionality-Reduction
Applications: Feature Engineering · Anomaly-Detection · Mechanistic-Interpretability
Adjacent: SVD · ICA · Factor-Analysis · Autoencoder · UMAP

🤖 LLM usage

When to use: linear dimensionality reduction, whitening, denoising, hidden-state analysis, a baseline before an ML model.
When not to use: nonlinear manifolds (use UMAP / autoencoder), categorical-only data (use MCA), cases where the original interpretable features must be kept (use feature selection).

❌ Anti-patterns

No standardization: features with large scale dominate components.
PCA on data that includes the labels: leaks the target when the components feed a supervised pipeline.
Reading PC1 as "the cause": components are statistical, not causal.
PCA → tree models: GBDTs split on individual features and rarely benefit from rotation; it mostly costs interpretability.
Forgetting sign ambiguity: V and -V both valid; component direction is arbitrary.
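
Sign ambiguity matters when components are compared across runs or libraries. One common workaround is to fix a convention, for example flipping each component so that its largest-magnitude loading is positive. The helper below is a hypothetical sketch of that convention, not a library API.

```python
import numpy as np

def fix_signs(components):
    """Flip each row (component) so its largest-magnitude loading is positive."""
    components = np.asarray(components, dtype=float)
    rows = np.arange(components.shape[0])
    idx = np.argmax(np.abs(components), axis=1)      # position of the largest |loading| per row
    signs = np.sign(components[rows, idx])
    signs[signs == 0] = 1.0                          # leave all-zero rows untouched
    return components * signs[:, None]

V = np.array([[0.6, -0.8],
              [-0.8, -0.6]])
V_fixed = fix_signs(V)                               # same result whether we start from V or -V
```

If the components are flipped, the corresponding score columns must be multiplied by the same signs to stay consistent.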

🧪 Verification / duplicates

Verified (Pearson 1901, Hotelling 1933, Jolliffe 2002 textbook, sklearn docs).
Confidence: A.

🕓 Changelog

Date	Change
2026-05-08	Phase 1
2026-05-10	Manual cleanup: canonical PCA reference + 2026 mech interp use
