---
id: wiki-2026-0508-singular-value-decomposition
title: Singular Value Decomposition (SVD)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [SVD, Matrix Factorization]
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: [linear-algebra, matrix-factorization, dimensionality-reduction, machine-learning]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: NumPy/SciPy
---

# Singular Value Decomposition (SVD)

## One-line summary
> **"The universal factorization for any matrix."** Originating with Beltrami (1873) and Jordan (1874), SVD is a foundational tool of modern ML/DL: PCA, recommendation, and LLM weight compression (LoRA, SVD-based pruning in vLLM/MLX as of 2026).

## Core concepts

### Decomposition
- `A = U Σ Vᵀ`
- `A` (m×n), `U` (m×m, orthogonal), `Σ` (m×n, rectangular diagonal, σ₁≥σ₂≥...≥0), `Vᵀ` (n×n, orthogonal).
- σᵢ = singular values (≥ 0). Columns of U = left singular vectors. Columns of V = right singular vectors.
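
The shapes and orthogonality conditions above can be checked numerically. The following is a minimal sketch, assuming a small random real matrix; the variable names (`Sigma`, `rng`) are illustrative only. Note that NumPy returns the singular values as a 1-D array, so the rectangular `Σ` has to be built by hand.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))            # m=5, n=3

# Full SVD: U is (m, m), Vt is (n, n); s holds the min(m, n) singular values
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Embed s in an (m, n) rectangular diagonal matrix Σ
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

assert U.shape == (5, 5) and Vt.shape == (3, 3)
assert np.allclose(U.T @ U, np.eye(5))     # U is orthogonal
assert np.allclose(Vt @ Vt.T, np.eye(3))   # V is orthogonal
assert np.all(np.diff(s) <= 0) and np.all(s >= 0)  # descending, nonnegative
assert np.allclose(U @ Sigma @ Vt, A)      # A = U Σ Vᵀ
```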

### Key properties
- Always exists (any matrix, even non-square / singular).
- The nonzero σᵢ² are the nonzero eigenvalues of both `AᵀA` and `AAᵀ`.
- rank(A) = number of non-zero σᵢ.
- ||A||₂ = σ₁ (largest singular value).
- ||A||_F = √Σσᵢ² (Frobenius norm).
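
Each of these properties can be verified directly. The sketch below assumes a deliberately rank-deficient 6×4 test matrix (an illustrative choice, not from the original note) so that the rank property is visible; the tolerance mirrors the default used by `np.linalg.matrix_rank`.

```python
import numpy as np

rng = np.random.default_rng(1)
# Product of (6, 2) and (2, 4) factors, so rank(A) = 2
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))

s = np.linalg.svd(A, compute_uv=False)

# σᵢ² match the eigenvalues of AᵀA (eigvalsh returns ascending order)
eig_AtA = np.linalg.eigvalsh(A.T @ A)[::-1]
assert np.allclose(s**2, eig_AtA, atol=1e-8)

# rank = number of singular values above a numerical tolerance
tol = s.max() * max(A.shape) * np.finfo(A.dtype).eps
rank = int(np.sum(s > tol))
assert rank == np.linalg.matrix_rank(A) == 2

# Spectral norm = σ₁, Frobenius norm = sqrt(Σσᵢ²)
assert np.isclose(np.linalg.norm(A, 2), s[0])
assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2)))
```

In floating point, "non-zero" has to mean "above a tolerance": the trailing singular values of a rank-deficient matrix come out tiny rather than exactly zero.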

### Applications
1. PCA — top-k SVD of centered X.
2. Pseudoinverse `A⁺ = V Σ⁺ Uᵀ`.
3. Low-rank approximation (Eckart-Young theorem).
4. Recommender systems (Netflix, Funk SVD).
5. LoRA / weight compression (2026 LLM fine-tuning).
6. Image compression.
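
Of the applications above, Funk SVD is the one that is not an SVD in the strict sense: it fits two low-rank factor matrices to the observed ratings only, by stochastic gradient descent, with no orthogonality constraint. The following is a minimal sketch under stated assumptions: the function names, hyperparameters, and toy ratings are all hypothetical, and the bias terms used in many practical variants are omitted.

```python
import numpy as np

def funk_svd(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=200, seed=0):
    """Factor observed ratings as R[u, i] ≈ P[u] · Q[i] by SGD.

    ratings: list of (user, item, value) triples for observed entries only.
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))  # user factors
    Q = 0.1 * rng.standard_normal((n_items, k))  # item factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]
            p_old = P[u].copy()                  # use the pre-update value for Q's step
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

def rmse(ratings, P, Q):
    return np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings]))

# Toy data: 4 users x 3 items, 8 of 12 entries observed
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
           (2, 1, 2), (2, 2, 5), (3, 0, 1), (3, 2, 4)]
P, Q = funk_svd(ratings, n_users=4, n_items=3)
print(rmse(ratings, P, Q))   # training RMSE
print(P[0] @ Q[2])           # prediction for the unobserved pair (user 0, item 2)
```

Missing entries are simply skipped rather than treated as zeros, which is why this approach suits sparse rating matrices better than a dense SVD of a zero-filled matrix.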

## 💻 Patterns

### NumPy SVD
```python
import numpy as np

A = np.random.randn(100, 50)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
# U: (100,50), s: (50,), Vt: (50,50)
# Reconstruct: A ≈ U @ np.diag(s) @ Vt
```

### Truncated SVD (low-rank approx)
```python
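import numpy as np
from sklearn.decomposition import TruncatedSVD

X = np.random.randn(200, 50)  # placeholder data; substitute your own (dense or sparse)

# Top-k components. The exact truncated SVD is the Eckart-Young optimal rank-k
# approximation; the default "randomized" solver only approximates it.
svd = TruncatedSVD(n_components=10)
X_reduced = svd.fit_transform(X)  # (n_samples, 10)
print(svd.explained_variance_ratio_.sum())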
```

### PCA via SVD
```python
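import numpy as np

def pca_svd(X, k):
    # Center each feature (column); PCA is defined on mean-centered data
    X_centered = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # Principal axes = rows of Vt; scores = projection onto the top-k axes
    return X_centered @ Vt[:k].T  # (n_samples, k)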
```

### Pseudoinverse
```python
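import numpy as np

def pinv_svd(A, rcond=1e-10):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Invert only singular values above the relative cutoff; avoids 1/0 warnings
    keep = s > rcond * s.max()
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    return (Vt.T * s_inv) @ U.T  # V Σ⁺ Uᵀ (real A; use .conj().T for complex)

# Solve least squares: x = A⁺ b (minimum-norm solution)
A = np.random.randn(100, 50)  # placeholder system
b = np.random.randn(100)
x = pinv_svd(A) @ b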
```

### Image compression
```python
from PIL import Image
import numpy as np

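img = np.array(Image.open("photo.jpg").convert("L"), dtype=float)
U, s, Vt = np.linalg.svd(img, full_matrices=False)
# Keep top-k singular values
k = 50
compressed = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
# Clip and cast back to 8-bit before saving or displaying
compressed = np.clip(compressed, 0, 255).astype(np.uint8)
# Storage: m*k + k + k*n values vs m*n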
```

### Randomized SVD (large matrices)
```python
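import numpy as np
from sklearn.utils.extmath import randomized_svd

X_huge = np.random.randn(5000, 1000)  # placeholder for a large matrix

# Halko et al. 2011: roughly O(mnk) with Gaussian test matrices
# (O(mn log k) with structured ones), vs O(mn·min(m,n)) for a full SVD
U, s, Vt = randomized_svd(X_huge, n_components=20, random_state=42)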
```

### LoRA-style low-rank weight update (2026)
```python
import torch

# Original frozen weight W (d_out, d_in)
# Learn ΔW = B @ A where B (d_out, r), A (r, d_in), r << min(d_out, d_in)
class LoRALayer(torch.nn.Module):
    def __init__(self, d_in, d_out, rank=8):
        super().__init__()
        self.A = torch.nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B = torch.nn.Parameter(torch.zeros(d_out, rank))

    def forward(self, x, W_frozen):
        return x @ W_frozen.T + x @ self.A.T @ self.B.T
```

### Eckart-Young error bound
```python
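import numpy as np

A = np.random.randn(100, 50)  # placeholder matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 5
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
# Best rank-k approximation error: Frobenius norm = sqrt of the discarded σᵢ²
error_frob = np.sqrt(np.sum(s[k:]**2))
assert np.isclose(np.linalg.norm(A - A_k, 'fro'), error_frob)
# Spectral norm error = first discarded singular value
assert np.isclose(np.linalg.norm(A - A_k, 2), s[k])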
```

## Decision criteria
| Situation | Approach |
|---|---|
| Dense small matrix | `np.linalg.svd` |
| Top-k only, large | `randomized_svd` / `TruncatedSVD` |
| Sparse matrix | `scipy.sparse.linalg.svds` |
| LLM weight adapter | LoRA (low-rank ΔW) |
| Recommender (sparse ratings) | Funk SVD / ALS |

**Default**: full SVD via NumPy for small dense matrices; randomized SVD for large ones; `scipy.sparse.linalg.svds` for sparse ones.
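
For the sparse row of the table, a minimal sketch of `scipy.sparse.linalg.svds` follows. The input matrix is a hypothetical random sparse matrix, kept small so the result can be cross-checked against a dense SVD. Unlike `np.linalg.svd`, `svds` does not promise descending order, so the sketch sorts explicitly.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# Hypothetical sparse input: 200 x 100 with about 5% nonzeros
X_sparse = sparse_random(200, 100, density=0.05, format="csr", random_state=0)

k = 5
U, s, Vt = svds(X_sparse, k=k)   # k must be smaller than min(X_sparse.shape)

# Sort the triplets so the largest singular value comes first
order = np.argsort(s)[::-1]
U, s, Vt = U[:, order], s[order], Vt[order, :]

# Cross-check against the dense SVD (only feasible because this example is small)
s_dense = np.linalg.svd(X_sparse.toarray(), compute_uv=False)[:k]
print(np.allclose(s, s_dense))
```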

## 🔗 Graph
- Parent: [[Linear-Algebra-Foundations|Linear-Algebra]] · [[Matrix-Factorization]]
- Variants: [[Eigendecomposition]]
- Applications: [[Principal-Component-Analysis]] · [[LoRA]] · [[Recommender-Systems]]
- Adjacent: [[Ridge-Regression]]

## 🤖 LLM usage
**When to use**: Dimensionality reduction (PCA). Computing the pseudoinverse. Low-rank approximation (compression, denoising). LoRA / SVD pruning of LLM weights. Spectral analysis.
**When not to use**: Very large sparse matrices (use iterative methods such as Lanczos). Streaming data (use online PCA / incremental SVD).
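
For the streaming case, one option is scikit-learn's `IncrementalPCA`, which updates its components batch by batch through `partial_fit`. The sketch below simulates a stream with random batches; the batch size, dimensionality, and loop are illustrative assumptions, not a recommendation.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)
ipca = IncrementalPCA(n_components=3)

# Simulated stream: one batch at a time, never holding the full dataset.
# Each batch needs at least n_components rows.
for _ in range(20):
    batch = rng.standard_normal((50, 10))
    ipca.partial_fit(batch)

# Project new samples onto the components learned so far
Z = ipca.transform(rng.standard_normal((5, 10)))
print(Z.shape, ipca.n_samples_seen_)
```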

## ❌ Anti-patterns
- **`full_matrices=True` for fat/thin matrices**: wastes memory; use `full_matrices=False`.
- **Eigendecomposition of `AᵀA`**: numerically unstable (squares the condition number); compute the SVD of `A` directly.
- **Forgetting to center for PCA**: an SVD of uncentered X is dominated by the mean direction.
- **Naive dense SVD on huge sparse matrices**: O(mn²) cost; use randomized or Lanczos-based solvers.
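
The `AᵀA` anti-pattern is easy to reproduce. The sketch below builds a matrix with known singular values from 1 down to 1e-9 (an illustrative construction, not from the original note) and recovers the smallest one both ways. Because forming `AᵀA` squares the singular values, 1e-9 becomes 1e-18, which falls below double-precision resolution relative to the largest eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)

# A = Q1 diag(s_true) Q2ᵀ with orthonormal Q1 (50x4) and orthogonal Q2 (4x4)
Q1, _ = np.linalg.qr(rng.standard_normal((50, 4)))
Q2, _ = np.linalg.qr(rng.standard_normal((4, 4)))
s_true = np.array([1.0, 1e-3, 1e-6, 1e-9])
A = (Q1 * s_true) @ Q2.T

# Route 1: direct SVD of A
s_svd = np.linalg.svd(A, compute_uv=False)

# Route 2: square roots of the eigenvalues of AᵀA (clipped at 0 against round-off)
eigs = np.linalg.eigvalsh(A.T @ A)[::-1]
s_eig = np.sqrt(np.clip(eigs, 0, None))

rel_err_svd = abs(s_svd[-1] - s_true[-1]) / s_true[-1]
rel_err_eig = abs(s_eig[-1] - s_true[-1]) / s_true[-1]
print(f"direct SVD: {rel_err_svd:.1e}, via AᵀA: {rel_err_eig:.1e}")
```

The direct route typically keeps several correct digits of the smallest singular value, while the `AᵀA` route tends to return mostly round-off noise for it.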

## 🧪 Verification / Duplicates
- Verified (Trefethen & Bau "Numerical Linear Algebra", Strang "Linear Algebra and Learning from Data", LAPACK gesdd).
- Trust level: A+.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — SVD with PCA, pseudoinverse, randomized, LoRA applications |