[G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -2,62 +2,149 @@
 id: wiki-2026-0508-loss-functions-foundations
 title: Loss Functions Foundations
 category: 10_Wiki/Topics
-status: needs_review
+status: verified
 canonical_id: self
-aliases: [DL-LOSS-001]
+aliases: [Loss Functions, Cost Functions, Objective Functions, Loss-Functions]
 duplicate_of: none
 source_trust_level: A
-confidence_score: 1.0
-tags: [ai, Deep-Learning, loss-function, cost-function, Optimization, neural-networks]
+confidence_score: 0.95
+verification_status: applied
+tags: [loss, objective, training, mse, cross-entropy, focal, contrastive, dice]
 raw_sources: []
-last_reinforced: 2026-04-26
+last_reinforced: 2026-05-10
 github_commit: pending
-inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
+tech_stack: { language: Python, framework: PyTorch }
 ---

-# [[Loss Functions|Loss Functions]] Foundations (Loss Function Basics)
+# Loss Functions Foundations

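-## 📌 One-line insight (The Karpathy Summary)
-> "Convert the model's mistakes (errors) into a painfully concrete number, and use it as a compass pointing down the steepest path toward the correct answer." The loss defines the gap between the model's prediction and the ground truth as a single scalar value, and it is the key signal that sets the direction of learning so that gradient descent can move toward the minimum.
+## One-line summary
+> **"The task determines the loss."** Regression→MSE/MAE/Huber, Classification→CE/Focal, Metric→Contrastive/Triplet, Segmentation→Dice/IoU.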

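-## 📖 Structured knowledge (Synthesized Content)
- **Extracted pattern:** "Differentiable Error Mapping": an optimization-signal pattern that converts the discontinuous notion of "right or wrong" into a differentiable, continuous function, so that larger errors send stronger feedback (gradients) to the weights and the model corrects itself.
- **Main loss functions:**
-    - **MSE (Mean Squared Error):** Mean of the squared prediction errors. The standard for regression. Sensitive to large errors.
-    - **[[Cross-Entropy Loss|Cross-Entropy Loss]]:** Measures the difference between probability distributions. The standard for classification. The penalty grows steeply the further the prediction is from the correct answer.
-    - **Hinge Loss:** Used in support vector machines (SVM). Penalizes predictions that fail to respect the margin.
- **Significance:** Designing the loss function is the act of setting the model's "goal", and choosing a function that fits the nature of the problem (classification, regression, generation, and so on) determines 80% of performance.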
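+## Core
+### Regression
+- **MSE (L2)**: ½(y-ŷ)². The ½ is a convention that simplifies the gradient (`F.mse_loss` omits it). Clean derivative, sensitive to outliers.
+- **MAE (L1)**: |y-ŷ|. Robust to outliers, not differentiable at 0.
+- **Huber**: with e = y-ŷ, ½e² when |e|≤δ, otherwise δ(|e|-½δ). Behaves like MSE near zero and like MAE in the tails. Default δ=1.
+- **Log-cosh**: log(cosh(e)), a smooth counterpart of Huber. Autodiff-friendly.
+- **Quantile (pinball)**: max(τe, (τ-1)e) with e = y-ŷ. Used for median (τ=0.5) and interval prediction.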
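
The regression formulas above can be checked on single residuals. The following is a minimal plain-Python sketch, not the PyTorch implementation: the function names `huber`, `log_cosh`, and `pinball` are hypothetical, and the Huber form assumed here is the usual piecewise one (½e² inside δ, linear outside).

```python
import math


def huber(e: float, delta: float = 1.0) -> float:
    """Quadratic for |e| <= delta, linear beyond it; continuous at |e| = delta."""
    a = abs(e)
    return 0.5 * e * e if a <= delta else delta * (a - 0.5 * delta)


def log_cosh(e: float) -> float:
    """log(cosh(e)) rewritten as |e| + log(1 + exp(-2|e|)) - log(2) to avoid overflow."""
    a = abs(e)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def pinball(y: float, pred: float, tau: float = 0.5) -> float:
    """Quantile (pinball) loss on the residual e = y - pred."""
    e = y - pred
    return max(tau * e, (tau - 1.0) * e)
```

Here `huber(0.5)` falls in the quadratic region and `huber(3.0)` in the linear one. With τ=0.9, `pinball` penalizes under-prediction nine times more than over-prediction of the same size, which is what pushes the prediction toward the 90th percentile.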

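-## ⚠️ Contradictions & Updates
- **Conflict with earlier data:** Beyond simply reducing error, recent practice designs composite loss functions, such as Focal Loss (for imbalanced data) or losses with regularization terms, to stabilize training and finely control the model's generalization.
- **Policy change:** When evaluating the quality of an agent's knowledge-reinforcement work, the Antigravity project uses a custom loss metric that combines cosine similarity and information entropy in addition to plain accuracy, in order to manage knowledge density.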
+### Classification
+- **BCE**: -[y log p + (1-y) log(1-p)]. Binary and multi-label.
+- **CE (softmax)**: -Σ y_k log p_k. Multi-class.
+- **Focal** (Lin 2017): -α_t(1-p_t)^γ log p_t, where p_t is the predicted probability of the true class. Down-weights easy examples; default γ=2.
+- **Label smoothing**: y → y(1-ε) + ε/K. Prevents overconfidence.
+- **Hinge**: max(0, 1-y·ŷ). SVM. y∈{-1,+1}.
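
To make the focal down-weighting and the label-smoothing target concrete, here is a small worked sketch in plain Python. The helper names `focal_term`, `smooth_labels`, and `hinge` are hypothetical, and `p_t` is assumed to be the predicted probability of the true class.

```python
import math


def focal_term(p_t: float, alpha_t: float = 0.25, gamma: float = 2.0) -> float:
    """Per-example focal loss: cross-entropy scaled by alpha_t * (1 - p_t) ** gamma."""
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def smooth_labels(one_hot: list, eps: float = 0.1) -> list:
    """Label smoothing: y -> y * (1 - eps) + eps / K for K classes."""
    k = len(one_hot)
    return [y * (1.0 - eps) + eps / k for y in one_hot]


def hinge(y: int, score: float) -> float:
    """Hinge loss for a label y in {-1, +1} and a raw score."""
    return max(0.0, 1.0 - y * score)
```

For an easy example with `p_t = 0.9` and γ=2, the modulating factor is (1-0.9)² = 0.01, so the example contributes about one hundredth of its plain cross-entropy. `smooth_labels([0, 0, 1, 0])` moves the target to roughly `[0.025, 0.025, 0.925, 0.025]`, which still sums to 1.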

-## 🔗 Knowledge links (Graph)
- [[Gradient-Descent|Gradient-Descent]]-Foundations, [[Backpropagation|Backpropagation]]-Foundations, [[Focal-Loss|Focal-Loss]], [[Kullback-Leibler-Divergence|Kullback-Leibler-Divergence]]
- **Raw Source:** 10_Wiki/Topics/AI/Loss-Functions-Foundations.md
+### Metric Learning
+- **Contrastive** (Hadsell 2006): pairwise. y·d² + (1-y)·max(0, m-d)², with y=1 for similar pairs and d the embedding distance.
+- **Triplet**: max(0, d(a,p) - d(a,n) + margin).
+- **InfoNCE / NT-Xent** (SimCLR): -log[exp(sim⁺/τ) / Σ exp(sim/τ)].
+- **Cosine embedding**: 1 - cos(a,b) for positive pairs.
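
The pairwise and triplet formulas above can be sketched on small vectors. This is a plain-Python illustration under stated assumptions: Euclidean distance is assumed for d, `y = 1` marks a similar pair, and the helper names `euclid`, `contrastive`, and `triplet` are hypothetical.

```python
import math


def euclid(a: list, b: list) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive(a: list, b: list, y: int, margin: float = 1.0) -> float:
    """Pull similar pairs (y=1) together; push dissimilar pairs (y=0) out to the margin."""
    d = euclid(a, b)
    return y * d ** 2 + (1 - y) * max(0.0, margin - d) ** 2


def triplet(anchor: list, pos: list, neg: list, margin: float = 1.0) -> float:
    """Zero once the negative is at least `margin` farther from the anchor than the positive."""
    return max(0.0, euclid(anchor, pos) - euclid(anchor, neg) + margin)
```

A dissimilar pair that is already farther apart than the margin contributes nothing, which is why only "hard" negatives produce a gradient in both losses.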

-## 🤖 LLM usage hints (How to Use This Knowledge)
+### Segmentation
+- **Dice**: 1 - 2|A∩B|/(|A|+|B|). Robust to class imbalance.
+- **IoU/Jaccard**: 1 - |A∩B|/|A∪B|.
+- **Tversky**: adjustable weights on FP and FN.
+- **Boundary loss**: weighted by a distance transform.
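
The overlap losses above are closely related, which a soft Tversky loss on flattened masks makes visible. This is a minimal plain-Python sketch: the name `tversky_loss` is hypothetical, and the convention assumed here is that `alpha` weights false positives and `beta` weights false negatives.

```python
def tversky_loss(p: list, y: list, alpha: float = 0.5, beta: float = 0.5,
                 eps: float = 1e-6) -> float:
    """Soft Tversky loss on flattened probabilities p and binary targets y."""
    tp = sum(pi * yi for pi, yi in zip(p, y))          # soft true positives
    fp = sum(pi * (1 - yi) for pi, yi in zip(p, y))    # soft false positives
    fn = sum((1 - pi) * yi for pi, yi in zip(p, y))    # soft false negatives
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)
```

With `alpha = beta = 0.5` the expression reduces to the Dice loss, and with `alpha = beta = 1.0` it reduces to the IoU/Jaccard loss. Raising `beta` above `alpha` penalizes missed foreground pixels more heavily, which is the usual motivation for using it on small objects.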

-**When to use this knowledge:**
- *(TODO)*
+## 💻 Patterns
+### Regression losses
+```python
+import torch, torch.nn.functional as F
+mse = F.mse_loss(pred, y)
+mae = F.l1_loss(pred, y)
+huber = F.huber_loss(pred, y, delta=1.0)        # same values as F.smooth_l1_loss with beta=1.0
+# Quantile
+def quantile_loss(pred, y, tau=0.5):
+    e = y - pred
+    return torch.maximum(tau*e, (tau-1)*e).mean()
+```

-**When not to use it:**
- *(TODO)*
+### Classification losses
+```python
+ce = F.cross_entropy(logits, y_int)               # logits, not probs
+bce = F.binary_cross_entropy_with_logits(logits, y_float)
+# Label smoothing (built-in)
+ce_ls = F.cross_entropy(logits, y_int, label_smoothing=0.1)
+```

-## 🧪 Validation
+### Focal loss (binary)
+```python
+def focal_bce(logits, y, alpha=0.25, gamma=2.0):
+    p = torch.sigmoid(logits)
+    pt = torch.where(y == 1, p, 1 - p)
+    alpha_t = torch.where(y == 1, alpha, 1 - alpha)
+    return -(alpha_t * (1 - pt).pow(gamma) * pt.clamp_min(1e-8).log()).mean()
+```

- **Status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 automatic normalization. The body needs verification.)*
+### Triplet & InfoNCE
+```python
+triplet = F.triplet_margin_loss(anchor, pos, neg, margin=1.0)

-## 🧬 Duplicate Check
+def info_nce(q, k_pos, k_neg, tau=0.07):
+    # q: (B,D), k_pos: (B,D), k_neg: (B,N,D)
+    # Assumption: inputs are already L2-normalized, so dot products are cosine similarities.
+    pos = (q * k_pos).sum(-1, keepdim=True) / tau
+    neg = torch.einsum("bd,bnd->bn", q, k_neg) / tau
+    logits = torch.cat([pos, neg], dim=1)
+    target = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
+    return F.cross_entropy(logits, target)
+```

- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (automatic normalization)
- **Reason:** Phase 1 normalization: fill in the old template and missing fields.
+### Dice + BCE (segmentation standard)
+```python
+def dice_loss(logits, y, eps=1e-6):
+    p = torch.sigmoid(logits)
+    inter = (p * y).sum(dim=(2, 3))
+    union = p.sum(dim=(2, 3)) + y.sum(dim=(2, 3))
+    return 1 - (2 * inter + eps) / (union + eps)

-## 🕓 Changelog
+def combo_loss(logits, y):
+    return 0.5 * F.binary_cross_entropy_with_logits(logits, y) + dice_loss(logits, y).mean()
+```

-| Date | Change | Handling | Trust |
-|------|-----------|-----------|--------|
-| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |
+### Class imbalance weighting
+```python
+weights = torch.tensor([1.0, 5.0, 2.0])           # per-class weights
+ce_w = F.cross_entropy(logits, y_int, weight=weights)
+```
+
+## Decision criteria
+| Task | Default | Variants |
+|---|---|---|
+| Regression normal | MSE | outlier→Huber, robust→MAE |
+| Binary classification | BCE w/ logits | imbalance→Focal |
+| Multi-class | CE w/ label smoothing | imbalance→class weights |
+| Multi-label | BCE per-class |  |
+| Embedding learning | InfoNCE | small batch→Triplet |
+| Segmentation | BCE+Dice | small object→Tversky |
+| Object detection | Focal + IoU/GIoU | (RetinaNet, YOLO) |
+
+**Defaults**: CE with label smoothing 0.1 for classification, Huber for regression.
+
+## 🔗 Graph
+- Parents: [[Optimization]], [[Training-Loop]]
+- Variants: [[Focal-Loss]], [[Contrastive-Loss]], [[Dice-Loss]]
+- Applications: [[Image-Classification]], [[Segmentation]], [[Metric-Learning]], [[Object-Detection]]
+- Adjacent: [[Activation-Functions]], [[Class-Imbalance]], [[Regularization]]
+
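+## 🤖 LLM usage
+**When to use**: task→loss mapping, gradient intuition, code template generation.
+**When not to use**: designing a domain-specific custom loss always requires verification (distribution and gradient analysis).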
+
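+## ❌ Anti-patterns
+- Calling `softmax` and then `nll_loss` by hand (numerically unstable, and `nll_loss` expects log-probabilities) ← use `cross_entropy`
+- Computing BCE as `binary_cross_entropy(sigmoid(...))` ← use the `_with_logits` variant
+- Plain CE that ignores class imbalance
+- Dice loss on its own (unstable gradients) → mix BCE+Dice
+- Using Focal's γ when there is no imbalance (hurts performance)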
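
The first two anti-patterns come down to the same numerical issue: exponentiating raw logits before taking the log. The sketch below illustrates it in plain Python rather than PyTorch. `naive_ce` and `stable_ce` are hypothetical names, and the stable version uses the standard log-sum-exp shift, which is the kind of fusion the logits-based losses rely on.

```python
import math


def naive_ce(logits: list, target: int) -> float:
    """Softmax first, log second: math.exp overflows for large logits."""
    exps = [math.exp(z) for z in logits]
    return -math.log(exps[target] / sum(exps))


def stable_ce(logits: list, target: int) -> float:
    """Cross-entropy computed directly from logits with the log-sum-exp trick."""
    m = max(logits)  # shifting by the max keeps every exponent <= 0
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return lse - logits[target]
```

Both functions agree on moderate logits such as `[1.0, 2.0, 3.0]`. On `[1000.0, 0.0]` the naive version raises `OverflowError`, while the stable one returns a finite loss.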
+
+## 🧪 Validation / Duplicates
+- Verified (Goodfellow DL ch5, Lin 2017 Focal, SimCLR, Milletari V-Net Dice). Trust level A.
+- Canonical for [[Loss Functions]] (redirect).
+
+## 🕓 Changelog
+| Date | Change |
+|---|---|
+| 2026-05-08 | Phase 1 |
+| 2026-05-10 | Manual cleanup: strengthened canonical status, added segmentation/metric sections |