---
id: wiki-2026-0508-loss-functions-foundations
title: Loss Functions Foundations
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Loss Functions, Cost Functions, Objective Functions, Loss-Functions]
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: [loss, objective, training, mse, cross-entropy, focal, contrastive, dice]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: { language: Python, framework: PyTorch }
---
# Loss Functions Foundations
## In one line
> **"The task determines the loss."** Regression → MSE/MAE/Huber, Classification → CE/Focal, Metric learning → Contrastive/Triplet, Segmentation → Dice/IoU.
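## Core concepts
### Regression
- **MSE (L2)**: ½(y-ŷ)² (the ½ is a convention that cancels in the gradient; `F.mse_loss` omits it). Smooth gradient, sensitive to outliers.
- **MAE (L1)**: |y-ŷ|. Robust to outliers, not differentiable at 0.
- **Huber**: with e = y-ŷ, ½e² if |e|≤δ, otherwise δ(|e|-½δ). MSE-like near zero, MAE-like in the tails. Default δ=1.
- **Log-cosh**: log(cosh(e)). A smooth Huber-like loss that is autodiff-friendly.
- **Quantile**: max(τe, (τ-1)e) with e = y-ŷ. Median (τ=0.5) and interval prediction.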
### Classification
- **BCE**: -[y log p + (1-y) log(1-p)]. Binary and multi-label.
- **CE (softmax)**: -Σ y_k log p_k. Multi-class.
- **Focal** (Lin 2017): -α_t(1-p_t)^γ log p_t, where p_t is the predicted probability of the true class. Down-weights easy examples; γ=2 by default.
- **Label smoothing**: y → y(1-ε) + ε/K, with K classes. Counters overconfidence.
- **Hinge**: max(0, 1-y·ŷ). SVM. y∈{-1,+1}.
### Metric Learning
- **Contrastive** (Hadsell 2006): pairwise. y·d² + (1-y)·max(0, m-d)², here with y=1 for similar pairs and margin m.
- **Triplet**: max(0, d(a,p) - d(a,n) + margin).
- **InfoNCE / NT-Xent** (SimCLR): -log[exp(sim⁺/τ) / Σ exp(sim/τ)].
- **Cosine embedding**: 1 - cos(a,b) for similar pairs.
### Segmentation
- **Dice**: 1 - 2|A∩B|/(|A|+|B|). Robust to class imbalance.
- **IoU/Jaccard**: 1 - |A∩B|/|A∪B|.
- **Tversky**: adjusts the relative weights of FP and FN.
- **Boundary loss**: weighted by a distance transform.
## 💻 Patterns
### Regression losses
```python
import torch, torch.nn.functional as F

# pred, y: float tensors of the same shape
mse = F.mse_loss(pred, y)
mae = F.l1_loss(pred, y)
huber = F.huber_loss(pred, y, delta=1.0)  # same as smooth_l1_loss with beta=1

# Quantile (pinball) loss; tau=0.5 gives 0.5 * MAE
def quantile_loss(pred, y, tau=0.5):
    e = y - pred
    return torch.maximum(tau * e, (tau - 1) * e).mean()
```
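
Log-cosh is listed among the regression losses but has no `torch.nn.functional` one-liner used here, so the following is a minimal sketch. The function name `log_cosh_loss` is an assumption for illustration. It relies on the identity log(cosh(e)) = e + softplus(-2e) - log 2, which avoids overflowing `cosh` for large errors.

```python
import math
import torch
import torch.nn.functional as F

def log_cosh_loss(pred, y):
    # log(cosh(e)) rewritten as e + softplus(-2e) - log(2) for numerical stability
    e = pred - y
    return (e + F.softplus(-2.0 * e) - math.log(2.0)).mean()

pred = torch.tensor([0.0, 1.0, 100.0])
y = torch.zeros(3)
loss = log_cosh_loss(pred, y)  # stays finite even for the large residual
```

For small residuals this behaves like ½e², and for large ones like |e| - log 2, which is why it is often described as a smooth Huber.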
### Classification losses
```python
ce = F.cross_entropy(logits, y_int) # logits, not probs
bce = F.binary_cross_entropy_with_logits(logits, y_float)
# Label smoothing (built-in)
ce_ls = F.cross_entropy(logits, y_int, label_smoothing=0.1)
```
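
To make the label-smoothing formula y → y(1-ε) + ε/K concrete, the sketch below builds the smoothed targets by hand and compares the result with the built-in `label_smoothing` argument. It also includes a small hinge loss. The helper name `hinge_loss` and the toy tensors are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)             # (batch, K) toy logits
y_int = torch.tensor([0, 2, 1, 2])     # integer class labels
eps, K = 0.1, logits.size(1)

# Smoothed targets: y(1 - eps) + eps / K
soft = F.one_hot(y_int, K).float() * (1 - eps) + eps / K
manual = -(soft * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
builtin = F.cross_entropy(logits, y_int, label_smoothing=eps)

def hinge_loss(scores, y_pm):
    # y_pm must be in {-1, +1}; scores are raw (unsquashed) outputs
    return torch.clamp(1 - y_pm * scores, min=0).mean()
```

`manual` and `builtin` should agree up to floating-point error, which is a quick way to check that the smoothing convention matches what a paper or codebase expects.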
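### Focal loss (binary)
```python
def focal_bce(logits, y, alpha=0.25, gamma=2.0):
    # y: float tensor of 0/1 targets, same shape as logits
    ce = F.binary_cross_entropy_with_logits(logits, y, reduction="none")  # -log(p_t), stable
    p = torch.sigmoid(logits)
    pt = p * y + (1 - p) * (1 - y)               # probability of the true class
    alpha_t = alpha * y + (1 - alpha) * (1 - y)  # class-balancing weight
    return (alpha_t * (1 - pt).pow(gamma) * ce).mean()
```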
### Triplet & InfoNCE
```python
triplet = F.triplet_margin_loss(anchor, pos, neg, margin=1.0)

def info_nce(q, k_pos, k_neg, tau=0.07):
    # q: (B,D), k_pos: (B,D), k_neg: (B,N,D); L2-normalize inputs for cosine similarity
    pos = (q * k_pos).sum(-1, keepdim=True) / tau       # (B,1)
    neg = torch.einsum("bd,bnd->bn", q, k_neg) / tau    # (B,N)
    logits = torch.cat([pos, neg], dim=1)               # positive sits at index 0
    target = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, target)
```
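
The pairwise contrastive loss y·d² + (1-y)·max(0, m-d)² can be sketched in the same style. The function name `contrastive_loss` and the toy embeddings are assumptions, and the convention here is y=1 for similar pairs. `F.pairwise_distance` adds a small epsilon internally, which keeps the gradient finite when two embeddings coincide.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(a, b, y, margin=1.0):
    # a, b: (B, D) embeddings; y: (B,) float, 1 = similar pair, 0 = dissimilar pair
    d = F.pairwise_distance(a, b)                      # Euclidean distance per pair
    pull = y * d.pow(2)                                # similar pairs: shrink distance
    push = (1 - y) * torch.clamp(margin - d, min=0).pow(2)  # dissimilar: push past margin
    return (pull + push).mean()

a = torch.zeros(2, 2)
b = torch.tensor([[3.0, 4.0], [0.3, 0.4]])
y = torch.tensor([1.0, 0.0])
loss = contrastive_loss(a, b, y)
```

Dissimilar pairs that are already farther apart than the margin contribute zero, so only "hard" negatives produce gradient.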
### Dice + BCE (segmentation standard)
```python
def dice_loss(logits, y, eps=1e-6):
    # logits, y: (B,C,H,W); returns per-sample, per-channel loss of shape (B,C)
    p = torch.sigmoid(logits)
    inter = (p * y).sum(dim=(2, 3))
    denom = p.sum(dim=(2, 3)) + y.sum(dim=(2, 3))  # |A| + |B|, not the set union
    return 1 - (2 * inter + eps) / (denom + eps)

def combo_loss(logits, y):
    return 0.5 * F.binary_cross_entropy_with_logits(logits, y) + dice_loss(logits, y).mean()
```
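
Tversky loss generalizes Dice by weighting false positives and false negatives separately. The sketch below follows the same (B,C,H,W) layout as `dice_loss`. The function name and the default values `alpha=0.3`, `beta=0.7` are illustrative assumptions, not settings taken from this page.

```python
import torch

def tversky_loss(logits, y, alpha=0.3, beta=0.7, eps=1e-6):
    # alpha weights false positives, beta weights false negatives
    p = torch.sigmoid(logits)
    tp = (p * y).sum(dim=(2, 3))
    fp = (p * (1 - y)).sum(dim=(2, 3))
    fn = ((1 - p) * y).sum(dim=(2, 3))
    return 1 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)

torch.manual_seed(0)
logits = torch.randn(2, 1, 8, 8)
y = (torch.rand(2, 1, 8, 8) > 0.5).float()
loss = tversky_loss(logits, y).mean()
```

Setting `alpha = beta = 0.5` recovers Dice, while `beta > alpha` penalizes missed foreground pixels more heavily, which is the usual motivation for small objects.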
### Class-imbalance weighting
```python
weights = torch.tensor([1.0, 5.0, 2.0])  # one weight per class
ce_w = F.cross_entropy(logits, y_int, weight=weights)
```
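
The weights above are hand-picked. One common heuristic, shown here as an assumption rather than a recommendation from this page, is inverse-frequency weighting: n_samples / (K · count_k). The helper name `balanced_class_weights` is hypothetical.

```python
import torch
import torch.nn.functional as F

def balanced_class_weights(y_int, num_classes):
    # Inverse-frequency weights: n_samples / (num_classes * count_k)
    counts = torch.bincount(y_int, minlength=num_classes).float()
    return counts.sum() / (num_classes * counts.clamp_min(1.0))  # clamp avoids div by zero

y_int = torch.tensor([0, 0, 0, 0, 0, 0, 1, 2, 2])   # toy labels: class 0 dominates
weights = balanced_class_weights(y_int, num_classes=3)
logits = torch.zeros(len(y_int), 3)                  # placeholder logits for illustration
ce_w = F.cross_entropy(logits, y_int, weight=weights)
```

With `weight` set, `F.cross_entropy` normalizes by the sum of the per-sample weights, so the loss scale stays comparable to the unweighted case.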
## Decision criteria
| Task | Default | Variants |
|---|---|---|
| Regression (general) | MSE | outliers → Huber, robust → MAE |
| Binary classification | BCE w/ logits | imbalance → Focal |
| Multi-class | CE w/ label smoothing | imbalance → class weights |
| Multi-label | BCE per-class | |
| Embedding learning | InfoNCE | small batch → Triplet |
| Segmentation | BCE+Dice | small objects → Tversky |
| Object detection | Focal + IoU/GIoU | (RetinaNet, YOLO) |
**Defaults when unsure**: classification → CE with label smoothing 0.1; regression → Huber.
## 🔗 Graph
- Parent: [[Optimization]]
- Variants: [[Focal-Loss]]
- Applications: [[Image-Classification-Mastery]], [[Segmentation]], [[Object-Detection]]
- Adjacent: [[Activation-Functions]], [[Class-Imbalance]], [[L1-and-L2-Regularization|Regularization]]
## 🤖 LLM usage
**When to use**: task → loss mapping, gradient intuition, code template generation.
**When not to**: designing a domain-specific custom loss, which must be verified (distribution and gradient analysis).
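## ❌ Anti-patterns
- Hand-rolling `softmax` + `nll_loss` (numerically unstable, and `nll_loss` expects log-probabilities) → use `cross_entropy`
- For BCE, calling `binary_cross_entropy(sigmoid(...))` → use `binary_cross_entropy_with_logits`
- CE that ignores class imbalance
- Dice loss on its own (unstable gradients) → mix BCE+Dice
- Using focal γ when there is no imbalance (can hurt performance)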
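
The BCE anti-pattern can be seen with a couple of confidently wrong predictions. This is a minimal sketch with toy values chosen as an assumption to trigger float32 saturation. `torch.sigmoid(40.0)` rounds to exactly 1.0, so the naive path has to take the log of zero, which `binary_cross_entropy` clamps, while the logits version computes the loss directly.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([40.0, -40.0])   # confidently wrong predictions
y = torch.tensor([0.0, 1.0])

# Anti-pattern: sigmoid saturates in float32 before the log is taken
naive = F.binary_cross_entropy(torch.sigmoid(logits), y)

# Preferred: log-sum-exp formulation on raw logits
stable = F.binary_cross_entropy_with_logits(logits, y)

print(naive.item(), stable.item())  # the two values differ noticeably
```

The stable value is close to the true loss of 40 per element. The naive value is distorted by clamping, and the saturated element contributes no useful gradient.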
## 🧪 Verification / duplicates
- Verified (Goodfellow DL ch5, Lin 2017 Focal, SimCLR, Milletari V-Net Dice). Trust level A.
- Canonical for [[Loss-Functions-Foundations|Loss Functions]] (redirect).
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: strengthened canonical entry, added segmentation/metric sections |