---
id: wiki-2026-0508-loss-functions-foundations
title: Loss Functions Foundations
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Loss Functions, Cost Functions, Objective Functions, Loss-Functions]
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: [loss, objective, training, mse, cross-entropy, focal, contrastive, dice]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: { language: Python, framework: PyTorch }
---

# Loss Functions Foundations

## One-line summary

> **"The task determines the loss."** Regression→MSE/MAE/Huber, Classification→CE/Focal, Metric→Contrastive/Triplet, Segmentation→Dice/IoU.

## Key points

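### Regression

- **MSE (L2)**: (y-ŷ)², often written ½(y-ŷ)² so that the gradient is simply (ŷ-y). Smooth everywhere, but sensitive to outliers.
- **MAE (L1)**: |y-ŷ|. Robust to outliers; not differentiable at 0.
- **Huber**: with e = y-ŷ, ½e² if |e|≤δ, otherwise δ(|e|-½δ). Quadratic near zero and linear in the tails; δ=1 is the usual default.
- **Log-cosh**: log(cosh(e)). A smooth approximation of Huber that is differentiable everywhere, so it plays well with automatic differentiation.
- **Quantile (pinball)**: max(τe, (τ-1)e) with e = y-ŷ. τ=0.5 predicts the median; a pair of τ values (e.g. 0.05 and 0.95) gives a prediction interval.
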
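
To make the outlier behavior above concrete, here is a minimal plain-Python sketch that evaluates each per-sample loss on a small residual and on a large one. The function names are illustrative only (not a library API), and `mse` here omits the optional ½ factor.

```python
import math

def mse(e):
    # Squared error without the optional 1/2 factor
    return e * e

def mae(e):
    return abs(e)

def huber(e, delta=1.0):
    # Quadratic for |e| <= delta, linear beyond it
    a = abs(e)
    return 0.5 * e * e if a <= delta else delta * (a - 0.5 * delta)

def log_cosh(e):
    return math.log(math.cosh(e))

# Compare a small residual with an outlier-sized residual
for e in (0.5, 10.0):
    print(f"e={e}: mse={mse(e)}, mae={mae(e)}, huber={huber(e)}, log_cosh={log_cosh(e):.4f}")
```

For the large residual, MSE grows quadratically while MAE, Huber, and log-cosh all grow roughly linearly, which is why the latter three are less dominated by outliers.
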
### Classification

- **BCE**: -[y log p + (1-y) log(1-p)]. Binary and multi-label.
- **CE (softmax)**: -Σ_k y_k log p_k. Multi-class.
- **Focal** (Lin 2017): -α_t (1-p_t)^γ log p_t, where p_t is the predicted probability of the true class. Down-weights easy examples; γ=2 is the usual default.
- **Label smoothing**: y → y(1-ε) + ε/K for K classes. Discourages overconfidence.
- **Hinge**: max(0, 1-y·ŷ) with y∈{-1,+1} and ŷ the raw score. Used by SVMs.

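
As a worked illustration of the focal and label-smoothing formulas above, the following plain-Python sketch shows how much the focal term shrinks the loss of an easy example compared with a hard one. The helper names and the probabilities 0.9 and 0.1 are assumptions chosen for the example.

```python
import math

def cross_entropy(p_t):
    # p_t: predicted probability of the true class
    return -math.log(p_t)

def focal(p_t, alpha_t=1.0, gamma=2.0):
    # Cross-entropy scaled by the modulating factor (1 - p_t)^gamma
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def smooth_labels(one_hot, eps=0.1):
    # y -> y(1 - eps) + eps / K
    k = len(one_hot)
    return [y * (1.0 - eps) + eps / k for y in one_hot]

easy, hard = 0.9, 0.1
ratio_easy = focal(easy) / cross_entropy(easy)  # (1 - 0.9)^2, about 0.01
ratio_hard = focal(hard) / cross_entropy(hard)  # (1 - 0.1)^2, about 0.81
smoothed = smooth_labels([0.0, 0.0, 1.0, 0.0])
```

With γ=2, the easy example keeps about 1% of its cross-entropy loss while the hard one keeps about 81%, so the gradient budget shifts toward hard examples. The smoothed target still sums to 1.
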
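### Metric Learning

- **Contrastive** (Hadsell 2006): pairwise. y·d² + (1-y)·max(0, m-d)², with y=1 for a similar pair, d the embedding distance, and m the margin.
- **Triplet**: max(0, d(a,p) - d(a,n) + margin) for anchor a, positive p, and negative n.
- **InfoNCE / NT-Xent** (SimCLR): -log [exp(sim⁺/τ) / Σ exp(sim/τ)], where the sum runs over the positive and all negatives.
- **Cosine embedding**: 1 - cos(a,b) for a positive pair.
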
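
The contrastive and triplet formulas above can be checked by hand on 2-D points. This is a minimal sketch in plain Python; the points and function names are made up for illustration, and `same=1` follows the convention used in the list above (1 means a similar pair).

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive(d, same, margin=1.0):
    # same=1: pull the pair together; same=0: push apart until d >= margin
    return same * d ** 2 + (1 - same) * max(0.0, margin - d) ** 2

def triplet(anchor, pos, neg, margin=1.0):
    return max(0.0, euclidean(anchor, pos) - euclidean(anchor, neg) + margin)

anchor, pos = (0.0, 0.0), (0.3, 0.4)   # d(anchor, pos) = 0.5
near_neg = (0.6, 0.8)                  # d(anchor, near_neg) = 1.0
far_neg = (3.0, 4.0)                   # d(anchor, far_neg) = 5.0

loss_near = triplet(anchor, pos, near_neg)  # 0.5 - 1.0 + 1.0 = 0.5
loss_far = triplet(anchor, pos, far_neg)    # margin already satisfied
```

A negative that is already farther than the positive by at least the margin contributes zero loss, which is why triplet training usually depends on how negatives are sampled.
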
### Segmentation

- **Dice**: 1 - 2|A∩B|/(|A|+|B|). Robust to class imbalance.
- **IoU/Jaccard**: 1 - |A∩B|/|A∪B|.
- **Tversky**: generalizes Dice with separate weights on false positives and false negatives.
- **Boundary loss**: weights errors by a distance transform of the ground-truth boundary.

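
The overlap measures above are easiest to see on plain sets of pixel indices. The sketch below is illustrative only (hypothetical function names and toy sets); it also checks the identity Dice = 2·IoU/(1+IoU) and shows that the Tversky index reduces to Dice at α=β=0.5 and to IoU at α=β=1.

```python
def dice_coeff(a, b):
    return 2 * len(a & b) / (len(a) + len(b))

def iou(a, b):
    return len(a & b) / len(a | b)

def tversky(a, b, alpha, beta):
    # a: predicted set, b: ground-truth set
    tp = len(a & b)
    fp = len(a - b)
    fn = len(b - a)
    return tp / (tp + alpha * fp + beta * fn)

pred = {1, 2, 3, 4}
truth = {3, 4, 5, 6, 7, 8}

d = dice_coeff(pred, truth)  # 2*2 / (4+6) = 0.4
j = iou(pred, truth)         # 2 / 8 = 0.25
```

The corresponding losses are `1 - d` and `1 - j`. Raising `beta` above `alpha` in `tversky` penalizes false negatives more, which is the usual reason to reach for it on small objects.
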
## 💻 Patterns

### Regression losses

```python
import torch
import torch.nn.functional as F

# pred, y: float tensors of the same shape
mse = F.mse_loss(pred, y)
mae = F.l1_loss(pred, y)
huber = F.huber_loss(pred, y, delta=1.0)  # equals F.smooth_l1_loss(pred, y, beta=1.0)

# Quantile (pinball) loss; tau=0.5 targets the median
def quantile_loss(pred, y, tau=0.5):
    e = y - pred
    return torch.maximum(tau * e, (tau - 1) * e).mean()
```

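
To see why the quantile loss above predicts quantiles, the following plain-Python sketch brute-forces the constant prediction that minimizes the pinball loss on a toy sample with one outlier. The data and the candidate grid are assumptions made up for this example.

```python
def pinball(pred, ys, tau):
    # Mean pinball loss of a constant prediction over the sample
    total = 0.0
    for y in ys:
        e = y - pred
        total += max(tau * e, (tau - 1) * e)
    return total / len(ys)

ys = [float(v) for v in range(1, 11)] + [100.0]  # 1..10 plus one outlier
candidates = [x / 10 for x in range(0, 1001)]    # grid 0.0, 0.1, ..., 100.0

best_median = min(candidates, key=lambda c: pinball(c, ys, 0.5))
best_q90 = min(candidates, key=lambda c: pinball(c, ys, 0.9))
mean = sum(ys) / len(ys)
```

Here `best_median` lands on the sample median (6.0) and `best_q90` on the 10th of the 11 sorted values (10.0), while the mean, which is what MSE would fit, is pulled above 14 by the single outlier.
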
### Classification losses

```python
# logits: (B, K) raw scores; y_int: (B,) int64 class indices
ce = F.cross_entropy(logits, y_int)  # pass logits, not probabilities

# Binary / multi-label: logits and y_float (0.0 or 1.0) share the same shape
bce = F.binary_cross_entropy_with_logits(logits, y_float)

# Label smoothing (built in)
ce_ls = F.cross_entropy(logits, y_int, label_smoothing=0.1)
```

### Focal loss (binary)

```python
def focal_bce(logits, y, alpha=0.25, gamma=2.0):
    # y: float tensor of 0.0/1.0 with the same shape as logits
    ce = F.binary_cross_entropy_with_logits(logits, y, reduction="none")  # -log(pt), computed stably
    pt = torch.exp(-ce)                          # probability assigned to the true class
    alpha_t = alpha * y + (1 - alpha) * (1 - y)  # alpha for positives, 1 - alpha for negatives
    return (alpha_t * (1 - pt).pow(gamma) * ce).mean()
```

### Triplet & InfoNCE

```python
triplet = F.triplet_margin_loss(anchor, pos, neg, margin=1.0)

def info_nce(q, k_pos, k_neg, tau=0.07):
    # q: (B, D), k_pos: (B, D), k_neg: (B, N, D)
    # L2-normalize all three first so the dot products are cosine similarities
    pos = (q * k_pos).sum(-1, keepdim=True) / tau     # (B, 1)
    neg = torch.einsum("bd,bnd->bn", q, k_neg) / tau  # (B, N)
    logits = torch.cat([pos, neg], dim=1)             # the positive sits at index 0
    target = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, target)
```

### Dice + BCE (standard for segmentation)

```python
def dice_loss(logits, y, eps=1e-6):
    # logits, y: (B, C, H, W); returns one Dice loss per (sample, channel)
    p = torch.sigmoid(logits)
    inter = (p * y).sum(dim=(2, 3))
    denom = p.sum(dim=(2, 3)) + y.sum(dim=(2, 3))  # |A| + |B|, not the set union
    return 1 - (2 * inter + eps) / (denom + eps)

def combo_loss(logits, y):
    return 0.5 * F.binary_cross_entropy_with_logits(logits, y) + dice_loss(logits, y).mean()
```

### Class imbalance weighting

```python
weights = torch.tensor([1.0, 5.0, 2.0])  # one weight per class; keep on the same device as logits
ce_w = F.cross_entropy(logits, y_int, weight=weights)
```

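
One detail worth knowing about class weights: per the PyTorch documentation, `cross_entropy` with `weight` and the default `reduction="mean"` divides by the sum of the weights of the target classes, not by the batch size. The plain-Python sketch below reproduces that formula on a toy batch; `log_softmax` and `weighted_ce` are hypothetical helpers written for this example, not PyTorch functions.

```python
import math

def log_softmax(logits):
    # Stable log-softmax via the log-sum-exp trick
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def weighted_ce(batch_logits, targets, weights):
    # sum_i w[y_i] * loss_i / sum_i w[y_i]
    num = 0.0
    den = 0.0
    for logits, t in zip(batch_logits, targets):
        num += weights[t] * -log_softmax(logits)[t]
        den += weights[t]
    return num / den

batch_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
targets = [0, 1]

plain = weighted_ce(batch_logits, targets, [1.0, 1.0, 1.0])
weighted = weighted_ce(batch_logits, targets, [1.0, 5.0, 2.0])
```

With uniform weights this is the ordinary batch mean. With the weights from the snippet above, the second sample (class 1, weight 5) dominates, so the weighted loss moves toward that sample's loss of log 3.
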
## Decision criteria

| Task | Default | Variants |
|---|---|---|
| Regression (standard) | MSE | outliers→Huber, robust→MAE |
| Binary classification | BCE w/ logits | imbalance→Focal |
| Multi-class | CE w/ label smoothing | imbalance→class weights |
| Multi-label | BCE per class | |
| Embedding learning | InfoNCE | small batch→Triplet |
| Segmentation | BCE+Dice | small objects→Tversky |
| Object detection | Focal + IoU/GIoU | (RetinaNet, YOLO) |

**Defaults**: CE with label smoothing 0.1 for classification, Huber for regression.

## 🔗 Graph

- Parent: [[Optimization]]
- Variants: [[Focal-Loss]]
- Applications: [[Image-Classification-Mastery]], [[Segmentation]], [[Object-Detection]]
- Adjacent: [[Activation-Functions]], [[Class-Imbalance]], [[L1-and-L2-Regularization|Regularization]]

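## 🤖 LLM usage

**When to use**: mapping a task to a loss, building gradient intuition, generating code templates.

**When not to**: designing a domain-specific custom loss, which requires independent verification (distribution and gradient analysis).
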
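## ❌ Anti-patterns

- Applying `softmax` and then `nll_loss` by hand (numerically unstable, and `nll_loss` expects log-probabilities) → use `cross_entropy` on raw logits.
- Computing BCE as `binary_cross_entropy(sigmoid(...))` → use `binary_cross_entropy_with_logits`.
- Using plain CE while ignoring class imbalance.
- Using Dice loss alone (unstable gradients) → mix BCE + Dice.
- Applying the focal term (γ>0) when there is no imbalance (can hurt performance).
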
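
The first two anti-patterns come down to the same numerical issue: exponentiating raw logits before taking the log. The plain-Python sketch below shows the failure and the log-sum-exp fix. It only illustrates the idea; it is not how PyTorch implements `cross_entropy` internally, and the function names are made up for the example.

```python
import math

def naive_log_softmax(logits):
    exps = [math.exp(z) for z in logits]  # overflows for large logits
    total = sum(exps)
    return [math.log(e / total) for e in exps]

def stable_log_softmax(logits):
    # Subtract the max before exponentiating (log-sum-exp trick)
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

logits = [1000.0, 0.0]

try:
    naive_log_softmax(logits)
    naive_ok = True
except OverflowError:
    naive_ok = False  # math.exp(1000.0) is out of float range

stable = stable_log_softmax(logits)
```

The naive version fails outright on these logits, while the stable version returns finite log-probabilities. Fused functions such as `cross_entropy` and `binary_cross_entropy_with_logits` exist to avoid this class of problem.
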
## 🧪 Verification / Duplicates

- Verified (Goodfellow DL ch5, Lin 2017 Focal, SimCLR, Milletari V-Net Dice). Trust level A.
- Canonical for [[Loss-Functions-Foundations|Loss Functions]] (redirect).

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: strengthened canonical status, added segmentation/metric sections |