2nd/10_Wiki/Topics/AI_and_ML/Loss-Functions-Foundations.md
Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization
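Large-scale cleanup of 10_Wiki/Topics:
- Removed 227 error-capture and unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirect)
- Normalized link names: fixed broken links, pointed redirects directly at their targets, and mapped concepts (~2,400 cases)
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections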

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:52:15 +09:00


id: wiki-2026-0508-loss-functions-foundations
title: Loss Functions Foundations
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Loss Functions, Cost Functions, Objective Functions, Loss-Functions
duplicate_of: none
source_trust_level: A
confidence_score: 0.95
verification_status: applied
tags: loss, objective, training, mse, cross-entropy, focal, contrastive, dice
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: PyTorch

Loss Functions Foundations

One-line summary

"The task determines the loss." Regression → MSE/MAE/Huber, Classification → CE/Focal, Metric learning → Contrastive/Triplet, Segmentation → Dice/IoU.

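Core concepts

Regression

  • MSE (L2): ½(y-ŷ)². Clean gradient, sensitive to outliers.
  • MAE (L1): |y-ŷ|. Robust to outliers, not differentiable at 0.
  • Huber: ½e² when |e| ≤ δ, δ(|e| - ½δ) otherwise (MSE-like near zero, MAE-like in the tails). Default δ=1.
  • Log-cosh: log(cosh(e)). A smooth version of Huber, friendly to automatic differentiation.
  • Quantile (pinball): max(τe, (τ-1)e) with e = y-ŷ. Median (τ=0.5) and prediction-interval estimation.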
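
The regression losses above can be compared on a small example with one outlier. This is a minimal sketch: `log_cosh_loss` is a hypothetical helper (PyTorch does not ship a log-cosh loss in `torch.nn.functional`), and the tensors are made-up illustration data.

```python
import math

import torch
import torch.nn.functional as F


def log_cosh_loss(pred: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Mean log(cosh(e)), rewritten as |e| + softplus(-2|e|) - log 2.

    The rewrite avoids overflow in cosh for large errors.
    """
    e = (pred - y).abs()
    return (e + F.softplus(-2.0 * e) - math.log(2.0)).mean()


pred = torch.tensor([0.0, 0.0, 0.0, 0.0])
y = torch.tensor([0.1, -0.2, 0.3, 10.0])  # the last target is an outlier

losses = {
    "mse": F.mse_loss(pred, y).item(),      # dominated by the outlier
    "mae": F.l1_loss(pred, y).item(),
    "huber": F.huber_loss(pred, y, delta=1.0).item(),
    "log_cosh": log_cosh_loss(pred, y).item(),
}
```

On this data the single outlier drives MSE roughly an order of magnitude above the other three, which grow only linearly in the large error.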

Classification

  • BCE: -[y log p + (1-y) log(1-p)]. Binary and multi-label.
  • CE (softmax): -Σ y_k log p_k. Multi-class.
  • Focal (Lin 2017): -α_t(1-p_t)^γ log p_t, where p_t is the predicted probability of the true class. Down-weights easy examples; default γ=2.
  • Label smoothing: y → y(1-ε) + ε/K. Prevents overconfidence.
  • Hinge: max(0, 1-y·ŷ). SVM. y∈{-1,+1}.
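
The label-smoothing rule y → y(1-ε) + ε/K can be checked against PyTorch's built-in `label_smoothing` argument. A minimal sketch, assuming K=5 classes and ε=0.1; the logits and labels are random illustration data.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
K, eps = 5, 0.1
logits = torch.randn(4, K)
y_int = torch.tensor([0, 2, 4, 1])

# Smoothed targets: apply y(1-eps) + eps/K to the one-hot vector
one_hot = F.one_hot(y_int, num_classes=K).float()
soft = one_hot * (1.0 - eps) + eps / K

# Cross-entropy against the soft targets, written out from the definition
manual = -(soft * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

# Built-in equivalent
builtin = F.cross_entropy(logits, y_int, label_smoothing=eps)
```

Both values should agree up to floating-point error, so the built-in argument is usually the simpler choice.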

Metric Learning

  • Contrastive (Hadsell 2006): pairwise. y·d² + (1-y)·max(0, m-d)², with y=1 for a similar pair and y=0 for a dissimilar pair.
  • Triplet: max(0, d(a,p) - d(a,n) + margin).
  • InfoNCE / NT-Xent (SimCLR): -log[exp(sim⁺/τ) / Σ exp(sim/τ)].
  • Cosine embedding: 1 - cos(a,b) for similar pairs.
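
The pairwise contrastive formula above has no single-call equivalent in `torch.nn.functional`, so a short sketch may help. `contrastive_loss` is a hypothetical helper that follows the label convention used in this note (y=1 similar, y=0 dissimilar); the example pairs are made up.

```python
import torch
import torch.nn.functional as F


def contrastive_loss(a, b, y, margin=1.0):
    """Pairwise contrastive loss: y*d^2 + (1-y)*max(0, margin-d)^2.

    a, b: (B, D) embeddings; y: (B,) floats, 1 = similar, 0 = dissimilar.
    """
    d = F.pairwise_distance(a, b)               # Euclidean distance per pair
    pull = y * d.pow(2)                         # similar pairs: shrink distance
    push = (1 - y) * F.relu(margin - d).pow(2)  # dissimilar: push past margin
    return (pull + push).mean()


a = torch.zeros(2, 2)
b = torch.tensor([[3.0, 4.0], [0.3, 0.4]])  # distances 5.0 and 0.5
y = torch.tensor([1.0, 0.0])
loss = contrastive_loss(a, b, y)            # (25 + 0.25) / 2
```

A dissimilar pair that is already farther apart than the margin contributes zero, so only "hard" negatives produce gradient.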

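Segmentation

  • Dice: 1 - 2|A∩B|/(|A|+|B|). Robust to class imbalance.
  • IoU/Jaccard: 1 - |A∩B|/|A∪B|.
  • Tversky: weights false positives and false negatives separately; reduces to Dice when both weights are 0.5.
  • Boundary loss: weighted by a distance transform.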
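
A soft Tversky loss can be sketched in the same style as a Dice loss. `tversky_loss` is a hypothetical helper; the parameter names `alpha` (false-positive weight) and `beta` (false-negative weight) and the (B, C, H, W) layout are assumptions for illustration.

```python
import torch


def tversky_loss(logits, y, alpha=0.5, beta=0.5, eps=1e-6):
    """Soft Tversky loss per (batch, channel).

    logits, y: (B, C, H, W); y holds 0/1 masks as floats.
    alpha weights false positives, beta weights false negatives.
    """
    p = torch.sigmoid(logits)
    dims = (2, 3)
    tp = (p * y).sum(dim=dims)
    fp = (p * (1 - y)).sum(dim=dims)
    fn = ((1 - p) * y).sum(dim=dims)
    return 1 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)


# A 4x4 object of which the prediction only finds the top half
y = torch.zeros(1, 1, 8, 8)
y[..., :4, :4] = 1.0
logits = torch.full((1, 1, 8, 8), -20.0)
logits[..., :2, :4] = 20.0

balanced = tversky_loss(logits, y).mean()                         # Dice-like
fn_heavy = tversky_loss(logits, y, alpha=0.3, beta=0.7).mean()    # punishes misses
```

Raising `beta` above `alpha` penalizes missed foreground more than false alarms, which is the usual motivation for preferring Tversky on small objects.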

💻 Patterns

Regression losses

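import torch
import torch.nn.functional as F

# pred and y are float tensors of the same shape
mse = F.mse_loss(pred, y)                       # mean (y-ŷ)², without the ½ factor
mae = F.l1_loss(pred, y)
huber = F.huber_loss(pred, y, delta=1.0)        # same as smooth_l1_loss at beta=1

# Quantile (pinball) loss; tau=0.5 gives half the MAE
def quantile_loss(pred, y, tau=0.5):
    e = y - pred
    return torch.maximum(tau * e, (tau - 1) * e).mean()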
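
To see why the quantile loss targets a quantile, one can search for the constant prediction that minimizes it. A minimal sketch on made-up data (targets 1 to 100); `best_constant` is a hypothetical helper and the grid search stands in for gradient-based training.

```python
import torch


def quantile_loss(pred, y, tau=0.5):
    e = y - pred
    return torch.maximum(tau * e, (tau - 1) * e).mean()


y = torch.arange(1.0, 101.0)           # targets 1, 2, ..., 100
candidates = torch.arange(1.0, 101.0)  # constant predictions to try


def best_constant(tau):
    """Return the candidate constant with the lowest quantile loss."""
    losses = torch.stack([quantile_loss(c, y, tau) for c in candidates])
    return candidates[losses.argmin()].item()


median_fit = best_constant(0.5)  # lands in the middle of the data
upper_fit = best_constant(0.9)   # lands near the 90th percentile
```

Training two models with τ=0.1 and τ=0.9 in this way gives the lower and upper ends of an 80% prediction interval.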

Classification losses

ce = F.cross_entropy(logits, y_int)               # logits, not probs
bce = F.binary_cross_entropy_with_logits(logits, y_float)
# Label smoothing (built-in)
ce_ls = F.cross_entropy(logits, y_int, label_smoothing=0.1)

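Focal loss (binary)

def focal_bce(logits, y, alpha=0.25, gamma=2.0):
    # y: float tensor of 0/1 targets, same shape as logits
    # Per-element BCE from logits equals -log(pt) and is numerically stable
    ce = F.binary_cross_entropy_with_logits(logits, y, reduction="none")
    pt = torch.exp(-ce)                          # probability of the true class
    alpha_t = alpha * y + (1 - alpha) * (1 - y)  # alpha for positives, 1-alpha for negatives
    return (alpha_t * (1 - pt).pow(gamma) * ce).mean()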

Triplet & InfoNCE

triplet = F.triplet_margin_loss(anchor, pos, neg, margin=1.0)

def info_nce(q, k_pos, k_neg, tau=0.07):
    # q: (B,D), k_pos: (B,D), k_neg: (B,N,D); embeddings assumed L2-normalized
    pos = (q * k_pos).sum(-1, keepdim=True) / tau
    neg = torch.einsum("bd,bnd->bn", q, k_neg) / tau
    logits = torch.cat([pos, neg], dim=1)
    target = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, target)
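
When no explicit negatives are available, the other items in the batch can serve as negatives. The sketch below is a simplified symmetric in-batch InfoNCE, not the exact NT-Xent used in SimCLR (which also counts same-view samples as negatives); `info_nce_in_batch` is a hypothetical name and the inputs are random illustration data.

```python
import torch
import torch.nn.functional as F


def info_nce_in_batch(z1, z2, tau=0.07):
    """z1[i] and z2[i] are two views of item i; other rows act as negatives."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau  # (B, B) cosine similarities over temperature
    target = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    # Average both directions so each view serves as the query once
    return 0.5 * (F.cross_entropy(logits, target) + F.cross_entropy(logits.t(), target))


torch.manual_seed(0)
z = torch.randn(8, 16)
aligned = info_nce_in_batch(z, z + 0.01 * torch.randn(8, 16))
shuffled = info_nce_in_batch(z, z.roll(1, dims=0))  # positives misaligned on purpose
```

The loss is low when matching rows are the most similar pair in the batch and high when they are not, which is why larger batches (more negatives) tend to help.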

Dice + BCE (segmentation standard)

def dice_loss(logits, y, eps=1e-6):
    # logits, y: (B, C, H, W); y holds 0/1 masks as floats
    p = torch.sigmoid(logits)
    inter = (p * y).sum(dim=(2, 3))
    denom = p.sum(dim=(2, 3)) + y.sum(dim=(2, 3))   # |A| + |B|, not a true union
    return 1 - (2 * inter + eps) / (denom + eps)    # shape (B, C)

def combo_loss(logits, y):
    return 0.5 * F.binary_cross_entropy_with_logits(logits, y) + dice_loss(logits, y).mean()

Class imbalance weighting

weights = torch.tensor([1.0, 5.0, 2.0])           # per-class weights
ce_w = F.cross_entropy(logits, y_int, weight=weights)
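
The class weights do not have to be hand-picked. One common heuristic, sketched here under the assumption that every class appears at least once in the labels, is inverse frequency; the labels and logits are made-up illustration data.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
y_int = torch.tensor([0, 0, 0, 0, 0, 0, 1, 2, 2, 0])  # class 0 dominates
logits = torch.randn(10, 3)

# Inverse-frequency weights, scaled so per-sample weights average to 1.
# A class with zero count would divide by zero here.
counts = torch.bincount(y_int, minlength=3).float()
weights = counts.sum() / (counts * len(counts))

ce_w = F.cross_entropy(logits, y_int, weight=weights)

# With reduction="mean", the weighted loss is divided by the sum of the
# per-sample weights, not by the batch size.
per_sample = F.cross_entropy(logits, y_int, reduction="none")
manual = (weights[y_int] * per_sample).sum() / weights[y_int].sum()
```

The normalization detail matters when comparing weighted and unweighted loss values across runs, since they are not on the same scale in general.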

Decision criteria

| Task | Default | Variant |
|---|---|---|
| Regression (normal) | MSE | outlier→Huber, robust→MAE |
| Binary classification | BCE w/ logits | imbalance→Focal |
| Multi-class | CE w/ label smoothing | imbalance→class weights |
| Multi-label | BCE per-class | |
| Embedding learning | InfoNCE | small batch→Triplet |
| Segmentation | BCE+Dice | small object→Tversky |
| Object detection | Focal + IoU/GIoU | (RetinaNet, YOLO) |

Defaults: classification CE + label smoothing 0.1, regression Huber.

🔗 Graph

🤖 LLM usage

When to use: task→loss mapping, gradient intuition, code template generation. When not to: designing a domain-specific custom loss requires verification (distribution and gradient analysis).

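Anti-patterns

  • Chaining softmax, log, and nll_loss by hand (numerically unstable) → use cross_entropy
  • Calling binary_cross_entropy(sigmoid(...)) for BCE → use binary_cross_entropy_with_logits
  • CE that ignores class imbalance
  • Dice loss on its own (unstable gradients) → mix BCE+Dice
  • Using focal γ when there is no imbalance (hurts performance)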
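
The first two anti-patterns can be reproduced with extreme but finite logits. A minimal sketch; the specific logit values are chosen only to trigger float32 underflow.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[1000.0, 0.0]])
target = torch.tensor([1])

# Anti-pattern: softmax, then log, then nll_loss
probs = torch.softmax(logits, dim=1)          # second entry underflows to 0
naive = F.nll_loss(torch.log(probs), target)  # log(0) = -inf, so the loss is inf

# Preferred: cross_entropy stays in log space (log-sum-exp)
stable = F.cross_entropy(logits, target)

# Binary case: sigmoid(-200) underflows to 0 in float32
z, t = torch.tensor([-200.0]), torch.tensor([1.0])
naive_bce = F.binary_cross_entropy(torch.sigmoid(z), t)  # log is clamped at -100
stable_bce = F.binary_cross_entropy_with_logits(z, t)    # true value, about 200
```

The multi-class version fails loudly with an infinite loss, while the binary version fails silently: the clamped result is finite but wrong, which makes it harder to notice.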

🧪 Verification / Duplicates

  • Verified (Goodfellow DL ch5, Lin 2017 Focal, SimCLR, Milletari V-Net Dice). Trust level A.
  • Canonical for Loss-Functions-Foundations (redirect).

🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: strengthened canonical status, added segmentation/metric sections |