---
id: wiki-2026-0508-out-of-distribution-detection
title: Out-of-Distribution Detection
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [OOD-Detection, OOD, Anomaly-Detection-NN, Novelty-Detection]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [ood, safety, uncertainty, foundation-models, anomaly-detection]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: pytorch
---

# Out-of-Distribution Detection

## One-liner

> **"Rejecting inputs the model has never seen."** OOD detection is a safety layer that decides at inference time whether an input falls outside the training distribution, so that failures do not pass silently. As of 2026 the standard is KNN or Mahalanobis on foundation-model embeddings, or a logit energy score; classical ODIN serves as a baseline.

## Key points

### Score families

1. **Softmax baseline (MSP)**: max softmax probability; a weak baseline.
2. **ODIN** (Liang 2018): temperature scaling + input gradient perturbation.
3. **Energy** (Liu 2020): `-T * logsumexp(logits / T)`; training-free and strong.
4. **Mahalanobis** (Lee 2018): class-conditional Gaussian on penultimate features.
5. **KNN** (Sun 2022): k-NN distance in feature space; simple and robust.
6. **DOSE / ViM** (2022-2024): residual + logit hybrid.
7. **Foundation-model OOD** (CLIP or DINOv2 features + KNN): 2026 SOTA.

### Evaluation

- Metrics: AUROC, FPR@95TPR, AUPR.
- ID = CIFAR-10 / ImageNet; OOD = SVHN, Textures, iNaturalist, Places, OpenOOD bench.
- **Near-OOD** (CIFAR-10 vs CIFAR-100) is the hard case.

### Applications

1. Autonomous driving: rejecting unknown objects.
2. Medical imaging: flagging unsupported modalities.
3. LLMs: detecting jailbreaks and off-distribution prompts.
4. Fraud detection: catching novel attack patterns.
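
### Metric sketch (AUROC, FPR@95TPR)

The evaluation metrics listed above can be sketched in a few lines of NumPy. This is a minimal illustration, not a benchmark implementation: the function names `auroc` and `fpr_at_tpr` are hypothetical, the score convention (higher = more likely OOD) is an assumption, and the pairwise AUROC uses O(N·M) memory, so it only suits small score arrays.

```python
import numpy as np

def auroc(scores_id, scores_ood):
    """Probability that a random OOD sample scores above a random ID sample.

    Assumes higher score = more likely OOD; ties count as 0.5.
    """
    s_id = np.asarray(scores_id, dtype=float)
    s_ood = np.asarray(scores_ood, dtype=float)
    greater = (s_ood[:, None] > s_id[None, :]).mean()
    ties = (s_ood[:, None] == s_id[None, :]).mean()
    return float(greater + 0.5 * ties)

def fpr_at_tpr(scores_id, scores_ood, tpr=0.95):
    """Fraction of OOD samples accepted as ID when `tpr` of ID samples are accepted."""
    # Accept as ID when score <= threshold, so the threshold is the
    # tpr-quantile of the ID scores.
    threshold = np.quantile(np.asarray(scores_id, dtype=float), tpr)
    return float((np.asarray(scores_ood, dtype=float) <= threshold).mean())
```

Both functions take plain arrays of per-sample scores, so any scoring function that follows the same higher-is-OOD convention can be plugged in. Reporting them separately on near-OOD and far-OOD sets keeps the hard case visible.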
## 💻 Patterns

All scores below follow one convention: higher = more likely OOD.

### Energy score (Liu 2020)

```python
import torch, torch.nn.functional as F

@torch.no_grad()
def energy_score(model, x, T=1.0):
    logits = model(x)
    # E(x) = -T * logsumexp(logits / T); higher energy = more likely OOD
    return -T * torch.logsumexp(logits / T, dim=-1)
```

### MSP baseline

```python
import torch, torch.nn.functional as F

@torch.no_grad()
def msp(model, x):
    # Negated max softmax probability, so higher = more likely OOD
    return -F.softmax(model(x), dim=-1).max(-1).values
```

### Mahalanobis on features

```python
import torch

@torch.no_grad()
def fit_mahalanobis(features, labels, num_classes):
    # features: [N, D] penultimate features; labels: [N] integer (long) class ids
    means = []
    for c in range(num_classes):
        means.append(features[labels == c].mean(0))
    means = torch.stack(means)                    # [C, D]
    centered = features - means[labels]
    cov = centered.T @ centered / len(features)   # covariance shared across classes
    inv = torch.linalg.pinv(cov)
    return means, inv

def maha_score(f, means, inv):
    diffs = f.unsqueeze(1) - means  # [N, C, D]
    d2 = torch.einsum("ncd,de,nce->nc", diffs, inv, diffs)
    return d2.min(-1).values  # squared distance to the nearest class mean
```

### KNN OOD (Sun 2022)

```python
import torch, torch.nn.functional as F

class KNNOOD:
    def __init__(self, k=50):
        self.k = k

    def fit(self, train_feats):
        # The bank must hold at least k feature vectors
        self.bank = F.normalize(train_feats, dim=-1)

    def score(self, feats):
        f = F.normalize(feats, dim=-1)
        sim = f @ self.bank.T  # cosine similarity
        topk = sim.topk(self.k, dim=-1).values
        return -topk[:, -1]  # negative k-th highest similarity = OOD score
```

### ODIN

```python
import torch, torch.nn.functional as F

def odin_score(model, x, T=1000, eps=0.0014):
    x = x.clone().detach().requires_grad_(True)
    logits = model(x) / T
    p = F.softmax(logits, dim=-1).max(-1).values
    p.sum().backward()
    # Perturb in the direction that increases the max softmax probability
    x_adv = x + eps * x.grad.sign()
    with torch.no_grad():
        # Negated so that higher = more likely OOD, matching the other scores
        return -F.softmax(model(x_adv) / T, dim=-1).max(-1).values
```

### Foundation-model OOD (DINOv2 + KNN)

```python
import torch

dino = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14").eval().cuda()

@torch.no_grad()
def feats(x):
    return dino(x.cuda())  # [B, 768]

# Assumes train_loader_id yields (images, labels) batches
train_feats = torch.cat([feats(x) for x, _ in train_loader_id])
knn = KNNOOD(k=50)  # KNNOOD as defined in the KNN section
knn.fit(train_feats)
ood_scores = knn.score(feats(test_batch))
```

### LLM OOD via embedding

```python
from sentence_transformers import SentenceTransformer

emb = SentenceTransformer("BAAI/bge-large-en-v1.5")
# in_dist_prompts: list of known in-distribution prompts (at least k of them)
id_bank = emb.encode(in_dist_prompts, normalize_embeddings=True, convert_to_tensor=True)

def prompt_ood(prompt, k=20):
    # convert_to_tensor=True so that .topk is available on the similarities
    q = emb.encode([prompt], normalize_embeddings=True, convert_to_tensor=True)
    sims = (q @ id_bank.T)[0]
    return -sims.topk(k).values.min()  # negative k-th highest cosine similarity
```

### Threshold calibration (FPR@95TPR)

```python
import numpy as np

def threshold_at_tpr(scores_id, tpr=0.95):
    # Scores are higher = more OOD, so keeping `tpr` of the ID samples at or
    # below the threshold means taking the tpr-quantile of held-out ID scores.
    # Flag an input as OOD when its score exceeds the returned value.
    return np.quantile(scores_id, tpr)
```

## Decision criteria

| Situation | Method |
|---|---|
| Simple, immediate | Energy |
| Best AUROC | KNN on foundation features |
| Access to features only | Mahalanobis |
| CV with strong backbone | DINOv2 + KNN |
| LLM input filter | embedding KNN + threshold |
| Production, low-latency | Energy or MSP |

**Default**: foundation embedding + KNN (k=50).

## 🔗 Graph

- Parent: [[Anomaly-Detection]]
- Variant: [[KNN]]
- Application: [[Active-Learning]]

## 🤖 LLM usage

**When**: high-stakes deployment, jailbreak filter, novel-prompt routing.

**When not**: closed-world benchmarks; when the distribution is fixed, OOD detection is only overhead.

## ❌ Anti-patterns

- **MSP only**: nearly powerless on over-confident networks.
- **Train OOD detector on test OOD set**: leakage, false confidence.
- **Threshold from training scores**: calibrate on a held-out ID validation set instead.
- **Ignore near-OOD**: a common trap where far-OOD AUROC is 99% while near-OOD is 60%.
- **Foundation-model embedding mismatch**: for example, detecting medical OOD with an ImageNet-pretrained backbone.

## 🧪 Verification / duplicates

- Verified (OpenOOD benchmark 2024, Sun 2022 KNN, Liu 2020 Energy).
- Trust level A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: Energy/Maha/KNN + foundation-model OOD |