[G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -1,63 +1,162 @@
 ---
 id: wiki-2026-0508-precision-recall-tradeoff
-title: Precision Recall Tradeoff
+title: Precision-Recall Tradeoff
 category: 10_Wiki/Topics
-status: needs_review
+status: verified
 canonical_id: self
-aliases: [AI-MET-TRAD-001]
+aliases: [PR Tradeoff, Threshold Tuning, F1 Optimization]
 duplicate_of: none
 source_trust_level: A
-confidence_score: 1.0
-tags: [ai, machine-learning, metrics, precision, recall, tradeoff, f1-score, threshold]
+confidence_score: 0.9
+verification_status: applied
+tags: [classification, evaluation, metrics, threshold, imbalanced]
 raw_sources: []
-last_reinforced: 2026-04-26
+last_reinforced: 2026-05-10
 github_commit: pending
-inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
+tech_stack:
+  language: python
+  framework: scikit-learn
 ---

-# Precision-Recall Tradeoff (정밀도-재현율 트레이드오프)
+# Precision-Recall Tradeoff

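-## 📌 One-Line Insight (The Karpathy Summary)
-> "Between the 'caution' of not giving wrong answers and the 'persistence' of not missing right ones, hold the middle threshold that maximizes business value": a strategic decision principle of understanding that raising either precision or recall lowers the other, and choosing the best balance point for the nature of the problem.
+## One-Line Summary
+> **"Raising the classifier threshold typically raises precision and lowers recall; the two metrics generally cannot both be maximized at once."** F1, F-beta, and PR-AUC are scores that combine the two axes. On imbalanced data (medical diagnosis, fraud, anomaly detection), PR-AUC is usually more informative than ROC-AUC.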

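-## 📖 Structured Knowledge (Synthesized Content)
- **Extracted pattern:** "Decision Threshold and Performance Balancing": a complementary pattern in which raising the model's classification threshold marks only confident cases as positive, so precision rises (cautious), while lowering it admits more candidates as positive, so recall rises (persistent).
- **Key considerations:**
-    - **High Precision Priority:** Chosen when false positives (FP) are costly, as in spam filtering.
-    - **High Recall Priority:** Chosen when false negatives (FN) are critical, as in cancer diagnosis or fraud detection.
-    - **F1-Score:** The harmonic mean of the two metrics, a balanced measure that favors neither side.
- **Significance:** Accepts that no model is 100% perfect and enables engineering optimization that minimizes the cost of being wrong within limited resources.
+## Core Concepts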

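-## ⚠️ Contradictions & Updates
- **Conflict with earlier data:** Moving beyond the simplistic view that higher values of both metrics are simply better, the area under the Precision-Recall curve (AUC) is now analyzed to check the model's overall discriminative power from several angles.
- **Policy change:** For agent knowledge extraction, the Antigravity project defaults to thresholds that prioritize precision so that incorrect knowledge is not included, while in exploration mode it raises recall to find more connections.
+### Definitions
+- **Precision** = TP / (TP + FP): the fraction of raised alarms that are real.
+- **Recall** = TP / (TP + FN): the fraction of real positives that are caught (= sensitivity, TPR).
+- **F1** = 2·P·R / (P + R): the harmonic mean of precision and recall.
+- **F-beta** = (1+β²)·P·R / (β²·P + R): β > 1 weights recall more heavily, β < 1 weights precision.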
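
To make the definitions above concrete, here is a minimal sketch in plain Python that evaluates them from confusion-matrix counts. The helper name `precision_recall_fbeta` and the counts (80 TP, 20 FP, 40 FN) are hypothetical, chosen only for illustration.

```python
def precision_recall_fbeta(tp, fp, fn, beta=1.0):
    """Return (precision, recall, F-beta) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = beta**2 * precision + recall
    fbeta = (1 + beta**2) * precision * recall / denom if denom else 0.0
    return precision, recall, fbeta

# Hypothetical counts: 80 true positives, 20 false positives, 40 false negatives
p, r, f1 = precision_recall_fbeta(80, 20, 40)
_, _, f2 = precision_recall_fbeta(80, 20, 40, beta=2.0)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f} F2={f2:.3f}")
```

In this example precision (0.80) is higher than recall (about 0.67), so F2, which weights recall more heavily, comes out lower than F1 (about 0.69 versus 0.73).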

-## 🔗 Knowledge Links (Graph)
- [[Performance-Metrics-in-AI|Performance-Metrics-in-AI]], [[Imbalanced-Data-Handling|Imbalanced-Data-Handling]], [[Loss-Functions-Foundations|Loss-Functions-Foundations]], [[Exploratory-Data-Analysis|Exploratory-Data-Analysis]]
- **Raw Source:** 10_Wiki/Topics/AI/Precision-Recall-Tradeoff.md
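+### Tradeoff mechanism
+- A **threshold τ** is applied to the classifier's output score.
+- τ ↑ → positives are declared more strictly → precision ↑ (typically), recall ↓.
+- τ ↓ → more examples are declared positive → recall ↑, precision ↓ (typically).
+- Sweeping τ traces out the Precision-Recall curve, which acts as the Pareto curve between the two metrics.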
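
The mechanism above can be sketched with a small threshold sweep in plain Python. The toy labels and scores are hypothetical, and returning precision 1.0 when nothing is flagged is a convention assumed here, not something the source specifies.

```python
def precision_recall_at(y_true, scores, tau):
    """Precision and recall when predicting positive for score >= tau."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= tau and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= tau and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < tau and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0  # assumed convention: nothing flagged
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical toy data: 4 positives, 6 negatives
y_true = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
scores = [0.95, 0.85, 0.80, 0.70, 0.55, 0.45, 0.40, 0.30, 0.20, 0.10]

sweep = {tau: precision_recall_at(y_true, scores, tau) for tau in (0.25, 0.50, 0.75, 0.90)}
for tau, (p, r) in sweep.items():
    print(f"tau={tau:.2f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data precision rises from 0.50 to 1.00 while recall falls from 1.00 to 0.25 as τ goes from 0.25 to 0.90. On real data recall is always non-increasing in τ, but precision can fluctuate locally.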

-## 🤖 LLM Usage Hints (How to Use This Knowledge)
+### PR-AUC vs ROC-AUC
+- **ROC**: TPR vs FPR. It is insensitive to class balance, which can be misleading on skewed data.
+- **PR**: P vs R. It focuses on the positive class and is more informative on imbalanced data.
+- On a dataset that is 99% negative, ROC-AUC can be 0.95 while PR-AUC is only 0.3.
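
The gap between the two metrics can be reproduced on synthetic data. The sketch below uses plain Python rather than scikit-learn; the helper names `roc_auc` and `average_precision` and the data (990 negatives, 10 positives scored near 0.95) are hypothetical, and the average-precision helper assumes no tied scores.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Mean of the precision measured at the rank of each positive (assumes no ties)."""
    ranked = sorted(zip(scores, y_true), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Hypothetical data: 990 negatives spread over [0, 1), 10 positives scored near 0.95
neg_scores = [i / 990 for i in range(990)]
pos_scores = [0.95 + 0.0001 * k for k in range(1, 11)]
y_true = [0] * 990 + [1] * 10
scores = neg_scores + pos_scores

roc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
print(f"ROC-AUC={roc:.3f}  PR-AUC(AP)={ap:.3f}")
```

Here every positive outranks roughly 95% of the negatives, so ROC-AUC is roughly 0.95. But about 49 negatives still score higher than the positives, so average precision stays at about 0.10.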

-**When to use this knowledge:**
- *(TODO)*
+### Applications
+1. Medical diagnosis (recall first: a miss is dangerous).
+2. Spam filter (precision first: false alarms are costly).
+3. Fraud detection (cost-sensitive threshold).
+4. Information retrieval (P@k, R@k).
+5. Object detection (mAP is based on per-class PR-AUC).
+6. RAG retrieval evaluation.
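
For the retrieval-style applications in the list above (P@k, R@k, RAG evaluation), the same two quantities are computed over a ranked list instead of a thresholded score. A minimal sketch, with a hypothetical helper name and hypothetical document ids:

```python
def precision_recall_at_k(relevant, ranked, k):
    """Precision@k and recall@k for one query."""
    top_k = ranked[:k]
    hits = sum(1 for doc in top_k if doc in relevant)
    return hits / k, hits / len(relevant)

# Hypothetical query: 3 relevant documents, a ranked list of 6 retrieved ids
relevant = {"d1", "d4", "d7"}
ranked = ["d1", "d2", "d4", "d5", "d6", "d7"]

p_at_3, r_at_3 = precision_recall_at_k(relevant, ranked, 3)
p_at_6, r_at_6 = precision_recall_at_k(relevant, ranked, 6)
print(f"P@3={p_at_3:.2f} R@3={r_at_3:.2f}  P@6={p_at_6:.2f} R@6={r_at_6:.2f}")
```

Increasing k plays the role of lowering the threshold: recall@k can only go up, while precision@k tends to drop as less relevant items enter the list.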

-**When not to use it:**
- *(TODO)*
+## 💻 Patterns

-## 🧪 Validation Status
+### sklearn PR curve + best F1 threshold
+```python
+import numpy as np
+from sklearn.metrics import precision_recall_curve, average_precision_score

- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 auto-normalization. Body needs verification.)*
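+# clf is assumed to be a fitted binary classifier; X_val, y_val a held-out validation split
+probs = clf.predict_proba(X_val)[:, 1]
+p, r, thr = precision_recall_curve(y_val, probs)
+# p and r carry one extra trailing point (P=1, R=0) with no threshold, so skip it here
+f1 = 2 * p[:-1] * r[:-1] / (p[:-1] + r[:-1] + 1e-12)  # epsilon avoids 0/0
+best = f1.argmax()
+print(f"τ={thr[best]:.3f}  P={p[best]:.3f}  R={r[best]:.3f}  F1={f1[best]:.3f}")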
+print(f"PR-AUC = {average_precision_score(y_val, probs):.3f}")
+```

-## 🧬 Duplicate Check
+### F-beta threshold (recall-weighted)
+```python
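+def best_fbeta_threshold(y, probs, beta=2.0):
+    """Return (threshold, score) maximizing F-beta; beta > 1 favors recall."""
+    p, r, thr = precision_recall_curve(y, probs)
+    # Drop the trailing (P=1, R=0) point so p and r align one-to-one with thr
+    p, r = p[:-1], r[:-1]
+    fb = (1 + beta**2) * p * r / (beta**2 * p + r + 1e-12)
+    i = fb.argmax()
+    return thr[i], fb[i]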
+```

- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (auto-normalization)
- **Reason:** Phase 1 normalization: old template and missing fields filled in.
+### Cost-based threshold
+```python
+def cost_threshold(y, probs, cost_fp=1.0, cost_fn=10.0, n_thr=200):
+    """Grid-search the threshold that minimizes total misclassification cost."""
+    y, probs = np.asarray(y), np.asarray(probs)
+    best_t, best_c = 0.0, np.inf
+    for t in np.linspace(0, 1, n_thr):
+        pred = probs >= t
+        fp = (pred & (y == 0)).sum()   # negatives flagged as positive
+        fn = (~pred & (y == 1)).sum()  # positives that were missed
+        c = cost_fp * fp + cost_fn * fn
+        if c < best_c:
+            best_c, best_t = c, t
+    return best_t, best_c
+```
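
For reference, when the scores are calibrated probabilities, the cost-minimizing threshold has a closed form. A short derivation, assuming constant per-error costs $c_{FP}$ and $c_{FN}$ and zero cost for correct predictions (the same assumptions as `cost_threshold` above), with $p$ the predicted probability of the positive class:

```latex
\begin{aligned}
\text{predict positive} \iff{}& \underbrace{(1-p)\,c_{FP}}_{\text{expected cost of flagging}} < \underbrace{p\,c_{FN}}_{\text{expected cost of not flagging}} \\
\iff{}& p > \tau^{*} = \frac{c_{FP}}{c_{FP} + c_{FN}}
\end{aligned}
```

With `cost_fp=1.0` and `cost_fn=10.0` this gives τ* = 1/11 ≈ 0.091. A grid search on well-calibrated validation probabilities would be expected to land near that value; a large gap can be a hint that the probabilities are poorly calibrated.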

-## 🕓 Changelog
+### Plot PR curve
+```python
+import matplotlib.pyplot as plt
+plt.plot(r, p, label=f'AP={average_precision_score(y_val, probs):.3f}')
+plt.xlabel('Recall'); plt.ylabel('Precision'); plt.legend()
+```

-| Date | Change | Handling | Trust |
-|------|-----------|-----------|--------|
-| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |
+### PR vs ROC on imbalanced
+```python
+from sklearn.metrics import roc_auc_score, average_precision_score
+# y_val is imbalanced (about 1% positive)
+print('ROC-AUC :', roc_auc_score(y_val, probs))            # can look optimistic
+print('PR-AUC  :', average_precision_score(y_val, probs))  # more informative for the rare class
+```
+
+### Calibration before threshold tuning
+```python
+from sklearn.calibration import CalibratedClassifierCV
+cal = CalibratedClassifierCV(clf, method='isotonic', cv=5)
+cal.fit(X_train, y_train)
+probs_cal = cal.predict_proba(X_val)[:, 1]
+# Calibrated probabilities make the threshold easier to interpret
+```
+
+### Per-class threshold (multi-label)
+```python
+def tune_per_label(y_true, probs):  # (N, L)
+    L = probs.shape[1]
+    thrs = np.zeros(L)
+    for k in range(L):
+        thrs[k], _ = best_fbeta_threshold(y_true[:, k], probs[:, k], beta=1.0)
+    return thrs
+```
+
+## Decision Criteria
+| Situation | Threshold priority |
+|---|---|
+| Medical screening (misses are dangerous) | High recall (low τ), F2 |
+| Spam / ad blocking (wrongly blocking is costly) | High precision (high τ), F0.5 |
+| Balanced cost | Maximize F1 |
+| Explicit cost ratio available | Cost-based threshold |
+| Imbalanced data, comparing models | Report PR-AUC (ROC-AUC as a supplement) |
+| Multi-label | Tune a threshold per label |
+
+**Default**: maximize F1 on the validation set, after calibration. If the data is imbalanced, report PR-AUC.
+
+## 🔗 Graph
+- Parents: [[Classification]] · [[Evaluation_Metrics]]
+- Variants: [[F1_Score]] · [[F-beta_Score]] · [[PR_AUC]] · [[ROC_AUC]]
+- Applications: [[Imbalanced_Data]] · [[Anomaly_Detection]] · [[Object_Detection_mAP]] · [[Information_Retrieval]]
+- Adjacent: [[Confusion_Matrix]] · [[Threshold_Tuning]] · [[Calibration]] · [[Cost_Sensitive_Learning]]
+
+## 🤖 LLM Usage
+**When to use**: deciding a classifier's deployment threshold, reporting evaluation on imbalanced data, cost-sensitive decisions.
+**When not to use**: in ranking tasks a single threshold is not meaningful; use a top-k metric or nDCG instead.
+
+## ❌ Anti-patterns
+- **Default 0.5 threshold without tuning**: often a poor choice on imbalanced data. Tune the threshold on the validation set.
+- **Tune threshold on test set**: this leaks test information. Tune on the validation set only.
+- **ROC-AUC only on imbalanced**: results can look inflated; report PR-AUC alongside.
+- **Ignore calibration before threshold**: a threshold chosen on uncalibrated probabilities does not transfer well.
+- **F1 maximize when costs are asymmetric**: when a cost ratio is known, use F-beta or an explicit cost function.
+
+## 🧪 Validation / Duplicates
+- Verified (sklearn metrics docs precision_recall_curve, Saito & Rehmsmeier 2015 'PR vs ROC for imbalanced').
+- Trust level A.
+
+## 🕓 Changelog
+| Date | Change |
+|---|---|
+| 2026-05-08 | Phase 1 |
+| 2026-05-10 | Manual cleanup — PR tradeoff math + threshold tuning + PR vs ROC on imbalanced |