---
id: AI-METRICS-001
category: "10_Wiki/💡 Topics/AI"
confidence_score: 1.0
tags: [ai, machine-learning, performance-metrics, accuracy, f1-score, precision, recall, roc-auc]
last_reinforced: 2026-04-26
---

# Performance Metrics in AI

## 📌 One-Line Insight (The Karpathy Summary)

> "Don't fall for the illusion of raw accuracy; judge a model's real ability with a finely graded ruler that fits the nature of the problem." Performance metrics are statistical measures that quantitatively evaluate a model's predictions in order to steer training and verify business value.

## 📖 Structured Knowledge (Synthesized Content)

- **Extracted pattern:** "Error Analysis and Confusion Matrix": analyze not only what the model got right but also which kinds of errors it made (FP, FN), to find out whether it is biased toward a particular class or is making critical mistakes.
- **Main metric categories:**
  - **Classification:** Accuracy, Precision, Recall, F1-score (the harmonic mean of precision and recall), ROC-AUC.
  - **Regression:** RMSE (root mean squared error), MAE (mean absolute error), R-squared.
  - **NLP:** BLEU, ROUGE (measure the overlap between generated text and reference text).
  - **Ranking:** NDCG, MRR (measure the ranking quality of search results).
- **Significance:** Metrics are the basis for the strategic decision of which measure to manage first, depending on the nature of the service. Cancer diagnosis prioritizes recall, because a missed case is the costly error; spam filtering prioritizes precision, because flagging a legitimate email is the costly error.

## ⚠️ Contradictions and Updates (Contradictions & RL Update)

- **Conflict with past data:** The belief that 99% accuracy is always best has given way to multi-dimensional evaluation that also covers performance under class imbalance and measures of model fairness and explainability.
- **Policy change:** When measuring an agent's task success rate, the Antigravity project uses a custom composite metric (AG-Score) that weights elapsed time, token efficiency, and user satisfaction score in addition to plain success or failure.
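
The point about 99% accuracy under class imbalance can be checked with a small calculation. The sketch below is illustrative only: the dataset (1,000 samples with 10 positives), the two hypothetical models, and the helper names `confusion_counts` and `classification_metrics` are assumptions made for this example, not part of the original note. It derives accuracy, precision, recall, and F1 from the confusion-matrix counts using only the standard library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Derive accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when a model never predicts positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced dataset: 10 positives, 990 negatives.
y_true = [1] * 10 + [0] * 990

# Degenerate model: always predicts the negative class.
always_negative = [0] * 1000

# Hypothetical useful model: finds 8 of 10 positives, raises 20 false alarms.
useful_model = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970

baseline = classification_metrics(y_true, always_negative)
useful = classification_metrics(y_true, useful_model)

for name, scores in (("always_negative", baseline), ("useful_model", useful)):
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

On this made-up data the always-negative model reaches 0.99 accuracy with zero recall, while the second model scores a lower 0.978 accuracy but recovers 80% of the positives. Ranking the two by accuracy alone would pick the model that never detects a single positive case.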
## 🔗 Knowledge Links (Graph)

- [[Imbalanced-Data-Handling|Imbalanced-Data-Handling]], [[Loss-Functions-Foundations|Loss-Functions-Foundations]], Cross-Validation-Techniques, [[Exploratory-Data-Analysis|Exploratory-Data-Analysis]]
- **Raw Source:** 10_Wiki/Topics/AI/Performance-Metrics-in-AI.md