[G1-Sync] Manual knowledge update

2026-04-30 22:42:02 +09:00
parent 0bd4f19e38
commit c36c0644a1
4888 changed files with 18470 additions and 18602 deletions
@@ -2,11 +2,11 @@
 id: RL-EX-BAL-001
 category: "10_Wiki/💡 Topics/AI"
 confidence_score: 1.0
-tags: [reinforcement-learning, ai, decision-making, exploration, exploitation]
+tags: [[[Reinforcement-Learning]], ai, decision-making, exploration, exploitation]
 last_reinforced: 2026-04-26
 ---

-# Exploration vs Exploitation (Balancing Exploration and Exploitation)
+# [[Exploration vs Exploitation]] (Balancing Exploration and Exploitation)

 ## 📌 One-Line Insight (The Karpathy Summary)
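 > "Find the optimal bet between the safe returns of the present and the uncertain possibilities of the future." This is the core dilemma of reinforcement learning: the trade-off between repeating the best known action to collect reward (exploitation) and trying new actions in search of a better one (exploration).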