---
id: P-REINFORCE-AUTO-OPCO-001
category: "10_Wiki/💡 Topics/AI"
confidence_score: 0.95
tags: [auto-reinforced, opportunity-cost, economics, decision-making, tradeoffs, resource-allocation]
last_reinforced: 2026-04-20
---

# [[Opportunity-Cost|Opportunity-Cost]]

## 📌 One-Line Insight (The Karpathy Summary)
> "The value of the best option left unchosen: the value of the 'something else' we had to give up to gain one thing, and a cold yardstick of reason that exposes the 'real cost' hidden behind every economic act and decision."

## 📖 Structured Knowledge (Synthesized Content)
Opportunity cost is the value of the most valuable alternative given up when one option is chosen from several.

1. **Formula**: opportunity cost = explicit cost (direct monetary outlay) + implicit cost (forgone potential gain).
2. **Why it matters**:
    * There is no free lunch (Trade-offs): to judge whether the current action is the best one, compare it against the 'possibilities given up', not merely against the cost incurred. (Linked to Judgment)

## ⚠️ Contradictions and Updates (Contradictions & RL Update)
- **Conflict with past data**: Earlier thinking weighed only visible accounting expenditures, whereas the current view treats the invisible 'value of time' and 'value of growth opportunities' as the core of opportunity cost (RL Update).
- **Policy change (RL Update)**: In machine learning's exploration (Exploration) versus exploitation (Exploitation) dilemma, strategies that minimize regret (Regret), the opportunity cost of not trying something new, have become standard in algorithm design. (Linked to Reinforcement Learning (RL))

## 🔗 Knowledge Links (Graph)
- [[Judgment|Judgment]], [[Economic-Analysis|Economic-Analysis]], [[Reinforcement Learning (RL)|Reinforcement Learning (RL)]], [[Decision Theory|Decision Theory]], [[Efficiency|Efficiency]]
- **Modern Tech/Tools**: Cost-benefit analysis, Multi-armed bandit (MAB) algorithms, Portfolio optimization.

---
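
The formula in this note (explicit cost plus implicit cost) and the regret framing from the exploration-exploitation dilemma can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a canonical implementation: the `Option` class, the `opportunity_cost` and `cumulative_regret` helpers, and every number are hypothetical. The implicit cost is assumed to be the net value of the best forgone alternative, and regret is assumed to be measured against always pulling the arm with the highest mean reward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One alternative in a decision (hypothetical; figures are illustrative)."""
    name: str
    payoff: float         # benefit the option would deliver
    explicit_cost: float  # direct monetary outlay needed to pursue it

    @property
    def net_value(self) -> float:
        return self.payoff - self.explicit_cost


def opportunity_cost(chosen: Option, alternatives: list[Option]) -> float:
    """Explicit cost of the chosen option plus the implicit cost.

    Assumption: the implicit cost is the net value of the best forgone
    alternative, or 0.0 when there are no alternatives.
    """
    implicit_cost = max((alt.net_value for alt in alternatives), default=0.0)
    return chosen.explicit_cost + implicit_cost


def cumulative_regret(arm_means: list[float], pulls: list[int]) -> float:
    """Expected reward given up by not always pulling the best arm.

    arm_means[i] is the true mean reward of arm i; pulls is the sequence
    of arm indices that were actually chosen.
    """
    best = max(arm_means)
    return sum(best - arm_means[arm] for arm in pulls)


if __name__ == "__main__":
    study = Option("full-time study", payoff=0.0, explicit_cost=20_000.0)
    job = Option("keep working", payoff=30_000.0, explicit_cost=0.0)
    print(opportunity_cost(study, [job]))  # 50000.0

    # Two exploratory pulls on weaker arms, then two pulls on the best arm.
    print(round(cumulative_regret([0.2, 0.5, 0.8], [0, 1, 2, 2]), 2))
```

In the first example the 20,000 outlay is the explicit cost and the 30,000 of forgone wages is the implicit cost. In the second, every pull of a weaker arm adds its gap to the best arm, so regret accumulates only while the strategy is not choosing the best option.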