---
id: P-REINFORCE-AI-DOPAMINE
category: "10_Wiki/💡 Topics/AI"
confidence_score: 0.97
tags: [Neuroscience, Psychology, RewardSystem, Dopamine]
last_reinforced: 2026-04-20
---

# [[Dopaminergic Reward System|Dopaminergic Reward System]]

## 📌 One-Line Insight (The Karpathy Summary)

> "Not the hormone of pleasure, but the engine of 'expectation' and 'learning'." Dopamine is released in the brain when an unexpected reward arrives and drives the behavior that preceded it to be repeated. It is the biological origin of reinforcement learning (Reinforcement Learning) systems.

## 📖 Structured Knowledge (Synthesized Content)

- **Reward Prediction Error (RPE)**: Dopamine release peaks not when a reward is simply received, but when the outcome is better than expected. The gap between the expected and the actual outcome is the learning signal.
- **Core Pathways**:
  - **Mesolimbic Pathway**: Associated with motivation and addiction (ventral tegmental area $\to$ nucleus accumbens).
  - **Mesocortical Pathway**: Associated with cognitive control and decision-making (projects to the prefrontal cortex).
- **Function**: Imprints on the brain which behaviors favor survival, and acts as a filter that focuses attention (Attention).

## ⚠️ Contradictions and Updates (RL Update)

- When the dopamine system is dysregulated, a person falls into a "dopamine loop" of chasing constant stimulation (the mechanism behind social media, gambling, and gaming addiction). Modern digital service design exploits this reward system with precision (Dark Patterns). As a result, "dopamine detox", the practice of recognizing this and using "intentional deprivation" to restore receptor sensitivity, is emerging as a mental health topic.

## 🔗 Knowledge Links (Graph)

- Related: [[Reward Prediction Error|Reward Prediction Error]], [[Flow-State|Flow-State]]
- Mechanism: [[Reinforcement Learning (RL)|Reinforcement Learning (RL)]]
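
## 🧮 Worked Sketch (Reward Prediction Error)

The Reward Prediction Error idea in this note can be illustrated with a small simulation. The sketch below assumes a simple delta-rule update (in the style of the Rescorla-Wagner model), where the error is $\delta = r - V$ and the expectation is updated as $V \leftarrow V + \alpha \delta$. The function name `simulate_rpe`, the learning rate `alpha`, and the reward values are illustrative assumptions, not part of the original note, and the model is a deliberate simplification of real dopamine dynamics.

```python
from typing import List, Tuple


def simulate_rpe(
    rewards: List[float],
    alpha: float = 0.5,
    initial_expectation: float = 0.0,
) -> Tuple[float, List[float]]:
    """Run a delta-rule learner over a sequence of rewards.

    Hypothetical helper for illustration only.
    Returns the final expectation and the prediction error at each step.
    """
    expectation = initial_expectation
    errors: List[float] = []
    for reward in rewards:
        # Prediction error: positive when the outcome beats the expectation,
        # negative when it falls short, zero when it is fully predicted.
        delta = reward - expectation
        errors.append(delta)
        # Move the expectation a fraction alpha toward the observed reward.
        expectation += alpha * delta
    return expectation, errors


if __name__ == "__main__":
    # The same reward delivered repeatedly: the error shrinks as it becomes expected.
    final_expectation, errors = simulate_rpe([1.0] * 10)
    print("errors while learning:", [round(e, 3) for e in errors])

    # Omitting a reward that is now expected produces a negative error.
    _, omission = simulate_rpe([0.0], initial_expectation=final_expectation)
    print("error on omitted reward:", round(omission[0], 3))
```

Under these assumptions, the first unexpected reward yields the largest error, repeated identical rewards yield progressively smaller errors, and withholding an expected reward yields a negative error. This matches the note's claim that the gap between expectation and outcome, rather than the reward itself, carries the learning signal.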