[G1-Sync] Manual knowledge update

2026-04-30 22:42:02 +09:00
parent 0bd4f19e38
commit c36c0644a1
4888 changed files with 18470 additions and 18602 deletions
@@ -1,8 +1,8 @@
 ---
-id: P-REINFORCE-AI-BELLMAN
+id: [[P-Reinforce]]-AI-BELLMAN
 category: "10_Wiki/💡 Topics/AI"
 confidence_score: 1.0
-tags: [Bellman Equation, Reinforcement Learning, Dynamic Programming, MDP]
+tags: [[[Bellman Equation]], Reinforcement Learning, Dynamic Programming, MDP]
 last_reinforced: 2026-04-20
 ---

@@ -14,7 +14,7 @@ last_reinforced: 2026-04-20
 ## 📖 Structured Knowledge (Synthesized Content)
 - **Recursive Structure**:
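    - Splits a complex sum over future rewards into a relation between the current step and the very next step, decomposing a huge decision-making problem into computable units.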
-- **State-Value Function (V)**:
+- **[[State]]-Value Function (V)**:
    - Quantifies how good it is, in the long run, to be in a particular state.
 - **Action-Value Function (Q)**:
    - Quantifies how good it is to take a particular action in a particular state; this is the core of Q-Learning.
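
The recursive structure and the two value functions listed above can be illustrated with a minimal value-iteration sketch. Everything in it is an assumption made for illustration and does not come from the note: the two-state deterministic MDP, the state and action names, the rewards, and the discount factor `GAMMA`. It applies the Bellman optimality backup `V(s) = max_a [r + GAMMA * V(s')]` until the values stop changing, then reads Q off the converged V.

```python
# Hypothetical toy MDP for illustration only (not taken from the note).
GAMMA = 0.9  # assumed discount factor

# TRANSITIONS[state][action] = (next_state, reward); deterministic by assumption.
TRANSITIONS = {
    "A": {"stay": ("A", 0.0), "move": ("B", 1.0)},
    "B": {"stay": ("B", 2.0), "move": ("A", 0.0)},
}


def q_value(state, action, values):
    """Action value: immediate reward plus discounted value of the next state."""
    next_state, reward = TRANSITIONS[state][action]
    return reward + GAMMA * values[next_state]


def value_iteration(tolerance=1e-10):
    """Repeat the Bellman optimality backup until the largest change is tiny."""
    values = {state: 0.0 for state in TRANSITIONS}
    while True:
        new_values = {
            state: max(q_value(state, action, values) for action in TRANSITIONS[state])
            for state in TRANSITIONS
        }
        delta = max(abs(new_values[s] - values[s]) for s in TRANSITIONS)
        values = new_values
        if delta < tolerance:
            return values


# State values V, then action values Q derived from the converged V.
V = value_iteration()
Q = {(s, a): q_value(s, a, V) for s in TRANSITIONS for a in TRANSITIONS[s]}
```

In this toy problem, staying in `B` forever earns 2 per step, so `V["B"]` approaches 2 / (1 - 0.9) = 20, and `V["A"]` approaches 1 + 0.9 * 20 = 19. Each state's value is computed only from the immediate reward and the value of the next state, which is the decomposition the Recursive Structure item describes.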