[G1-Sync] Manual knowledge update

2026-04-30 22:42:02 +09:00
parent 0bd4f19e38
commit c36c0644a1
4888 changed files with 18470 additions and 18602 deletions
@@ -1,8 +1,8 @@
 ---
-id: P-REINFORCE-AUTO-BIVA-001
+id: [[P-Reinforce]]-AUTO-BIVA-001
 category: "10_Wiki/💡 Topics/AI"
 confidence_score: 1.00
-tags: [auto-reinforced, bias-variance, machine-learning-foundations, overfitting, underfitting, model-performance]
+tags: [auto-reinforced, bias-variance, [[Machine-Learning-Foundations]], [[Overfitting]], underfitting, model-performance]
 last_reinforced: 2026-04-20
 ---

@@ -24,7 +24,7 @@ last_reinforced: 2026-04-20

 ## ⚠️ Contradictions and Updates (Contradictions & RL Update)
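 - **Conflict with earlier data**: The earlier belief was that adding model parameters always increases variance (the U-shaped curve). Modern large-model policy found the 'Double Descent' policy, in which error falls again once the parameter count passes a threshold, extending the limits of the classical statistics policy (RL Update).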
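-- **Policy change (RL Update)**: In the reward function design policy, a 'stability-oriented learning policy' that enforces data augmentation or regularization to reduce model variance is applied as mandatory.
+- **Policy change (RL Update)**: In the reward function design policy, a 'stability-oriented learning policy' that enforces data augmentation or [[Regularization]] to reduce model variance is applied as mandatory.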

 ## 🔗 Knowledge Links (Graph)
 - [[Standardization vs Innovation]], [[stochastic gradient descent]], Foundational Models, Pattern Recognition, [[Stability vs Flexibility]]
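
The note's claim that regularization reduces model variance can be illustrated with a small experiment. The sketch below is not part of the note or of this commit. It assumes a toy setup chosen only for illustration: a degree-9 polynomial fit to 15 noisy samples of a sine curve, with closed-form ridge regression standing in for "regularization". The names `fit_ridge` and `prediction_variance`, and all parameter values, are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form ridge solution: solve (X^T X + lam * I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def prediction_variance(lam, n_trials=200, n_train=15, degree=9, noise=0.3, seed=0):
    """Mean variance of test predictions across independently drawn training sets."""
    rng = np.random.default_rng(seed)
    x_test = np.linspace(-1.0, 1.0, 50)
    X_test = np.vander(x_test, degree + 1, increasing=True)
    preds = np.empty((n_trials, x_test.size))
    for t in range(n_trials):
        # Draw a fresh noisy training set from the same assumed target function.
        x = rng.uniform(-1.0, 1.0, n_train)
        y = np.sin(np.pi * x) + rng.normal(0.0, noise, n_train)
        X = np.vander(x, degree + 1, increasing=True)
        preds[t] = X_test @ fit_ridge(X, y, lam)
    # Variance over training sets at each test point, averaged over test points.
    return preds.var(axis=0).mean()

# Same seed, so both settings see identical training sets.
var_weak = prediction_variance(lam=1e-6)    # almost no regularization
var_strong = prediction_variance(lam=1e-1)  # noticeably stronger penalty
print(f"prediction variance, lam=1e-6: {var_weak:.4f}")
print(f"prediction variance, lam=1e-1: {var_strong:.4f}")
```

In this setup the stronger penalty should give the smaller variance figure. The sketch measures variance only, so it says nothing about the bias that the penalty adds, and it does not reproduce the double descent behaviour described in the note.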