chore(wiki): normalize dangling links to canonical titles (768 files / 1,200 links)

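Replace [[wikilinks]] that differ from their target only in spelling (notation variants) with the target document's canonical title, reconnecting 1,200 previously broken links. Only matches on the normalized title/filename are applied; alias matching is excluded because of the risk of over-merging (ambiguity guard). Originals are backed up to _link_reconcile_backup/. Tool: Datacollect/scripts/link_reconcile_apply.mjs

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>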
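
For illustration only, the matching rule described above could look roughly like the sketch below. It is not the code in link_reconcile_apply.mjs: the function names (`normalize`, `build_index`, `reconcile`) are hypothetical, and the normalization rules are assumptions. The diff only confirms the underscore-to-space case; Unicode NFC, hyphen folding, and case folding are guesses. Alias matching is deliberately left out, and a link is rewritten only when exactly one canonical title matches.

```python
import re
import unicodedata

# [[target]], [[target#anchor]], [[target|label]] or [[target#anchor|label]]
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(#[^\[\]|]*)?(\|[^\[\]]*)?\]\]")


def normalize(title: str) -> str:
    """Assumed normalization: NFC, fold '_', '-' and whitespace runs to one space, casefold."""
    text = unicodedata.normalize("NFC", title)
    text = re.sub(r"[_\-\s]+", " ", text).strip()
    return text.casefold()


def build_index(canonical_titles):
    """Map each normalized key to the set of canonical titles that share it."""
    index = {}
    for title in canonical_titles:
        index.setdefault(normalize(title), set()).add(title)
    return index


def reconcile(text: str, index) -> str:
    """Rewrite dangling wikilink targets to their canonical title when unambiguous."""
    exact = {title for titles in index.values() for title in titles}

    def fix(match):
        target = match.group(1)
        anchor = match.group(2) or ""
        label = match.group(3) or ""
        if target in exact:
            return match.group(0)  # already canonical, leave untouched
        candidates = index.get(normalize(target), set())
        if len(candidates) != 1:
            return match.group(0)  # ambiguity guard: no match or several matches
        (canonical,) = candidates
        return f"[[{canonical}{anchor}{label}]]"

    return WIKILINK.sub(fix, text)
```

Called as `reconcile(line, build_index(titles))` on the changed line of the hunk below, with the space-separated title in `titles`, this sketch rewrites the underscore form to the space-separated form and leaves `[[Dopamine]]` alone.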
2026-06-08 12:24:15 +09:00
parent 2ddf30f8e4
commit d8a80f6272
768 changed files with 1085 additions and 1085 deletions
@@ -132,7 +132,7 @@ def simulate_dopamine(trial, cue_time, reward_time, predicted=True):
 - Parent: [[Reinforcement Learning]]
 - Variants: [[TD Learning]] · [[Distributional RL]]
 - Applications: [[Actor-Critic]] · [[RLHF]]
- Adjacent: [[Dopamine]] · [[데이터_사이언스_및_ML_엔지니어링|Bellman Equation]]
+- Adjacent: [[Dopamine]] · [[데이터 사이언스 및 ML 엔지니어링|Bellman Equation]]

 ## 🤖 LLM Usage
 **When**: to understand advantage computation in RLHF/DPO/GRPO and to debug reward models.