---
id: P-REINFORCE-AUTO-SEST-001
category: "10_Wiki/💡 Topics/AI"
confidence_score: 0.95
tags: [auto-reinforced, search-strategy, focus-search, heuristic, algorithm, exploration-exploitation]
last_reinforced: 2026-04-20
---

# [[Search-Strategy|Search-Strategy]]

## 📌 One-Line Insight (The Karpathy Summary)
> "The tactics of exploration: an intelligent course of action that, instead of charging in blindly, hits the target by keeping a perfect balance between digging deeper into a spot already known to be good (Exploitation) and setting out to find somewhere new (Exploration)."

## 📖 Structured Knowledge (Synthesized Content)
A search strategy (Search-Strategy) is the concrete method chosen to reach a goal as effectively as possible within a given search space.

1. **Representative strategy tools**:
    * **BFS (Breadth-First)**: Sweeps wide and shallow, level by level (reliability). It reaches a shallowest solution first, at the cost of holding a large frontier in memory.
    * **DFS (Depth-First)**: Digs a single well as deep as it goes before trying another (speed). It needs little memory and can reach deep solutions quickly, but the first solution it finds is not necessarily the shortest.
    * **Heuristic Search**: Uses rule-of-thumb hints to look first where the answer is most likely to be (the A* algorithm, for example). (Connects to Optimization.)
2. **Core dilemma (Exploration vs Exploitation)**:
    * Go looking for new possibilities, or refine the best point verified so far? (The perennial homework of Reinforcement Learning (RL).)
3. **Why does it matter?**:
    * A good strategy can cut a search that would take tens of thousands of years down to a few minutes, and it determines the sweet spot between a system's "response speed" and its "accuracy".

## ⚠️ Contradictions and Updates (Contradictions & RL Update)
- **Conflict with past data**: Search policies used to be fixed rules (static strategy). The mainstream today is the "adaptive search policy", which changes strategy in real time according to what the search has found so far (RL Update).
- **Policy change (RL Update)**: The same applies to building this knowledge base. Deciding, based on the CEO's feedback, whether to dig deeper into a particular topic (Deep-dive) or to fill out the overall number of entries first (Breadth) is itself a high-level search strategy.
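
The exploration vs. exploitation dilemma and the adaptive policy described above can be sketched with an epsilon-greedy bandit whose exploration rate decays over time. This is a minimal illustration, not a method taken from this note: the function name `epsilon_greedy`, the caller-supplied `reward_fn`, and the decay parameters (`eps_start`, `eps_decay`, `eps_min`) are all assumptions made for the example.

```python
import random


def epsilon_greedy(reward_fn, n_arms, steps, eps_start=1.0, eps_decay=0.99,
                   eps_min=0.01, seed=None):
    """Return (estimates, counts) after `steps` pulls.

    reward_fn(arm) -> float is a hypothetical callback supplied by the caller.
    """
    rng = random.Random(seed)
    estimates = [0.0] * n_arms   # running mean reward per arm
    counts = [0] * n_arms        # number of pulls per arm

    for t in range(steps):
        # Adaptive part: explore a lot early, then shift toward exploitation.
        epsilon = max(eps_min, eps_start * eps_decay ** t)

        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                          # explore
        else:
            arm = max(range(n_arms), key=estimates.__getitem__)  # exploit

        reward = reward_fn(arm)
        counts[arm] += 1
        # Incremental mean update: no need to store past rewards.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    payoffs = [0.1, 0.9, 0.5]  # illustrative, deterministic rewards
    est, cnt = epsilon_greedy(lambda a: payoffs[a], n_arms=3, steps=500, seed=0)
    print(est, cnt)
```

Setting `eps_start=0.0` and `eps_min=0.0` turns this into a static, purely exploiting policy: with the payoffs above it locks onto the first arm it tries and never discovers the better one, which is the failure mode the adaptive schedule is meant to avoid.
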
## 🔗 Knowledge Connections (Graph)
- [[Optimization|Optimization]], [[Reinforcement Learning (RL)|Reinforcement Learning (RL)]], [[Efficiency|Efficiency]], [[Search-Space|Search-Space]], [[Mastery|Mastery]]
- **Modern Tech/Tools**: A* algorithm, Greedy search, Beam search, Monte Carlo Tree Search.

---
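
Of the tools listed above, A* is the one this note names as its example of heuristic search. The sketch below shows the idea on a small grid, under assumptions that are not from the note: the grid encoding (`#` for walls), unit-cost 4-connected moves, the Manhattan-distance heuristic, and the function name `a_star` are all chosen for illustration.

```python
import heapq


def a_star(grid, start, goal):
    """Return (cost, path) for a cheapest 4-connected path, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance never overestimates with unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]   # heap entries: (f = g + h, g, cell)
    best_g = {start: 0}                 # cheapest known cost to each cell
    parent = {start: None}              # back-pointers for path reconstruction

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return g, path[::-1]
        if g > best_g[cell]:
            continue  # stale entry: a cheaper route to this cell was found later
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    parent[nxt] = cell
                    heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
    return None


if __name__ == "__main__":
    demo = ["S....",
            ".###.",
            ".....",
            ".###.",
            "....G"]
    print(a_star(demo, (0, 0), (4, 4)))
```

Replacing `h` with a function that always returns 0 removes the hint and leaves a uniform-cost search, which on unit-cost moves visits cells in the same wide-and-shallow order as BFS.
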