---
id: ALGO-MCTS-001
category: "10_Wiki/💡 Topics/AI"
confidence_score: 1.0
tags: [algorithm, ai, search, mcts, alphago, reinforcement-learning, game-theory]
last_reinforced: 2026-04-26
---

# Monte Carlo Tree Search (MCTS)

## 📌 One-Line Insight (The Karpathy Summary)
> "Instead of combing through every possibility, play promising paths out at random to the very end, then trace back to the best choice." An intelligent search algorithm that selects promising paths in a vast search space and estimates their value through random simulation in order to make the best available decision.

## 📖 Structured Knowledge (Synthesized Content)
- **Extracted pattern:** "Exploitation vs Exploration in Search": a search pattern that grows the tree while using the UCB1 formula to balance moves already shown to be good (exploitation) against possibilities not yet tried (exploration).
- **Four-phase process:**
  - **Selection:** Starting at the root, descend by repeatedly following the child node with the highest UCB1 value until reaching a node that still has unexplored children or is terminal.
  - **Expansion:** Add a new, previously unexplored child node to the tree.
  - **Simulation (Rollout):** From that node, play randomly to the end of the game and record the outcome (win/loss, i.e. the reward).
  - **Backpropagation:** Propagate the simulation result to every ancestor node on the path, updating their visit counts and value estimates.
- **Significance:** MCTS makes it possible to find strong, near-optimal moves in complex games without a hand-crafted heuristic evaluation function, which made it a core technique in modern board-game AI (including AlphaGo) and in robot path planning.

## ⚠️ Contradictions & Updates (Contradictions & RL Update)
- **Conflict with earlier data:** Early MCTS relied on purely random simulations. The standard approach is now "Deep MCTS", which combines the search with neural networks (Policy/Value Network) to dramatically improve both the accuracy of simulations and the efficiency of the search.
- **Policy change:** In complex problem-solving scenarios for the Antigravity agent (e.g., searching for a multi-step code refactoring path), MCTS-based decision simulation is used to evaluate the potential risk and benefit of each step.

## 🔗 Knowledge Links (Graph)
- [[Markov-Decision-Process-MDP]], [[Reinforcement-Learning]], [[Monte-Carlo-Integration]], Search-Algorithms, [[Game-Theory]]
- **Raw Source:** 10_Wiki/Topics/AI/Monte-Carlo-Tree-Search-MCTS.md
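
## 🧪 Example Sketch (UCB1 and the Four Phases)
The UCB1 balance and the Selection, Expansion, Simulation, and Backpropagation phases described in this note can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from the note: the toy game (a Nim variant where players alternately take 1 to 3 stones and whoever takes the last stone wins), the names `Node`, `ucb1`, `rollout`, and `mcts`, the exploration constant `c = sqrt(2)`, and the choice to return the most-visited root move. It uses purely random rollouts, not the neural-network-guided variant.

```python
import math
import random


class Node:
    """One game state in the search tree (hypothetical toy Nim game)."""

    def __init__(self, stones, parent=None, move=None):
        self.stones = stones    # stones remaining in this state
        self.parent = parent
        self.move = move        # move that led from parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0         # wins for the player who moved INTO this node


def ucb1(child, parent_visits, c=math.sqrt(2)):
    """Exploitation term (mean reward) plus exploration bonus."""
    if child.visits == 0:
        return math.inf
    mean = child.wins / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def rollout(stones, rng):
    """Play random moves; return True if the player to move first wins."""
    mover = 0  # 0 = the player to move at the start of the rollout
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return mover == 0  # whoever takes the last stone wins
        mover ^= 1
    return False  # no stones left: the previous player already won


def mcts(root_stones, iterations=500, seed=0):
    """Return a move for a position with root_stones >= 1."""
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: follow the highest-UCB1 child while fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: add one unexplored child, if any remain.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        to_move_wins = rollout(node.stones, rng)
        # 4. Backpropagation: flip the reward at each level, since the
        #    players alternate and each node stores the mover's wins.
        reward = 0.0 if to_move_wins else 1.0
        while node is not None:
            node.visits += 1
            node.wins += reward
            reward = 1.0 - reward
            node = node.parent
    # Assumed final policy: pick the most-visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move
```

A call such as `mcts(5, iterations=2000)` returns one of the legal moves 1, 2, or 3. In this toy game the theoretically best play leaves the opponent a multiple of 4 stones, so with enough iterations the search would be expected to favor such moves, though the sketch makes no guarantee for any particular budget or seed.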