---
id: [[P-Reinforce]]-AI-COT
category: "10_Wiki/💡 Topics/AI"
confidence_score: 0.99
tags: [LLM, Chain-of-Thought, CoT, Inference, [[Search]]]
last_reinforced: 2026-04-20
---

# Chain-of-Thought (CoT)

## 📌 One-Line Insight (The Karpathy Summary)

> Simply telling a large language model to "think it through" leads it to break a problem into steps, which markedly raises the chance of reaching the correct answer on multi-step reasoning tasks.

## 📖 Structured Knowledge (Synthesized Content)

- **Step-by-Step [[Reasoning]]**:
  - Instead of answering the question directly, the model is prompted to first write out its intermediate steps (rationales) as text, so that its own earlier output becomes the basis for each subsequent step of reasoning.
- **Zero-shot CoT**:
  - Appending the phrase "Let's think step by step" to the end of the prompt, with no worked examples, is enough to substantially improve performance on commonsense reasoning and math problems.
- **Self-Consistency**:
  - Generate several independent CoT paths for the same question and select the conclusion that the largest number of paths arrive at as the final answer.

## ⚠️ Contradictions and Updates (RL Update)

- CoT is not always beneficial. On simple fact-lookup questions, the extra generated text can itself introduce errors (hallucination). Recent work extends the idea with `Tree-of-Thoughts (ToT)`, which searches over branching reasoning paths, and with models such as `OpenAI o1`, which are trained with reinforcement learning to find better reasoning paths internally.

## 🔗 Knowledge Links (Graph)

- Related: [[Best-of-N-Sampling]], [[Automated-Reasoning]]
- Foundation: [[Information Theory]]
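
The Zero-shot CoT and Self-Consistency entries in this note can be sketched together in a few lines of Python. This is a minimal illustration, not a reference implementation: `build_zero_shot_cot_prompt`, `extract_final_answer`, `self_consistent_answer`, and `make_fake_sampler` are hypothetical names introduced here, the "Q: / A:" prompt layout is an assumption, and the fake sampler stands in for a real model call made with a non-zero sampling temperature. Treating the last number in a completion as the final answer is also a simplifying assumption that only suits numeric questions.

```python
import itertools
import re
from collections import Counter
from typing import Callable, List, Optional

# Trigger phrase quoted in the note for zero-shot CoT.
COT_TRIGGER = "Let's think step by step."


def build_zero_shot_cot_prompt(question: str) -> str:
    """Append the trigger phrase to the end of the prompt (layout is an assumption)."""
    return f"Q: {question.strip()}\nA: {COT_TRIGGER}"


def extract_final_answer(completion: str) -> Optional[str]:
    """Assumption: the last number in the completion is the model's final answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return numbers[-1] if numbers else None


def self_consistent_answer(
    question: str,
    sample_fn: Callable[[str], str],
    n_samples: int = 5,
) -> Optional[str]:
    """Sample several reasoning paths and return the most common final answer.

    sample_fn is a hypothetical callable mapping a prompt to one sampled completion.
    Ties are broken in favour of the answer that appeared first.
    """
    prompt = build_zero_shot_cot_prompt(question)
    answers: List[str] = []
    for _ in range(n_samples):
        answer = extract_final_answer(sample_fn(prompt))
        if answer is not None:  # skip paths with no parseable answer
            answers.append(answer)
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


def make_fake_sampler(completions: List[str]) -> Callable[[str], str]:
    """Stand-in for a model: cycles through canned completions, ignoring the prompt."""
    cycle = itertools.cycle(completions)
    return lambda prompt: next(cycle)


# Three canned reasoning paths; two agree, one makes an arithmetic slip.
canned_paths = [
    "There are 3 boxes with 4 apples each, so 3 * 4 = 12. The answer is 12.",
    "4 + 4 + 4 = 12. The answer is 12.",
    "3 + 4 = 7. The answer is 7.",
]
voted = self_consistent_answer(
    "How many apples are in 3 boxes of 4 apples?",
    make_fake_sampler(canned_paths),
    n_samples=3,
)
print(voted)
```

The vote is taken over final answers only, not over the rationales, so two paths that reason differently but land on the same number still count as agreeing. That is what lets the majority outvote a single path containing a slip.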