--- id: P-REINFORCE-AI-MPC category: "10_Wiki/๐Ÿ’ก Topics/AI" confidence_score: 0.98 tags: [Engineering, ControlTheory, MPC, Predictive] last_reinforced: 2026-04-20 --- # [[Model-Predictive-Control (MPC)|Model-Predictive-Control (MPC)]] (๋ชจ๋ธ ์˜ˆ์ธก ์ œ์–ด) ## ๐Ÿ“Œ ํ•œ ์ค„ ํ†ต์ฐฐ (The Karpathy Summary) > "๋ช‡ ์ˆ˜ ์•ž์„ ๋‚ด๋‹ค๋ณด๊ณ  ํ˜„์žฌ์˜ ํ•ธ๋“ค์„ ๊บพ๋Š” ์ง€๋Šฅํ˜• ์กฐํƒ€์ˆ˜." ์‹œ์Šคํ…œ์˜ ์ˆ˜ํ•™์  ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•ด ๋ฏธ๋ž˜์˜ ๊ฑฐ๋™์„ ์˜ˆ์ธกํ•˜๊ณ , ์ˆ˜์ฒœ ๋ฒˆ์˜ ๊ฐ€๊ณต ์‹œ๋ฎฌ๋ ˆ์ด์…˜์„ ํ†ตํ•ด ํ˜„์žฌ ์‹œ์ ์—์„œ ์ตœ์„ ์˜ ์ œ์–ด ์ž…๋ ฅ์„ ๊ฒฐ์ •ํ•˜๋Š” ๊ณ ๋„์˜ ์ œ์–ด ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด๋‹ค. ## ๐Ÿ“– ๊ตฌ์กฐํ™”๋œ ์ง€์‹ (Synthesized Content) - **Mechanism**: 1. ํ˜„์žฌ ์ƒํƒœ๋ฅผ ์ธก์ •ํ•จ. 2. ์ผ์ • ๊ธฐ๊ฐ„(Prediction Horizon) ๋™์•ˆ ์‹œ์Šคํ…œ์ด ์–ด๋–ป๊ฒŒ ์›€์ง์ผ์ง€ ๋ฏธ๋ž˜๋ฅผ ์˜ˆ์ธกํ•จ. 3. ์ œ์•ฝ ์กฐ๊ฑด(์˜ˆ: ์†๋„ 100km ์ œํ•œ)์„ ๋งŒ์กฑํ•˜๋ฉด์„œ ๊ฐ€์žฅ ๋ชฉํ‘œ์— ๊ทผ์ ‘ํ•˜๋Š” ์ž…๋ ฅ ์‹œํ€€์Šค๋ฅผ ๊ณ„์‚ฐ. 4. ๊ณ„์‚ฐ๋œ ์—ฌ๋Ÿฌ ์ˆ˜ ์ค‘ **์ฒซ ๋ฒˆ์งธ ๋ช…๋ น๋งŒ ์‹คํ–‰**ํ•˜๊ณ  ๋‹ค์‹œ 1๋ฒˆ์œผ๋กœ ๋Œ์•„๊ฐ (Receding Horizon). - **Strength**: ์—ฌ๋Ÿฌ ๊ฐœ์˜ ์ž…๋ ฅ๊ณผ ์ถœ๋ ฅ์ด ์–ฝํžŒ ๋ณต์žกํ•œ ์‹œ์Šคํ…œ(MIMO)์„ ๋‹ค๋ฃจ๋Š” ๋ฐ ํƒ์›”ํ•˜๋ฉฐ, ์ œ์•ฝ ์กฐ๊ฑด์„ ํ•˜๋“œ์ฝ”๋”ฉ์œผ๋กœ ๋ฐ˜์˜ํ•  ์ˆ˜ ์žˆ๋‹ค. - **Domain**: ์ •์œ  ๊ณต์ •, ์šฐ์ฃผ์„  ๋„ํ‚น, ๊ณ ์„ฑ๋Šฅ ์ž์œจ์ฃผํ–‰ ์ฐจ๋Ÿ‰. ## โš ๏ธ ๋ชจ์ˆœ ๋ฐ ์—…๋ฐ์ดํŠธ (RL Update) - MPC๋Š” ๋งค ์ˆœ๊ฐ„ ์ตœ์ ํ™” ๋ฌธ์ œ๋ฅผ ํ’€์–ด์•ผ ํ•˜๋ฏ€๋กœ ๊ณ„์‚ฐ ์„ฑ๋Šฅ์ด ์—„์ฒญ๋‚˜๊ฒŒ ์†Œ๋ชจ๋œ๋‹ค. ์ตœ๊ทผ์—๋Š” ๊ฐ•ํ™”ํ•™์Šต(RL)์ด MPC์˜ ์—ญํ• ์„ ๋Œ€์‹ ํ•˜๊ฑฐ๋‚˜, ๋ฐ˜๋Œ€๋กœ RL์ด ๊ฐˆ ๋ฐฉํ–ฅ์„ MPC๊ฐ€ ์ œ์•ฝ ์กฐ๊ฑด์œผ๋กœ ๊ฐ€์ด๋“œํ•ด์ฃผ๋Š” ํ•˜์ด๋ธŒ๋ฆฌ๋“œ ์ œ์–ด(Learning-based MPC)๊ฐ€ ๋กœ๋ณดํ‹ฑ์Šค์˜ ์ƒˆ๋กœ์šด ํ‘œ์ค€์ด ๋˜๊ณ  ์žˆ๋‹ค. ## ๐Ÿ”— ์ง€์‹ ์—ฐ๊ฒฐ (Graph) - Related: [[Control-Theory|Control-Theory]] , [[Decision Theory|Decision Theory]] - AI Hybrid: Deep-Reinforcement-Learning-for-Control