---
id: P-REINFORCE-AI-MC
category: "10_Wiki/💡 Topics/AI"
confidence_score: 0.97
tags: [AI, ModelCompression, Optimization, Efficiency]
last_reinforced: 2026-04-20
---

# [[Model-Compression]]

## 📌 One-Line Insight (The Karpathy Summary)

> "A technique that keeps a giant model's muscle, trims only the fat (redundancy), and stuffs what is left into a smartphone."

Model compression is a lightweighting strategy: it converts a heavy, high-performing model into one with a small footprint and fast inference, so that it runs smoothly even in environments with no cloud access.

## 📖 Structured Knowledge (Synthesized Content)

- **The Core Techniques**:
  - **Pruning**: Remove low-importance weights (or whole neurons) by setting them to zero, chosen so that accuracy is not noticeably affected.
  - **Quantization**: Convert 32-bit floating-point values to 8-bit integers, cutting weight memory to roughly a quarter and reducing compute cost.
  - **Knowledge Distillation**: Train a small model (Student) to absorb the knowledge of a large model (Teacher) efficiently.
  - **Weight Sharing**: Let many weights share a common value, reducing the number of distinct values that must be stored.
- **Benefit**: Lower battery consumption, real-time responsiveness, and privacy protection (On-device AI).

## ⚠️ Contradictions and Updates (RL Update)

- When compression is too aggressive, the model's "common sense" and its ability to handle rare cases collapse sharply (Performance Degradation). Recent practice goes beyond simply compressing: "Quantization-aware Training", which trains the model again with quantization in the loop to recover accuracy, has become an important technique for serving large language models.

## 🔗 Knowledge Links (Graph)

- Related: [[Knowledge-Distillation]], Low-Rank Adaptation (LoRA)
- Hardware: Edge-AI
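
The Pruning and Quantization entries under The Core Techniques can be illustrated with a minimal sketch in plain Python. It is not a production recipe. The note does not say which variants it has in mind, so this assumes magnitude-based pruning and symmetric per-tensor int8 quantization; the function names (`prune_by_magnitude`, `quantize_int8`, `dequantize`) and the sample weights are hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest |w|.

    Assumption: importance is approximated by absolute magnitude.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by |w|, smallest first; the first n_prune are zeroed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> ints in [-127, 127].

    Returns the integer codes plus the single float scale needed to
    map them back to approximate float values.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


# Hypothetical weights for one tiny layer.
weights = [0.02, -1.27, 0.40, -0.01, 0.85, 0.003]

pruned = prune_by_magnitude(weights, sparsity=0.5)
# pruned == [0.0, -1.27, 0.4, 0.0, 0.85, 0.0]

codes, scale = quantize_int8(pruned)
# codes == [0, -127, 40, 0, 85, 0]

restored = dequantize(codes, scale)
# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(pruned, restored))
print(codes, scale, max_error)
```

Each int8 code occupies one byte instead of the four bytes of a 32-bit float, which is where the memory saving comes from. Zeroed weights only save space or time when the storage format or kernel actually exploits the sparsity.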