---
id: AI-COMP-001
category: "10_Wiki/💡 Topics/AI"
confidence_score: 1.0
tags: [ai, deep-learning, model-compression, quantization, pruning, efficient-ai]
last_reinforced: 2026-04-26
---

# Model Compression Strategies

## 📌 One-Line Insight (The Karpathy Summary)

> "Preserve the model's intelligence but shrink its bulk (parameters), so that intelligence can breathe on every device, beyond the limits of the cloud."
> A technical methodology that reduces the size and computational complexity of deep learning models in order to raise inference speed and cut memory usage.

## 📖 Structured Knowledge (Synthesized Content)

- **Extracted pattern:** "Redundancy Reduction and Precision Scaling": a compression pattern that removes unnecessary connections inside a neural network or adjusts numerical precision, sharply lowering resource footprint while keeping accuracy loss to a minimum.
- **Key strategies:**
  - **Quantization:** Convert 32-bit weights to 8-bit or 4-bit integers, maximizing compute speed and energy efficiency.
  - **Weight Pruning:** Set low-importance weights to zero so that the model becomes sparse.
  - **Knowledge Distillation:** Transfer the knowledge of a very large model into a small model that is light and fast.
  - **Low-Rank Factorization:** Decompose a large matrix into a product of smaller matrices to reduce the parameter count.
- **Significance:** A core infrastructure technology that lets AI models leave the lab and run in real time at every real-world touchpoint, such as mobile devices, IoT, and cars.

## ⚠️ Contradictions and Updates (Contradictions & RL Update)

- **Conflict with past data:** The older view held that compression always degrades performance. There is now a growing number of cases in which appropriate compression combined with fine-tuning instead helps prevent overfitting and improves generalization.
- **Policy change:** The Antigravity project makes quantization validation at 8-bit precision or higher mandatory for every model intended for deployment, treating agent response speed as the top priority.

## 🔗 Knowledge Links (Graph)

- [[Mobile-AI-Optimization|Mobile-AI-Optimization]], [[Knowledge-Distillation|Knowledge-Distillation]], [[Inference-Optimization|Inference-Optimization]], [[Low-Rank-Adaptation-LoRA|Low-Rank-Adaptation-LoRA]]
- **Raw Source:** 10_Wiki/Topics/AI/Model-Compression-Strategies.md
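
The four strategies listed under Synthesized Content can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration, since this note does not prescribe a specific scheme: the function names are hypothetical, quantization is shown as symmetric per-tensor int8, pruning uses a plain magnitude criterion, low-rank factorization is reduced to its parameter-count arithmetic, and distillation is shown as a temperature-softened KL divergence between teacher and student outputs.

```python
import math


def quantize_int8(weights):
    """Symmetric per-tensor quantization (assumed scheme): floats -> int8 codes + scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    drop = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def low_rank_param_count(m, n, r):
    """Parameters of a dense m x n matrix vs. its (m x r)(r x n) factorization."""
    return m * n, r * (m + n)


def _softmax(logits, temperature):
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = _softmax(teacher_logits, temperature)
    q = _softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


if __name__ == "__main__":
    w = [0.5, -1.27, 0.01, 0.9]
    codes, scale = quantize_int8(w)
    print(codes, dequantize(codes, scale))
    print(magnitude_prune(w, sparsity=0.5))
    print(low_rank_param_count(1024, 1024, 64))
    print(distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0]))
```

With round-to-nearest, the reconstruction error of each quantized weight is bounded by half the scale. For a 1024 x 1024 layer at rank 64, the factorized form stores 131,072 parameters instead of 1,048,576, an 8x reduction. Production pipelines would normally rely on a framework's own quantization and pruning tooling rather than hand-written helpers like these.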