---
id: AI-MOB-OPT-001
category: "10_Wiki/💡 Topics/AI"
confidence_score: 1.0
tags: [ai, mobile-ai, optimization, quantization, on-device-ai, edge-computing]
last_reinforced: 2026-04-26
---

# Mobile AI Optimization

## 📌 One-Line Insight (The Karpathy Summary)

> "Compress the intelligence of a giant model to fit the narrow frame of a smartphone, but do not lose its essential depth of thought." Mobile AI optimization is the set of model compression and runtime optimization techniques that let AI models run efficiently and with minimal latency within the limited compute resources (CPU, GPU, NPU) and battery budget of a mobile device.

## 📖 Structured Knowledge (Synthesized Content)

- **Extracted pattern:** "Lightweight Inference and Hardware Awareness". A hardware-friendly optimization pattern that shrinks parameter size while minimizing accuracy loss, and makes full use of the target device's dedicated accelerators (such as the NPU) to reach near-real-time inference speed.
- **Key optimization techniques:**
  - **Quantization:** Converts 32-bit floating-point values to a lower-precision format such as 8-bit integers, reducing model size and speeding up computation.
  - **Pruning:** Removes weights that contribute little to model performance, reducing model size.
  - **Knowledge Distillation:** Transfers the knowledge of a large model (Teacher) to a small model (Student).
  - **Hardware Acceleration:** Uses hardware-optimized runtimes such as Core ML, TensorFlow Lite, and ONNX Runtime.
- **Significance:** A core technology for the era of on-device AI, in which a device responds immediately while offline and protects personal data without needing a server connection.

## ⚠️ Contradictions & Updates (Contradictions & RL Update)

- **Conflict with earlier data:** Mobile AI used to mean simply a "small model". Its role has since advanced to that of an intelligent edge node capable of small-scale training or personalization on the device itself.
- **Policy change:** When implementing a mobile interface or a local agent brain, the Antigravity project applies 4-bit quantization as a rule, so as to get the most inference performance out of the smallest possible memory footprint.

## 🔗 Knowledge Links (Graph)

- [[Inference-Optimization]], [[Knowledge-Distillation]], [[Hardware-Acceleration-for-AI]], [[Local-Brain-Management]]
- **Raw Source:** 10_Wiki/Topics/AI/Mobile-AI-Optimization.md
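
As a rough illustration of the quantization entry above, and of why a 4-bit policy matters for memory, the following Python sketch shows symmetric per-tensor int8 quantization together with a weights-only memory estimate. The function names (`quantize_int8`, `dequantize`, `weight_memory_bytes`) and the sample weights are hypothetical and are not taken from the Antigravity project. Production runtimes typically use per-channel or group-wise scales and calibrated activation ranges, so treat this as a minimal sketch rather than a reference implementation.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127].

    Hypothetical helper for illustration: one scale is shared by the whole tensor.
    """
    max_abs = max(abs(w) for w in weights)
    # Avoid division by zero when every weight is 0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


def weight_memory_bytes(num_params, bits):
    """Bytes needed for the weights alone.

    Assumption: scales, zero-points, activations, and runtime overhead are ignored.
    """
    return num_params * bits // 8


if __name__ == "__main__":
    sample = [0.5, -1.27, 0.0, 1.0]  # made-up weights
    q, scale = quantize_int8(sample)
    restored = dequantize(q, scale)
    print("quantized:", q)
    print("max round-trip error:", max(abs(a - b) for a, b in zip(sample, restored)))

    # Weights-only footprint of a hypothetical 1M-parameter model.
    for bits in (32, 8, 4):
        print(f"{bits}-bit weights: {weight_memory_bytes(1_000_000, bits)} bytes")
```

With this scheme the round-trip error of each weight is bounded by half the scale, and the weights-only footprint falls by 4x at 8 bits and 8x at 4 bits relative to 32-bit floats. Whether accuracy survives the lower precision depends on the model and has to be measured on the target task.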