---
id: wiki-2026-0508-mixture-of-experts-moe-sparse-ar
title: "Mixture of Experts (MoE) & Sparse Architectures"
category: 10_Wiki/Topics
status: needs_review
canonical_id: self
aliases: [P-Reinforce-AUTO-MOES-001]
duplicate_of: none
source_trust_level: A
confidence_score: 1.0
tags: [auto-reinforced, moe, mixture-of-experts, sparse-architecture, routing, compute-efficiency]
raw_sources: []
last_reinforced: 2026-05-04
github_commit: pending
inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
tech_stack:
  language: unspecified
  framework: unspecified
---

# [[Mixture of Experts (MoE) & Sparse Architectures|Mixture of Experts (MoE) & Sparse Architectures]]

## 📌 One-Line Insight (The Karpathy Summary)

> "Division of labor for intelligence: place many experts, each holding a large body of knowledge, inside the model and activate only the few that are needed at each step. The result is an economical design that grows model size while keeping compute cost low."

## 📖 Structured Knowledge (Synthesized Content)

MoE (Mixture of Experts) is a sparse model design in which only a subset of the model's total parameters takes part in the computation for any given input.

1. **Core principles**:
    * **Experts**: The FFN layer inside the model is split into several independent "expert" networks.
    * **Router**: For each input token, the router selects the best-suited experts (typically the top 1 or 2, although some designs activate more) and sends the token's computation to them.
    * **Shared Experts**: Some models (e.g., DeepSeek) add "shared experts" that every token passes through, which gives the routed experts a common knowledge base to build on.
2. **Key advantages**:
    * **Compute efficiency**: Even if the model has 1 trillion (1T) total parameters, only a few billion are active per token at inference time, so per-token compute stays close to that of a much smaller dense model.
    * **Scalability**: With the same compute budget, you can build a model that holds far more knowledge.
3. **Representative models**:
    * GPT-4 (reportedly an MoE architecture), Mixtral 8x7B, DeepSeek-V3.

## ⚠️ Contradictions & Updates

* **VRAM footprint**: Inference compute is low, but the weights of every expert must be resident in memory, so the required VRAM is as large as the full model.
* **Expert Collapse**: The router can funnel work to only a few experts, leaving the remaining experts untrained. Load balancing techniques are essential to prevent this.
* **Deployment complexity**: Distributing experts across multiple GPUs and keeping them synchronized is considerably harder than deploying a standard dense model.

## 🔗 Knowledge Links (Graph)

* **Underlying structure**: [[Transformer Architecture|Transformer Architecture]]
* **Related techniques**: [[Routing Mechanism|Routing Mechanism]], [[Sparse Attention|Sparse Attention]]
* **Competing structure**: Dense Models (e.g., Llama 3)

---

*Last updated: 2026-05-04*

## 🤖 LLM Usage Hints (How to Use This Knowledge)

**When to use this knowledge:**
- *(TODO)*

**When not to use it:**
- *(TODO)*

## 🧪 Validation Status (Validation)

- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 auto-normalization. The body text still needs verification.)*

## 🧬 Duplicate Check

- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Action:** UPDATE (auto-normalization)
- **Reason:** Phase 1 normalization: old template updated and missing fields filled in.
## 🕓 Changelog

| Date | Change | Action | Trust level |
|------|--------|--------|-------------|
| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |

## 💻 Code Patterns

**Pattern 1:** *(TODO: structural skeleton that reflects this project's conventions)*

```text
# TODO
```

## 🤔 Decision Criteria

**When to choose option A:**
- *(TODO)*

**When to choose option B:**
- *(TODO)*

**Default:**
> *(TODO)*

## ❌ Anti-Patterns

- **[Anti-pattern]:** *(TODO: what not to do + why + what to do instead)*
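
## 🧮 Routing Sketch (Illustrative Example)

The Experts / Router / Shared Experts description in the synthesized content, together with the VRAM and expert-collapse caveats, can be illustrated with a small pure-Python sketch. It is a toy under stated assumptions, not the routing code of any model named in this note. The function names (`route_top_k`, `moe_layer`, `expert_load`, `parameter_counts`) are hypothetical, the experts are plain callables standing in for FFN blocks, and "softmax, take the top k, renormalize the kept weights" is assumed as one common routing variant.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=2):
    """Pick the k highest-scoring experts for one token.

    Returns (expert_index, weight) pairs. The weights are the softmax
    probabilities of the chosen experts, renormalized to sum to 1
    (an assumed variant; real models differ in this detail).
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(token, router_logits, experts, shared_expert=None, k=2):
    """Weighted sum of the selected experts' outputs for one token.

    `experts` and `shared_expert` are hypothetical stand-ins (callables),
    not real network layers.
    """
    output = sum(w * experts[i](token) for i, w in route_top_k(router_logits, k))
    if shared_expert is not None:
        # A shared expert runs for every token, regardless of routing.
        output += shared_expert(token)
    return output


def expert_load(all_router_logits, num_experts, k=2):
    """Fraction of routed slots each expert receives across a batch.

    A heavily skewed result is the symptom described as expert collapse.
    """
    counts = [0] * num_experts
    for logits in all_router_logits:
        for i, _ in route_top_k(logits, k):
            counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts]


def parameter_counts(shared_params, params_per_expert, num_experts, k):
    """Return (total, active) parameter counts for one MoE layer.

    `total` must fit in memory; `active` is what one token actually uses.
    """
    total = shared_params + num_experts * params_per_expert
    active = shared_params + k * params_per_expert
    return total, active


if __name__ == "__main__":
    # Four toy "experts" that just scale their input (illustrative only).
    experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
    logits = [0.1, 2.0, -1.0, 1.5]
    print(route_top_k(logits, k=2))
    print(moe_layer(1.0, logits, experts, shared_expert=lambda x: 10.0 * x))
    print(expert_load([[5, 0, 0, 0]] * 3 + [[0, 5, 0, 0]], num_experts=4, k=1))
    print(parameter_counts(shared_params=10, params_per_expert=5, num_experts=8, k=2))
```

In this sketch, `parameter_counts` makes the memory caveat concrete: the `total` figure grows with the number of experts, while the `active` figure depends only on `k`. `expert_load` is a simple diagnostic. It does not implement a load-balancing loss, which a real training setup would add on top.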