---
id: wiki-2026-0508-gpu-infrastructure
title: GPU Infrastructure
category: 10_Wiki/Topics
status: needs_review
canonical_id: self
aliases: [P-Reinforce-AUTO-GPUF-001]
duplicate_of: none
source_trust_level: A
confidence_score: 1.0
tags: [auto-reinforced, gpu-infrastructure, hbm, nvlink, infiniband, distributed-computing]
raw_sources: []
last_reinforced: 2026-05-04
github_commit: pending
inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
tech_stack:
  language: unspecified
  framework: unspecified
---

# [[GPU Infrastructure|GPU Infrastructure]]

## 📌 One-Line Insight (The Karpathy Summary)
> "The nerves and muscles that sustain large-scale intelligence: the physical body of modern AI, combining memory (HBM) that streams terabytes of data per second with a nervous system (NVLink) that links GPUs at extreme speed."

## 📖 Structured Knowledge (Synthesized Content)
These are the core elements of the physical hardware architecture that make large language model training and very-long-context processing possible.

1. **HBM (High Bandwidth Memory)**:
    * **Definition**: Very fast stacked memory. DRAM dies are stacked vertically and placed beside the GPU die in the same package.
    * **Significance**: Its bandwidth is far higher than that of conventional GDDR memory, which makes it a decisive factor in relieving the data bottleneck that arises during attention computation.
2. **NVLink**:
    * **Definition**: NVIDIA's proprietary high-speed interconnect, used primarily to connect the GPUs within a single server to one another.
    * **Role**: When a model with hundreds of billions of parameters is split across multiple GPUs for training (model parallelism), it maximizes GPU-to-GPU data exchange speed and so minimizes communication latency.
3. **InfiniBand**:
    * **Definition**: A data-center-class high-speed network technology that connects server to server (node to node).
    * **Significance**: When thousands of GPUs are tied together into one large cluster to train a very large model, it helps keep inter-node data transfer from becoming the bottleneck.

## ⚠️ Contradictions & Updates
* **Cost and power**: GPU systems equipped with the latest HBM3e and NVLink (e.g., NVIDIA HGX) are priced at hundreds of millions of won per unit and consume enormous amounts of power.
* **Communication bottleneck**: No matter how fast GPU compute is, if NVLink or InfiniBand bandwidth cannot keep up, GPUs sit idle waiting for data (Waiting) and overall efficiency drops sharply.

## 🔗 Knowledge Links (Graph)
* **Parent concepts**: [[Distributed Training|Distributed Training]], [[Hardware Acceleration|Hardware Acceleration]]
* **Related techniques**: [[Context Parallelism|Context Parallelism]], [[Ring Attention|Ring Attention]], [[Flash Attention|Flash Attention]]
* **Devices**: NVIDIA H100/H200, B100/B200 (Blackwell)

---
*Last updated: 2026-05-04*

## 🤖 LLM Usage Hints (How to Use This Knowledge)

**When to use this knowledge:**
- *(TODO)*

**When not to use it:**
- *(TODO)*

## 🧪 Validation Status (Validation)
- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 automatic normalization. Body content requires verification.)*

## 🧬 Duplicate Check
- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Action:** UPDATE (automatic normalization)
- **Reason:** Phase 1 normalization: reinforced the old template and missing fields.
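
The communication-bottleneck point under Contradictions & Updates can be made concrete with a back-of-the-envelope estimate. The sketch below is a rough illustration, not a measurement. It uses the standard bandwidth-only cost of a ring all-reduce, in which each GPU sends about 2(N-1)/N times the payload, and it assumes communication is not overlapped with compute. The function names, the 10 GB payload, the 0.35 s compute time, and both link bandwidths are hypothetical placeholders, not vendor figures.

```python
def ring_allreduce_seconds(payload_bytes: float, num_gpus: int,
                           link_bytes_per_s: float) -> float:
    """Bandwidth-only estimate of one ring all-reduce (latency terms ignored)."""
    if num_gpus < 2:
        return 0.0  # a single GPU has nothing to exchange
    # Each GPU sends (and receives) 2 * (N - 1) / N times the payload.
    traffic_bytes = 2 * (num_gpus - 1) / num_gpus * payload_bytes
    return traffic_bytes / link_bytes_per_s


def idle_fraction(compute_s: float, comm_s: float) -> float:
    """Share of a training step spent waiting, assuming no compute/comm overlap."""
    return comm_s / (compute_s + comm_s)


if __name__ == "__main__":
    payload = 10e9      # hypothetical: 10 GB of gradients per step
    compute_s = 0.35    # hypothetical: compute time per step in seconds
    # Hypothetical per-GPU link bandwidths in bytes/s (illustrative only).
    links = {"fast intra-node link": 400e9, "slower inter-node link": 50e9}
    for name, bandwidth in links.items():
        comm_s = ring_allreduce_seconds(payload, 8, bandwidth)
        print(f"{name}: comm={comm_s:.4f}s idle={idle_fraction(compute_s, comm_s):.1%}")
```

Under these assumed numbers the slower link leaves the GPUs idle for about half of each step, which is the efficiency collapse described above. Real systems overlap communication with compute, so this is a pessimistic upper bound.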

## 🕓 Changelog

| Date | Change | Action | Trust level |
|------|--------|--------|-------------|
| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |

## 💻 Code Patterns

**Pattern 1:** *(TODO: structural skeleton reflecting this project's conventions)*

```text
# TODO
```

## 🤔 Decision Criteria

**When to choose option A:**
- *(TODO)*

**When to choose option B:**
- *(TODO)*

**Default:**
> *(TODO)*

## ❌ Anti-Patterns

- **[Anti-pattern]:** *(TODO: what not to do + why + what to do instead)*
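
As a rough illustration of the HBM bandwidth point in the Synthesized Content section, a roofline-style check shows why memory bandwidth rather than peak compute often limits a kernel. The sketch below assumes the simple roofline model, where attainable throughput is the smaller of peak compute and arithmetic intensity times memory bandwidth. The function names and every number (peak FLOP/s, the two bandwidths, the intensity of 100 FLOP/byte) are hypothetical placeholders, not specifications of any real GPU.

```python
def attainable_flops(intensity_flop_per_byte: float, peak_flops: float,
                     mem_bytes_per_s: float) -> float:
    """Roofline model: throughput is capped by compute or by memory traffic."""
    return min(peak_flops, intensity_flop_per_byte * mem_bytes_per_s)


def is_memory_bound(intensity_flop_per_byte: float, peak_flops: float,
                    mem_bytes_per_s: float) -> bool:
    """True when the kernel sits left of the ridge point (peak / bandwidth)."""
    return intensity_flop_per_byte < peak_flops / mem_bytes_per_s


if __name__ == "__main__":
    peak = 1e15        # hypothetical peak compute in FLOP/s
    intensity = 100.0  # hypothetical kernel: 100 FLOPs per byte moved
    # Hypothetical memory bandwidths in bytes/s (illustrative only).
    memories = {"higher-bandwidth memory": 3e12, "lower-bandwidth memory": 1e12}
    for name, bandwidth in memories.items():
        flops = attainable_flops(intensity, peak, bandwidth)
        bound = "memory" if is_memory_bound(intensity, peak, bandwidth) else "compute"
        print(f"{name}: {flops:.2e} FLOP/s attainable ({bound}-bound)")
```

With these assumed values the kernel is memory-bound on both memories, so tripling the bandwidth triples attainable throughput while peak compute stays unused. That is the sense in which higher memory bandwidth relieves the data bottleneck.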