--- id: SYS-PIPE-PAR-001 category: "10_Wiki/๐Ÿ’ก Topics/AI" confidence_score: 1.0 tags: [infrastructure, [[Parallel-Computing|Parallel-Computing]], pipeline-parallelism, distributed-training, llm-training, gpu-[[Optimization|Optimization]]] last_reinforced: 2026-04-26 --- # Pipeline Parallelism (ํŒŒ์ดํ”„๋ผ์ธ ๋ณ‘๋ ฌ์„ฑ) ## ๐Ÿ“Œ ํ•œ ์ค„ ํ†ต์ฐฐ (The Karpathy Summary) > "๊ฑฐ๋Œ€ํ•œ ๋ชจ๋ธ์„ ํ•œ ๊ทธ๋ฆ‡์— ๋‹ด์œผ๋ ค ํ•˜์ง€ ๋ง๊ณ , ์—ฌ๋Ÿฌ ์žฅ์น˜์— ์ธต์ธต์ด ๋‚˜๋ˆ„์–ด ๋ฐฐ์น˜ํ•œ ๋’ค ๋ฐ์ดํ„ฐ์˜ ์ปจ๋ฒ ์ด์–ด ๋ฒจํŠธ๋ฅผ ๊ฐ€๋™ํ•˜๋ผ" โ€” ๋ชจ๋ธ์˜ ๋ ˆ์ด์–ด๋“ค์„ ์—ฌ๋Ÿฌ ๊ฐœ์˜ GPU์— ๋ถ„์‚ฐ ๋ฐฐ์น˜ํ•˜๊ณ , ๋ฐ์ดํ„ฐ๋ฅผ ์ˆœ์ฐจ์ ์œผ๋กœ ํ†ต๊ณผ์‹œ์ผœ ์—ฐ์‚ฐ๊ณผ ํ†ต์‹ ์„ ์ค‘์ฒฉํ•จ์œผ๋กœ์จ ํ•™์Šต ํšจ์œจ์„ ๋†’์ด๋Š” ๋ถ„์‚ฐ ํ•™์Šต ๊ธฐ์ˆ . ## ๐Ÿ“– ๊ตฌ์กฐํ™”๋œ ์ง€์‹ (Synthesized Content) - **์ถ”์ถœ๋œ ํŒจํ„ด:** "Sequential Layer Partitioning and Micro-[[Batching|Batching]]" โ€” ๋ชจ๋ธ์„ ์ˆ˜์ง์œผ๋กœ ์ชผ๊ฐœ์–ด ์žฅ์น˜๋ณ„๋กœ ํ• ๋‹นํ•˜๊ณ , ์•ž์„  ์žฅ์น˜์˜ ์—ฐ์‚ฐ์ด ๋๋‚  ๋•Œ๊นŒ์ง€ ๋’ท ์žฅ์น˜๊ฐ€ ๋…ธ๋Š” ์‹œ๊ฐ„(Bubble)์„ ์ค„์ด๊ธฐ ์œ„ํ•ด ๋ฏธ๋‹ˆ ๋ฐฐ์น˜๋ฅผ ๋” ์ž‘์€ ๋งˆ์ดํฌ๋กœ ๋ฐฐ์น˜๋กœ ์ชผ๊ฐœ์–ด ๋Š์ž„์—†์ด ํŒŒ์ดํ”„๋ผ์ธ์„ ์ฑ„์šฐ๋Š” ํŒจํ„ด. - **ํ•ต์‹ฌ ์•Œ๊ณ ๋ฆฌ์ฆ˜:** - **GPipe:** ๋งˆ์ดํฌ๋กœ ๋ฐฐ์น˜๋ฅผ ํ†ตํ•ด ์œ ํœด ์‹œ๊ฐ„์„ ์ค„์ธ ์ตœ์ดˆ์˜ ํ‘œ์ค€ ํŒŒ์ดํ”„๋ผ์ธ ๋ณ‘๋ ฌํ™”. - **PipeDream:** ์ „๋ฐฉ ๊ณ„์‚ฐ(Forward)๊ณผ ํ›„๋ฐฉ ๊ณ„์‚ฐ(Backward)์„ ๋น„๋™๊ธฐ์ ์œผ๋กœ ์ค‘์ฒฉ์‹œ์ผœ ํšจ์œจ ๊ทน๋Œ€ํ™”. - **์˜์˜:** ๋‹จ์ผ GPU ๋ฉ”๋ชจ๋ฆฌ ์šฉ๋Ÿ‰์„ ์ดˆ๊ณผํ•˜๋Š” ์ˆ˜์ฒœ์–ต ํŒŒ๋ผ๋ฏธํ„ฐ ๊ทœ๋ชจ์˜ ์ดˆ๊ฑฐ๋Œ€ ์–ธ์–ด ๋ชจ๋ธ(LLM) ํ•™์Šต์„ ๊ฐ€๋Šฅ์ผ€ ํ•˜๋Š” ๋ฌผ๋ฆฌ์  ์ธํ”„๋ผ์˜ ํ•„์ˆ˜ ๊ตฌ์„ฑ ์š”์†Œ. ## โš ๏ธ ๋ชจ์ˆœ ๋ฐ ์—…๋ฐ์ดํŠธ (Contradictions & RL Update) - **๊ณผ๊ฑฐ ๋ฐ์ดํ„ฐ์™€์˜ ์ถฉ๋Œ:** ๋‹จ์ˆœํžˆ ์ธต์„ ๋‚˜๋ˆ„๋ฉด ํ†ต์‹  ์˜ค๋ฒ„ํ—ค๋“œ ๋•Œ๋ฌธ์— ๋А๋ ค์งˆ ๊ฒƒ์ด๋ผ๋Š” ์šฐ๋ ค๋ฅผ '๋งˆ์ดํฌ๋กœ ๋ฐฐ์น˜'์™€ '๋น„๋™๊ธฐ ํ†ต์‹ ' ๊ธฐ์ˆ ๋กœ ๊ทน๋ณตํ•˜๋ฉฐ, ์ด์ œ๋Š” ๋ฐ์ดํ„ฐ ๋ณ‘๋ ฌํ™”(DP) ๋ฐ ํ…์„œ ๋ณ‘๋ ฌํ™”(TP)์™€ ๊ฒฐํ•ฉ๋œ ํ•˜์ด๋ธŒ๋ฆฌ๋“œ ๋ณ‘๋ ฌํ™”(3D Parallelism)๊ฐ€ ํ‘œ์ค€์ด ๋จ. - **์ •์ฑ… ๋ณ€ํ™”:** Antigravity ํ”„๋กœ์ ํŠธ๋Š” ๋Œ€๊ทœ๋ชจ ์ง€์‹ ๋ชจ๋ธ์˜ ํŒŒ์ธํŠœ๋‹ ์‹œ, GPU ์ž์› ์ ์œ ์œจ์„ ์ตœ์ ํ™”ํ•˜๊ธฐ ์œ„ํ•ด ํŒŒ์ดํ”„๋ผ์ธ ๋ณ‘๋ ฌํ™” ๊ธฐ๋ฐ˜์˜ ๋ถ„์‚ฐ ํ•™์Šต ํ”„๋ ˆ์ž„์›Œํฌ(DeepSpeed ๋“ฑ)๋ฅผ ์ ๊ทน ํ™œ์šฉํ•จ. ## ๐Ÿ”— ์ง€์‹ ์—ฐ๊ฒฐ (Graph) - [[Parallel-Computing-in-AI|Parallel-Computing-in-AI]], [[NVIDIA-CUDA-and-AI|NVIDIA-CUDA-and-AI]],[[_system|system]]-Design-for-AI-Scale, LLM-Training-Foundations - **Raw Source:** 10_Wiki/Topics/AI/Pipeline-Parallelism.md