---
id: SYS-DESIGN-001
category: "10_Wiki/💡 Topics/AI"
confidence_score: 1.0
tags: [system-design, scalability, ai-infrastructure, distributed-systems, mlops]
last_reinforced: 2026-04-26
---

# System Design for AI Scale

## 📌 One-Line Insight (The Karpathy Summary)

> "Pave a robust highway of intelligence that does not collapse as models grow": architecture design for optimizing the availability, scalability, and latency of large-scale AI services that handle trillions of parameters and petabyte-scale data.

## 📖 Structured Knowledge (Synthesized Content)

- **Extracted pattern:** A high-performance system design pattern that distributes compute-intensive AI inference and training workloads and removes bottlenecks (I/O, network) to maximize the efficiency of the system as a whole.
- **Key elements:**
  - **Load Balancing for AI:** Spread load across GPU resources and route each request to the most suitable inference server.
  - **Model Serving & Optimization:** Use quantization and pruning to shrink the model and speed up inference.
  - **Vector Database Scaling:** Sharding and indexing strategies for fast search over large-scale embedding data.
  - **Data Pipeline Efficiency:** Use a distributed file system to prevent storage bottlenecks while training on data.

## ⚠️ Contradictions and Updates (Contradictions & RL Update)

- **Conflict with past data:** A paradigm shift from traditional, web-service-centric system design to "compute-intensive" design, where model size and compute cost dominate.
- **Policy change:** To accommodate thousands of concurrent users in the future, the Antigravity project has added to its roadmap a scalable architecture that combines a serverless inference engine with a distributed vector DB.

## 🔗 Knowledge Links (Graph)

- [[Infrastructure-as-Code-IaC]], [[Parallel-Computing]], Vector-Database, [[MLOps]]
- **Raw Source:** 10_Wiki/Topics/AI/System-Design for AI Scale.md
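
**Sketch (Load Balancing for AI):** The note asks for requests to be routed to the most suitable inference server but does not say how suitability is measured. The example below is a minimal sketch of one possible policy, least-loaded routing. The `InferenceServer` fields, the `load_score` formula, and the `queue_weight` value are all assumptions made for illustration, not part of the source.

```python
from dataclasses import dataclass


@dataclass
class InferenceServer:
    name: str
    gpu_utilization: float  # hypothetical metric in [0.0, 1.0]
    queue_depth: int        # hypothetical count of waiting requests


def load_score(server: InferenceServer, queue_weight: float = 0.05) -> float:
    """Combine utilization and queue depth into one number (lower is better).

    The linear formula and the default weight are illustrative assumptions.
    """
    return server.gpu_utilization + queue_weight * server.queue_depth


def pick_server(servers: list[InferenceServer]) -> InferenceServer:
    """Return the server with the lowest load score."""
    if not servers:
        raise ValueError("no inference servers available")
    return min(servers, key=load_score)


servers = [
    InferenceServer("gpu-a", gpu_utilization=0.90, queue_depth=2),
    InferenceServer("gpu-b", gpu_utilization=0.40, queue_depth=6),
    InferenceServer("gpu-c", gpu_utilization=0.55, queue_depth=1),
]
chosen = pick_server(servers)
print(chosen.name)  # gpu-c: moderate utilization and the shortest queue
```

Note that `gpu-b` has the lowest utilization but loses because of its long queue; a real router would likely tune the weight against measured latency.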
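
**Sketch (Model Serving & Optimization):** To show why quantization shrinks a model, the example below maps floating-point weights to 8-bit integers plus a single scale factor. It is a sketch of symmetric per-tensor int8 quantization only, chosen here as one common scheme; the note does not specify which scheme is intended, and pruning is not shown. The function names are hypothetical.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric per-tensor quantization: w is approximated by q * scale."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the int8 representation."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print(q)  # [50, -127, 0, 100]
# Each restored weight differs from the original by at most scale / 2.
print(max(abs(a - b) for a, b in zip(weights, restored)) <= scale / 2 + 1e-9)
```

Storing one byte per weight instead of four (for float32) is where the size reduction comes from; the cost is the bounded rounding error checked on the last line.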
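
**Sketch (Vector Database Scaling):** The sharding idea can be illustrated with a toy in-memory index: vectors are assigned to shards by hashing their key, and a query is answered by asking every shard for its local top-k and merging the results (scatter-gather). This is a sketch under simplifying assumptions. It uses brute-force dot-product scoring where a real vector database would use an approximate index, and `ShardedVectorIndex` is a hypothetical class, not the API of any product.

```python
import hashlib
import heapq


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


class ShardedVectorIndex:
    """Toy hash-sharded index with scatter-gather top-k search."""

    def __init__(self, num_shards: int) -> None:
        self.shards: list[dict[str, list[float]]] = [{} for _ in range(num_shards)]

    def _shard_for(self, key: str) -> int:
        # A stable hash keeps a key on the same shard across processes.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def add(self, key: str, vector: list[float]) -> None:
        self.shards[self._shard_for(key)][key] = vector

    def search(self, query: list[float], k: int) -> list[str]:
        candidates: list[tuple[float, str]] = []
        for shard in self.shards:
            # Scatter: each shard computes only its own local top-k.
            scored = ((dot(query, vec), key) for key, vec in shard.items())
            candidates.extend(heapq.nlargest(k, scored))
        # Gather: merge the per-shard winners into the global top-k.
        return [key for _, key in heapq.nlargest(k, candidates)]


index = ShardedVectorIndex(num_shards=3)
index.add("doc-1", [1.0, 0.0])
index.add("doc-2", [0.0, 1.0])
index.add("doc-3", [0.9, 0.1])
index.add("doc-4", [-1.0, 0.0])

print(index.search([1.0, 0.0], k=2))  # ['doc-1', 'doc-3']
```

Because every shard returns its own top-k, the merged result matches a single-node search regardless of how keys happen to be distributed.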
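
**Sketch (Data Pipeline Efficiency):** The note names a distributed file system as the remedy for storage bottlenecks but gives no detail on how training jobs should read from it. One common arrangement, assumed here and not taken from the source, is to give each training worker a disjoint slice of the dataset files so that reads proceed in parallel instead of contending for the same files. The function name and file names below are hypothetical.

```python
def shard_files(paths: list[str], num_workers: int, rank: int) -> list[str]:
    """Return the files that worker `rank` should read.

    Files are sorted first so every worker computes the same assignment,
    then dealt out round-robin so the slices are disjoint and balanced.
    """
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    if not 0 <= rank < num_workers:
        raise ValueError("rank must be in [0, num_workers)")
    return sorted(paths)[rank::num_workers]


# Hypothetical dataset layout: ten files on shared storage.
files = [f"part-{i:05d}.parquet" for i in range(10)]

for rank in range(3):
    print(rank, shard_files(files, num_workers=3, rank=rank))
```

With three workers, worker 0 reads parts 0, 3, 6, and 9, and the other two workers split the remaining six files evenly.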