---
id: DATA-JIT-001
category: "10_Wiki/💡 Topics/AI"
confidence_score: 1.0
tags: [data-engineering, jit-loading, lazy-loading, optimization, deep-learning, performance]
last_reinforced: 2026-04-26
---

# Just-in-time Data Loading

## 📌 One-Line Insight (The Karpathy Summary)

> "Do not surrender to the limits of memory; supply only the information that is needed, at the moment it is most needed, as a continuous flow." Just-in-time data loading is an efficient data-supply strategy: instead of loading the entire dataset into memory ahead of time, it asynchronously reads only the required portion from disk or the network right before computation and processes it.

## 📖 Structured Knowledge (Synthesized Content)

- **Extracted pattern:** "Lazy Fetch and Prefetch". A two-pronged optimization pattern: defer loading until the data is actually used (lazy loading), and, to keep computation from stalling on I/O, predict the next data needed and load it in the background (prefetching).
- **Key technologies and libraries:**
  - **PyTorch DataLoader:** Uses multiprocessing so that the CPU prepares the next batch while the GPU is training.
  - **Streaming Datasets:** Train on terabyte-scale data by streaming it from the cloud in real time, without downloading it first.
  - **Memory Mapping (mmap):** Maps a file into the process's memory address space so that the OS reads data only when it is accessed.
- **Significance:** Overcomes hardware resource limits and makes it possible to process large-scale datasets (LLM training, for example) reliably.

## ⚠️ Contradictions and Updates (Contradictions & RL Update)

- **Conflict with past data:** Problems that used to be solved by adding expensive, high-capacity memory are now increasingly solved cost-efficiently through smart software scheduling and asynchronous I/O design.
- **Policy change:** When the Antigravity project performs a full scan of its 1,174 knowledge bases, it applies JIT loading instead of loading everything into memory, keeping system resource usage below 10%.

## 🔗 Knowledge Links (Graph)

- [[Inference-Optimization]], System-Design-for-AI-Scale, Deep-Learning-Foundations, Cloud-Computing-Foundations
- **Raw Source:** 10_Wiki/Topics/AI/Just-in-time-Data-Loading.md
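
## 🧪 Example Sketch (Lazy Fetch and Prefetch)

The "Lazy Fetch and Prefetch" pattern described in this note can be illustrated with a minimal standard-library sketch. It is not the Antigravity project's implementation, and it is not how PyTorch's `DataLoader` works internally (that uses worker processes; this sketch uses a single background thread). The names `lazy_records`, `prefetch`, `demo`, and `RECORD_SIZE` are hypothetical, as is the assumption that the file holds fixed-size 8-byte records.

```python
import mmap
import os
import queue
import tempfile
import threading

RECORD_SIZE = 8  # hypothetical fixed record width in bytes (an assumption)


def lazy_records(path, record_size=RECORD_SIZE):
    """Lazy stage: yield fixed-size records from a memory-mapped file.

    Note: mmap raises ValueError for an empty file.
    """
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for offset in range(0, len(mm), record_size):
            # Slicing copies only this record; the OS pages data in on demand.
            yield mm[offset:offset + record_size]


def prefetch(iterable, buffer_size=4):
    """Prefetch stage: consume `iterable` in a background thread.

    Up to `buffer_size` items are kept ready in a bounded queue, so the
    consumer rarely waits on I/O and memory use stays bounded.
    If the consumer stops early, the daemon thread simply stays blocked
    until the interpreter exits.
    """
    q = queue.Queue(maxsize=buffer_size)
    end = object()  # sentinel marking the end of the stream

    def producer():
        try:
            for item in iterable:
                q.put((item, None))  # blocks while the buffer is full
            q.put((end, None))
        except Exception as exc:  # forward loader errors to the consumer
            q.put((end, exc))

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item, err = q.get()
        if item is end:
            if err is not None:
                raise err
            return
        yield item


def demo():
    """Write five small records to a temp file, then stream them back."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        for i in range(5):
            tmp.write(i.to_bytes(RECORD_SIZE, "big"))
        path = tmp.name
    try:
        return [int.from_bytes(rec, "big")
                for rec in prefetch(lazy_records(path))]
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())  # prints [0, 1, 2, 3, 4]
```

The bounded queue is the key design choice here: `buffer_size` caps how far the loader runs ahead, which is what keeps memory usage flat regardless of file size. For CPU-heavy preprocessing, a process-based loader would likely be a better fit than a thread.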