---
id: DATA-AUG-001
category: "10_Wiki/💡 Topics/AI"
confidence_score: 1.0
tags: [ai, Deep-Learning, Computer-Vision, nlp, data-augmentation, preprocessing]
last_reinforced: 2026-04-26
---

# Data Augmentation Strategies

## 📌 One-Line Insight (The Karpathy Summary)

> "If you cannot get more data, vary how the data you have looks." Data augmentation mathematically transforms existing training data to virtually enlarge the dataset, so the model learns the invariant features that define each example and generalizes better.

## 📖 Structured Knowledge (Synthesized Content)

- **Extracted pattern:** Perturb the original data with noise or transformations while preserving its core information, so the model gains [[Robustness|Robustness]] and is not swayed by trivial variation.
- **Main strategies:**
  - **[[Computer Vision|Computer Vision]]:** image rotation, flipping, cropping, color jitter, Mixup (blending two images and their labels), Cutout (masking part of an image).
  - **NLP:** synonym replacement (SR), random deletion/insertion, back translation (translating into another language and then back into the original).
  - **Audio:** speed adjustment, pitch shifting, noise injection.
  - **Generative Augmentation:** using GANs or diffusion models to generate new synthetic data for training.
- **Significance:** Mitigates [[Overfitting|Overfitting]] and makes it possible to build high-performing models from limited data.

## ⚠️ Contradictions and Updates (Contradictions & RL Update)

- **Conflict with earlier data:** Augmentation used to rely on transformation rules that people defined by hand. More recently it has moved toward AutoAugment, where a learned search finds the best combination of augmentations automatically.
- **Policy change:** The Antigravity project uses back-translation-based augmentation to supplement scarce Korean technical-terminology data, improving the NLP agent's language comprehension.
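
Several of the strategies named in this note (flip, Cutout, Mixup, random deletion, synonym replacement, back translation) can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not a reference implementation: images are plain lists of rows, the Cutout position and the Mixup weight `lam` are passed in explicitly rather than sampled, the synonym table is a hypothetical toy dictionary, and `back_translate` takes two caller-supplied translator functions because the note does not say which translation system is used.

```python
import random


# --- Vision-style transforms on a grayscale image stored as a list of rows ---

def horizontal_flip(image):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in image]


def cutout(image, top, left, size, fill=0):
    """Return a copy with a size x size square overwritten by `fill`.

    Assumption: the caller picks the square's position; in practice it
    is usually chosen at random for each training sample.
    """
    out = [row[:] for row in image]
    for r in range(top, min(top + size, len(out))):
        for c in range(left, min(left + size, len(out[r]))):
            out[r][c] = fill
    return out


def mixup(x1, y1, x2, y2, lam):
    """Blend two flat samples and their one-hot labels with weight lam.

    Assumption: lam is given explicitly to keep the sketch deterministic;
    Mixup normally samples it, e.g. random.betavariate(alpha, alpha).
    """
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y


# --- Text transforms on pre-tokenized sentences ---

def random_deletion(tokens, p, rng):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def synonym_replacement(tokens, synonyms, n, rng):
    """Replace up to n tokens that appear in the (hypothetical) synonym table."""
    out = list(tokens)
    candidates = [i for i, t in enumerate(out) if t in synonyms]
    for i in rng.sample(candidates, min(n, len(candidates))):
        out[i] = rng.choice(synonyms[out[i]])
    return out


def back_translate(text, to_pivot, from_pivot):
    """Round-trip text through a pivot language.

    to_pivot and from_pivot are placeholders for real translators
    (source -> pivot and pivot -> source) supplied by the caller.
    """
    return from_pivot(to_pivot(text))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(horizontal_flip(image))
    print(cutout(image, top=0, left=1, size=2))
    print(mixup([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], lam=0.75))
    tokens = "a small labeled dataset".split()
    print(random_deletion(tokens, p=0.3, rng=rng))
    print(synonym_replacement(tokens, {"small": ["tiny"]}, n=1, rng=rng))
```

Each function returns a new object and leaves its input untouched, so the same original sample can be augmented several different ways within one training epoch.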
## 🔗 Knowledge Links (Graph)

- Computer-Vision-[[Mastery|Mastery]], NLP, [[Regularization-Techniques|Regularization-Techniques]], [[Generative-Adversarial-Networks|Generative-Adversarial-Networks]]-GAN
- **Raw Source:** 10_Wiki/Topics/AI/Data-Augmentation Strategies.md