--- id: FEAT-ENG-001 category: "10_Wiki/πŸ’‘ Topics/AI" confidence_score: 1.0 tags: [ai, machine-learning, feature-engineering, data-preprocessing, ml-mastery] last_reinforced: 2026-04-26 --- # Feature Engineering (ν”Όμ²˜ μ—”μ§€λ‹ˆμ–΄λ§) ## πŸ“Œ ν•œ 쀄 톡찰 (The Karpathy Summary) > "데이터λ₯Ό λͺ¨λΈμ΄ κ°€μž₯ μ΄ν•΄ν•˜κΈ° μ‰¬μš΄ μ–Έμ–΄λ‘œ λ²ˆμ—­ν•˜κ³ , μˆ¨κ²¨μ§„ 톡찰을 숫자둜 κ΅¬μ²΄ν™”ν•˜λΌ" β€” 도메인 지식을 ν™œμš©ν•˜μ—¬ μ›μ‹œ λ°μ΄ν„°λ‘œλΆ€ν„° λͺ¨λΈμ˜ 예츑 μ„±λŠ₯을 κ·ΉλŒ€ν™”ν•  수 μžˆλŠ” μƒˆλ‘œμš΄ νŠΉμ§•(Feature)을 μƒμ„±ν•˜κ±°λ‚˜ κΈ°μ‘΄ νŠΉμ§•μ„ λ³€ν™˜ν•˜λŠ” κ³Όμ •. ## πŸ“– κ΅¬μ‘°ν™”λœ 지식 (Synthesized Content) - **μΆ”μΆœλœ νŒ¨ν„΄:** λ³΅μž‘ν•œ ν˜„μ‹€ μ„Έκ³„μ˜ μ›μ‹œ 데이터 속에 μˆ¨κ²¨μ§„ μΈκ³Όκ΄€κ³„λ‚˜ 상관관계λ₯Ό μˆ˜ν•™μ  μ—°μ‚°μ΄λ‚˜ 논리적 가곡을 톡해 λͺ¨λΈμ΄ 즉각 인지할 수 μžˆλŠ” μ‹ ν˜Έ(Signal)둜 μ¦ν­μ‹œν‚€λŠ” 증폭 νŒ¨ν„΄. - **μ£Όμš” 기법:** - **Scaling & Normalization:** λ³€μˆ˜λ“€μ˜ λ‹¨μœ„λ₯Ό ν†΅μΌν•˜μ—¬ νŠΉμ • λ³€μˆ˜μ˜ μ™œκ³‘ λ°©μ§€ (Min-Max, Standard Scaling). - **Encoding:** λ²”μ£Όν˜• 데이터λ₯Ό 수치둜 λ³€ν™˜ (One-hot encoding, Target encoding). - **Binning:** μ—°μ†ν˜• 데이터λ₯Ό λ²”μ£Όλ‘œ λ‚˜λˆ„μ–΄ λ…Έμ΄μ¦ˆ κ°μ†Œ. - **Interaction Features:** 두 개 μ΄μƒμ˜ λ³€μˆ˜λ₯Ό μ‘°ν•©(κ³±μ…ˆ, λ‚˜λˆ—μ…ˆ λ“±)ν•˜μ—¬ μƒˆλ‘œμš΄ 의미 생성. - **Imputation:** 결츑치λ₯Ό 도메인 논리에 맞게 채움. - **의의:** λ”₯λŸ¬λ‹μ΄ νŠΉμ§• μΆ”μΆœμ„ μžλ™ν™”ν•˜κ³  μžˆμ§€λ§Œ, μ—¬μ „νžˆ μ •ν˜• λ°μ΄ν„°λ‚˜ νŠΉμ • λ„λ©”μΈμ—μ„œλŠ” μΈκ°„μ˜ 직관이 λ‹΄κΈ΄ ν”Όμ²˜ μ—”μ§€λ‹ˆμ–΄λ§μ΄ λͺ¨λΈμ˜ ν•œκ³„λ₯Ό λŒνŒŒν•˜λŠ” 핡심 μ—΄μ‡ μž„. ## ⚠️ λͺ¨μˆœ 및 μ—…λ°μ΄νŠΈ (Contradictions & RL Update) - **κ³Όκ±° λ°μ΄ν„°μ™€μ˜ 좩돌:** λ³΅μž‘ν•œ μ•Œκ³ λ¦¬μ¦˜μ„ μ°ΎλŠ” 데 μ‹œκ°„μ„ 쏟던 κ΄€ν–‰μ—μ„œ, λ°μ΄ν„°μ˜ 질과 ν‘œν˜„ 방식을 κ°œμ„ ν•˜λŠ” 것이 훨씬 νš¨μœ¨μ μ΄λΌλŠ” 데이터 쀑심 AI(Data-centric AI) κ΄€μ μœΌλ‘œ μ „ν™˜. - **μ •μ±… λ³€ν™”:** Antigravity ν”„λ‘œμ νŠΈλŠ” λ¬Έμ„œ κ°„μ˜ 관련성을 μ‚°μΆœν•  λ•Œ, λ‹¨μˆœ μž„λ² λ”© 거리에 'λ¬Έμ„œ ꡬ쑰적 μœ μ‚¬λ„(헀더 개수, 링크 밀도 λ“±)'λ₯Ό ν”Όμ²˜λ‘œ μΆ”κ°€ν•˜μ—¬ 검색 정밀도λ₯Ό λ†’μž„. ## πŸ”— 지식 μ—°κ²° (Graph) - [[Exploratory-Data-Analysis|Exploratory-Data-Analysis]], [[Dimensionality-Reduction|Dimensionality-Reduction]], Deep-Learning-Foundations, Data-Augmentation-Strategies - **Raw Source:** 10_Wiki/Topics/AI/Feature-Engineering.md