---
id: DL-ACT-001
category: "10_Wiki/💡 Topics/AI"
confidence_score: 1.0
tags: [ai, deep-learning, activation-function, leaky-relu, relu, neural-networks]
last_reinforced: 2026-04-26
---

# Leaky ReLU and Activations

## 📌 One-Line Insight (The Karpathy Summary)
> "Instead of a hard cutoff (Zero), leave a sliver of possibility (Small Slope) to wake sleeping neurons and keep learning flowing." Leaky ReLU is an activation function that overcomes ReLU's limitation of outputting 0 for every negative input: it allows a very small slope there, which limits information loss and mitigates the vanishing-gradient problem in the negative region.

## 📖 Structured Knowledge (Synthesized Content)
- **Extracted pattern:** "Non-linear Signal Gating". A defensive activation pattern that gives the input signal the non-linearity a network needs to learn complex functions, while preventing the 'Dead Neuron' phenomenon, in which certain weights stop being updated during training.
- **Key function comparison:**
  - **ReLU:** Simple and fast, but both the output and the gradient are zero for negative inputs, so information is lost there (Dying ReLU).
  - **Leaky ReLU:** Has the form $f(x) = \max(0.01x, x)$, so learning continues even for negative inputs.
  - **ELU / SELU:** Use an exponential function in the negative region to push the mean activation toward 0.
  - **GELU:** Weights the input by the standard Gaussian cumulative distribution function, $x\,\Phi(x)$; used mainly in Transformer models.
- **Significance:** As a network grows deeper layer by layer, the activation function acts as the energy source that keeps the signal from being cut off before it reaches the end.

## ⚠️ Contradictions and Updates (Contradictions & RL Update)
- **Conflict with past data:** The belief that plain ReLU is the strongest choice no longer holds. Recent very large models (LLMs) use smooth functions such as GELU or the Swish family to achieve more refined training performance.
- **Policy change:** When designing custom neural networks, the Antigravity project uses Leaky ReLU or GELU as the default activation function to manage convergence speed and model performance together.

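
The function comparison under Synthesized Content can be made concrete with a small sketch in plain Python. It is only an illustration under stated assumptions: the function names, the `slope` and `alpha` parameter names, and the sample inputs are hypothetical choices for this note, not the API of any library or the Antigravity project's code. It shows that ReLU's gradient is zero for a negative input while Leaky ReLU's is not, which is the mechanism behind the Dead Neuron problem described above.

```python
import math


def relu(x: float) -> float:
    # Outputs 0 for every negative input.
    return max(0.0, x)


def leaky_relu(x: float, slope: float = 0.01) -> float:
    # f(x) = max(slope * x, x); equivalent to the piecewise form when 0 < slope < 1.
    return max(slope * x, x)


def elu(x: float, alpha: float = 1.0) -> float:
    # Exponential branch for x <= 0 saturates at -alpha.
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def gelu(x: float) -> float:
    # Exact form x * Phi(x), with Phi the standard normal CDF written via erf.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def relu_grad(x: float) -> float:
    # Derivative of ReLU (taking 0 at x == 0 by convention).
    return 1.0 if x > 0 else 0.0


def leaky_relu_grad(x: float, slope: float = 0.01) -> float:
    # Derivative of Leaky ReLU: never exactly zero, so updates can still flow.
    return 1.0 if x > 0 else slope


# Illustrative inputs (assumed, not from the note): one negative, one positive.
for x in (-2.0, 0.5):
    print(
        f"x={x:+.1f}  relu={relu(x):+.4f}  leaky={leaky_relu(x):+.4f}  "
        f"elu={elu(x):+.4f}  gelu={gelu(x):+.4f}  "
        f"d_relu={relu_grad(x):.2f}  d_leaky={leaky_relu_grad(x):.2f}"
    )
```

For the negative input, `relu_grad` returns 0 and `leaky_relu_grad` returns the small slope, so a weight feeding that unit still receives a (scaled-down) update under Leaky ReLU.
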
## 🔗 Knowledge Links (Graph)
- Deep-Learning-Foundations, Backpropagation-Foundations, Weight-Initialization-Strategies, Transformer-Architecture-Foundations
- **Raw Source:** 10_Wiki/Topics/AI/Leaky-ReLU-and-Activations.md