--- id: GRAD-EXPL-001 category: "10_Wiki/๐Ÿ’ก Topics/AI" confidence_score: 1.0 tags: [ai, [[Deep-Learning|Deep-Learning]], neural-networks, [[Optimization|Optimization]], exploding-gradient] last_reinforced: 2026-04-26 --- # Exploding Gradient Problem (๊ธฐ์šธ๊ธฐ ํญ์ฃผ ๋ฌธ์ œ) ## ๐Ÿ“Œ ํ•œ ์ค„ ํ†ต์ฐฐ (The Karpathy Summary) > "๊ธฐ์šธ๊ธฐ๊ฐ€ ๋ˆˆ๋ฉ์ด์ฒ˜๋Ÿผ ์ปค์ ธ ํ•™์Šต์ด ํŒŒ๊ดด๋˜์ง€ ์•Š๋„๋ก ์ œ์–ดํ•˜๋ผ" โ€” ๋”ฅ๋Ÿฌ๋‹ ํ•™์Šต ๊ณผ์ •์—์„œ ์—ญ์ „ํŒŒ๋˜๋Š” ๊ธฐ์šธ๊ธฐ ๊ฐ’์ด ์ธต์„ ๊ฑฐ๋“ญํ• ์ˆ˜๋ก ๊ธฐํ•˜๊ธ‰์ˆ˜์ ์œผ๋กœ ์ปค์ ธ ๊ฐ€์ค‘์น˜๊ฐ€ ๋งค์šฐ ํฐ ๊ฐ’์œผ๋กœ ์—…๋ฐ์ดํŠธ๋˜๊ณ , ๊ฒฐ๊ตญ ํ•™์Šต์ด ๋ถˆ์•ˆ์ •ํ•ด์ง€๊ฑฐ๋‚˜ ์‹คํŒจ(NaN ๋ฐœ์ƒ)ํ•˜๋Š” ํ˜„์ƒ. ## ๐Ÿ“– ๊ตฌ์กฐํ™”๋œ ์ง€์‹ (Synthesized Content) - **์ถ”์ถœ๋œ ํŒจํ„ด:** ๊ฐ€์ค‘์น˜ ํ–‰๋ ฌ์˜ ๊ณ ์œ ๊ฐ’์ด 1๋ณด๋‹ค ํด ๋•Œ, ์—ฐ์‡„ ๋ฒ•์น™์— ์˜ํ•ด ๊ธฐ์šธ๊ธฐ๊ฐ€ ๊ณฑํ•ด์ง€๋ฉฐ ๋ฌดํ•œํžˆ ์ฆํญ๋˜๋Š” ์ˆ˜์น˜์  ๋ถˆ์•ˆ์ • ํŒจํ„ด. ์ฃผ๋กœ ์ˆœํ™˜ ์‹ ๊ฒฝ๋ง(RNN)์ด๋‚˜ ๋งค์šฐ ๊นŠ์€ ์‹ ๊ฒฝ๋ง์—์„œ ๋ฐœ์ƒ. - **ํ•ด๊ฒฐ ๊ธฐ๋ฒ•:** - **Gradient [[CLIP|CLIP]]ping:** ๊ธฐ์šธ๊ธฐ๊ฐ€ ์ผ์ • ์ž„๊ณ„๊ฐ’(Threshold)์„ ๋„˜์ง€ ์•Š๋„๋ก ๊ฐ•์ œ๋กœ ์ž๋ฆ„ (๊ฐ€์žฅ ์ง์ ‘์ ์ธ ํ•ด๊ฒฐ์ฑ…). - **Weight Initialization:** ๊ฐ€์ค‘์น˜ ์ดˆ๊ธฐ๊ฐ’์„ ์ ์ ˆํžˆ ์„ค์ • (Xavier, He ์ดˆ๊ธฐํ™”). - **Batch [[Normalization|Normalization]]:** ๊ฐ ์ธต์˜ ์ถœ๋ ฅ์„ ์ •๊ทœํ™”ํ•˜์—ฌ ๊ฐ’์˜ ๋ฒ”์œ„๋ฅผ ์ œํ•œ. - **[[LSTM|LSTM]] / GRU:** ๊ฒŒ์ดํŠธ ๊ตฌ์กฐ๋ฅผ ํ†ตํ•ด ์ •๋ณด์˜ ํ๋ฆ„์„ ์กฐ์ ˆํ•˜์—ฌ RNN์˜ ๊ณ ์งˆ์ ์ธ ๋ฌธ์ œ ์™„ํ™”. ## โš ๏ธ ๋ชจ์ˆœ ๋ฐ ์—…๋ฐ์ดํŠธ (Contradictions & RL Update) - **๊ณผ๊ฑฐ ๋ฐ์ดํ„ฐ์™€์˜ ์ถฉ๋Œ:** ๊ธฐ์šธ๊ธฐ ์†Œ์‹ค(Vanishing Gradient) ๋ฌธ์ œ์— ๊ฐ€๋ ค์ ธ ๋œ ์ฃผ๋ชฉ๋ฐ›์•˜์œผ๋‚˜, ์ดˆ๊ฑฐ๋Œ€ ๋ชจ๋ธ ํ•™์Šต ์‹œ ์ˆ˜์น˜์  ์•ˆ์ •์„ฑ์„ ๊นจ๋œจ๋ฆฌ๋Š” ์ฃผ์š” ์›์ธ์œผ๋กœ ๋ถ€๊ฐ๋จ. - **์ •์ฑ… ๋ณ€ํ™”:** Antigravity ํ”„๋กœ์ ํŠธ๋Š” ๋Œ€๊ทœ๋ชจ ์ง€์‹ ์ž„๋ฒ ๋”ฉ ๋ชจ๋ธ ํ•™์Šต ์‹œ, Gradient Clipping์„ ๊ธฐ๋ณธ์œผ๋กœ ์„ค์ •ํ•˜์—ฌ ํ•™์Šต ์ดˆ๊ธฐ ๋‹จ๊ณ„์˜ ๋ฐœ์‚ฐ์„ ๋ฐฉ์ง€ํ•จ. ## ๐Ÿ”— ์ง€์‹ ์—ฐ๊ฒฐ (Graph) - [[Backpropagation|Backpropagation]], Neural-Networks-Foundations, [[Regularization-Techniques|Regularization-Techniques]], Deep-Learning-Foundations - **Raw Source:** 10_Wiki/Topics/AI/Exploding-Gradient Problem.md