# BERT (Bidirectional Encoder Representations from Transformers) ## ๐Ÿ“Œ Brief Summary BERT๋Š” ๊ตฌ๊ธ€์ด ์ œ์•ˆํ•œ ํ˜์‹ ์ ์ธ ์‚ฌ์ „ ํ•™์Šต ์–ธ์–ด ๋ชจ๋ธ๋กœ, ๋ฌธ์žฅ์˜ ์™ผ์ชฝ๊ณผ ์˜ค๋ฅธ์ชฝ ๋ฌธ๋งฅ์„ ๋™์‹œ์— ๊ณ ๋ คํ•˜์—ฌ ๋‹จ์–ด์˜ ์˜๋ฏธ๋ฅผ ํŒŒ์•…ํ•˜๋Š” ์–‘๋ฐฉํ–ฅ์„ฑ(Bidirectionality)์ด ํŠน์ง•์ž…๋‹ˆ๋‹ค [1, 2]. ํŠธ๋žœ์Šคํฌ๋จธ(Transformer)์˜ ์ธ์ฝ”๋” ๊ตฌ์กฐ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•˜๋ฉฐ, NLP ๋ถ„์•ผ์˜ ์ˆ˜๋งŽ์€ ๋ฒค์น˜๋งˆํฌ ๊ธฐ๋ก์„ ๊ฐฑ์‹ ํ•˜๋ฉฐ ์–ธ์–ด ์ดํ•ด์˜ ์ง€ํ‰์„ ๋„“ํ˜”์Šต๋‹ˆ๋‹ค [1, 3]. ## ๐Ÿ“– Core Content * **ํ•ต์‹ฌ ํ•™์Šต ๋ฉ”์ปค๋‹ˆ์ฆ˜** - **๋งˆ์Šคํฌ ์–ธ์–ด ๋ชจ๋ธ (Masked LM)**: ๋ฌธ์žฅ ๋‚ด ์ผ๋ถ€ ๋‹จ์–ด๋ฅผ ๊ฐ€๋ฆฌ๊ณ  ์ฃผ๋ณ€ ๋ฌธ๋งฅ์„ ํ†ตํ•ด ์›๋ž˜ ๋‹จ์–ด๋ฅผ ๋งžํžˆ๋Š” ๊ณผ์ •์„ ํ†ตํ•ด ๊นŠ์€ ์–ธ์–ด ์ดํ•ด๋ ฅ์„ ๊ฐ–์ถฅ๋‹ˆ๋‹ค [1, 4]. - **๋‹ค์Œ ๋ฌธ์žฅ ์˜ˆ์ธก (NSP)**: ๋‘ ๋ฌธ์žฅ์ด ์„œ๋กœ ์ด์–ด์ง€๋Š” ๊ด€๊ณ„์ธ์ง€ ์˜ˆ์ธกํ•˜์—ฌ ๋ฌธ์žฅ ๊ฐ„์˜ ๋…ผ๋ฆฌ์  ํ๋ฆ„์„ ํ•™์Šตํ•ฉ๋‹ˆ๋‹ค [1]. * **๊ธฐ์ˆ ์  ํŠน์ง•** - **๋ฌธ๋งฅ ์ž„๋ฒ ๋”ฉ (Contextual Embeddings)**: ๋™์ผํ•œ ๋‹จ์–ด๋ผ๋„ ์ฃผ๋ณ€ ๋ฌธ๋งฅ์— ๋”ฐ๋ผ ์„œ๋กœ ๋‹ค๋ฅธ ๋ฒกํ„ฐ ๊ฐ’์„ ๊ฐ€์ ธ ์ค‘์˜์„ฑ ํ•ด๊ฒฐ์— ํƒ์›”ํ•ฉ๋‹ˆ๋‹ค [1, 5]. - **์‚ฌ์ „ ํ•™์Šต ๋ฐ ๋ฏธ์„ธ ์กฐ์ • (Pre-training & Fine-tuning)**: ๋ฐฉ๋Œ€ํ•œ ์ผ๋ฐ˜ ํ…์ŠคํŠธ๋กœ ๋จผ์ € ํ•™์Šตํ•œ ๋’ค, ํŠน์ • ํƒœ์Šคํฌ(์งˆ์˜์‘๋‹ต, ๊ฐ์„ฑ ๋ถ„์„ ๋“ฑ)์— ๋งž์ถฐ ์‚ด์ง๋งŒ ํŠœ๋‹ํ•˜์—ฌ ๊ณ ์„ฑ๋Šฅ์„ ํ™•๋ณดํ•˜๋Š” ์ „์ด ํ•™์Šต ํŒจํ„ด์„ ๋”ฐ๋ฆ…๋‹ˆ๋‹ค [1, 6]. ## โš–๏ธ Trade-offs & Caveats - **์ถ”๋ก  ์†๋„**: ์–‘๋ฐฉํ–ฅ ๋ฌธ๋งฅ์„ ๋ชจ๋‘ ๊ณ ๋ คํ•ด์•ผ ํ•˜๋ฏ€๋กœ ๋‹จ๋ฐฉํ–ฅ ๋ชจ๋ธ(GPT ๋“ฑ)์— ๋น„ํ•ด ๋ฌธ์žฅ ์ƒ์„ฑ ์†๋„๋Š” ๋А๋ฆด ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์ฃผ๋กœ ๋ฌธ์žฅ ๋ถ„๋ฅ˜๋‚˜ ๊ฐœ์ฒด๋ช… ์ธ์‹ ๋“ฑ '์ดํ•ด' ์ค‘์‹ฌ์˜ ์ž‘์—…์— ์ตœ์ ํ™”๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค [1, 7]. - **์ž์› ์˜์กด์„ฑ**: ๋Œ€๊ทœ๋ชจ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ๊ฐ€์ง„ ๋ชจ๋ธ์ด๋ฏ€๋กœ ์‚ฌ์ „ ํ•™์Šต์— ๋ฐฉ๋Œ€ํ•œ ์ปดํ“จํŒ… ์ž์›์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค [1]. ## ๐Ÿ”— Knowledge Connections ### Related Concepts (Auto-Linked) * [[2026-04-30]] * [[Architecture]] * [[Attention Mechanisms]] * [[BERT]] * [[Fine-tuning]] * [[Transfer_Learning]] * [[Transformers]] - **Related Topics**: ํŠธ๋žœ์Šคํฌ๋จธ ์•„ํ‚คํ…์ฒ˜ (Transformer Architecture, ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ (NLP), ์ „์ด ํ•™์Šต (Transfer Learning), ์–ดํ…์…˜ ๋ฉ”์ปค๋‹ˆ์ฆ˜ (Attention Mechanisms - **Projects/Contexts**: ๋ฌธ์„œ ์œ ์‚ฌ๋„ ํŒ๋ณ„ ์‹œ์Šคํ…œ, ๊ฐœ์ฒด๋ช… ์ธ์‹ (NER) ๋ชจ๋“ˆ --- *Last updated: 2026-04-30*