---
id: NLP-001
category: "10_Wiki/💡 Topics/AI"
confidence_score: 1.0
tags: [ai, nlp, linguistics, llm, text-analysis]
last_reinforced: 2026-04-26
---

# Natural Language Processing (NLP)

## 📌 One-Line Insight (The Karpathy Summary)

> "Translate human language into the language of computers, and make it understood." NLP is the field at the intersection of computer science and linguistics that enables machines to process, analyze, and generate human natural language, such as text and speech.

## 📖 Structured Knowledge (Synthesized Content)

- **Extracted pattern:** A cognitive processing pattern that extracts semantic features from unstructured text data (Feature Extraction) and uses context and grammar to carry out dialogue and reasoning at a level close to that of humans.
- **Details:**
  - **Tokenization & Embedding:** Split text into its smallest units (tokens) and convert each one into a numeric vector in a high-dimensional space.
  - **Syntactic & Semantic Analysis:** Analyze the structure of a sentence (grammar) and its actual meaning (content).
  - **Key Tasks:** Machine translation, sentiment analysis, question answering (QA), named entity recognition (NER), summarization.
  - **Evolution:** Rule-based -> Statistical -> Deep learning (RNN/LSTM) -> Transformer-based (LLM).

## ⚠️ Contradictions and Updates (Contradictions & RL Update)

- **Conflict with past data:** The field has moved from teaching grammar rules one by one to having models learn the statistical structure and knowledge of language on their own from large volumes of data.
- **Policy change:** The Antigravity project uses NLP techniques to automatically classify large volumes of raw data (00_Raw), extract the key knowledge, and turn it into wiki pages.

## 🔗 Knowledge Links (Graph)

- [[LLM|LLM]], Word-Embeddings, [[Transformer-Architecture|Transformer-Architecture]], Information-Extraction
- **Raw Source:** 10_Wiki/Topics/AI/Natural-Language-Processing.md
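
The Tokenization & Embedding step named in this note can be illustrated with a minimal sketch. It assumes a simple regex word tokenizer and a randomly initialized lookup table. The function names (`tokenize`, `build_vocab`, `make_embeddings`, `embed`) are hypothetical and do not come from any particular library. Real systems typically use subword tokenizers and learn the vector values during training.

```python
import random
import re


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple scheme)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(tokens):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def make_embeddings(vocab_size, dim, seed=0):
    """Create one random vector per token id (assumption: stands in for learned values)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def embed(text, vocab, table):
    """Map text to a list of vectors, skipping tokens missing from the vocabulary."""
    return [table[vocab[t]] for t in tokenize(text) if t in vocab]


tokens = tokenize("Machines read text. Machines generate text.")
vocab = build_vocab(tokens)
table = make_embeddings(len(vocab), dim=4)
vectors = embed("machines generate text", vocab, table)
```

Here `vectors` holds one 4-dimensional vector per input token. The lookup table is the part that a trained model would replace with learned embeddings, so that tokens with similar meaning end up close together in the vector space.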