---
id: P-REINFORCE-AUTO-PEFT-001
category: "10_Wiki/💡 Topics/AI"
confidence_score: 0.98
tags: [auto-reinforced, llm, fine-tuning, efficiency, adapters]
last_reinforced: 2026-04-20
---

# [[PEFT (Parameter-Efficient Fine-Tuning)|PEFT (Parameter-Efficient Fine-Tuning)]]

## 📌 One-Line Insight (The Karpathy Summary)
> "Change the light bulb instead of replacing the whole utility pole: a technique for maximizing efficiency that leaves a large model's full parameter set untouched and trains only a tiny fraction (under 1%), injecting specialized knowledge without a heavy hardware burden."

## 📖 Structured Knowledge (Synthesized Content)
Parameter-Efficient Fine-Tuning (PEFT) is a methodology for adapting a large language model (LLM) to a specific task by training only a small number of additional parameters instead of updating all of the weights.

1. **Key techniques**:
    * **LoRA (Low-Rank Adaptation)**: Learns the change to a weight matrix as the product of two low-rank matrices (A, B). It is the most popular technique and sharply reduces the memory and compute spent on gradients and optimizer state.
    * **Adapters**: Inserts small neural networks (adapters) between the existing model layers and trains only those.
    * **Prompt Tuning / Prefix Tuning**: Prepends trainable virtual "soft prompt" vectors to the model input and tunes only those vectors.
2. **Core benefits**:
    * **GPU memory savings**: Large models can be tuned on consumer GPUs without high-end servers.
    * **No parameter silos**: Instead of storing a full copy of the large model for every task, only a small PEFT module (checkpoint) is stored per task and swapped in as needed.
    * **Protection against catastrophic forgetting**: The original weights stay frozen, so the model's foundational knowledge is not overwritten.

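
The LoRA decomposition and the "under 1%" figure above can be made concrete with a small sketch. This is a minimal pure-Python illustration, not the HuggingFace PEFT API. The function names (`lora_param_counts`, `init_lora`, `effective_weight`) and the layer size of 4096 x 4096 with rank 8 are assumptions chosen for the example. It follows the usual LoRA convention: the frozen weight `W` is left alone, `A` starts with small random values, `B` starts at zero, and the learned update is scaled by `alpha / rank`.

```python
import random


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in
    lora = rank * (d_in + d_out)  # A is rank x d_in, B is d_out x rank
    return full, lora


def init_lora(d_out, d_in, rank, seed=0):
    """A gets small random values and B starts at zero, so B @ A is zero at first."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    B = [[0.0] * rank for _ in range(d_out)]
    return A, B


def effective_weight(W, A, B, alpha, rank):
    """W stays frozen; only the low-rank update (alpha / rank) * B @ A is learned."""
    scale = alpha / rank
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


# Hypothetical layer size: one 4096 x 4096 projection adapted with rank 8.
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

For this assumed layer, the adapter trains 65,536 parameters against 16,777,216 in the full matrix, roughly 0.39%. Because `B` starts at zero, the adapted model initially behaves exactly like the base model, and dropping the adapter restores the original weights.
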
## ⚠️ Contradictions and Updates (Contradictions & RL Update)
- **Conflict with earlier data**: Early on there was concern that "training only a fraction of the parameters will hurt performance," but studies have since reported results on par with full fine-tuning, and on some tasks better, which is attributed to reduced overfitting.
- **Policy change (RL Update)**: In environments where corporate security policy makes a "cloud API" hard to use, "On-premise PEFT", which tunes a local model on in-house data safely and at low cost, has emerged as a core data-governance strategy.

## 🔗 Knowledge Links (Graph)
- **Related**: [[SFT (Supervised Fine-Tuning)|SFT (Supervised Fine-Tuning)]], Foundational Models, [[Transfer Learning|Transfer Learning]], [[Large Language Models (LLM)|Large Language Models (LLM)]]
- **Modern Tech/Tools**: HuggingFace PEFT library, LoRA, QLoRA.

---