[G1-Sync] Manual knowledge update

2026-05-09 21:08:02 +09:00
parent f0befc887a
commit 93ec7e9056
363 changed files with 68333 additions and 64 deletions
@@ -0,0 +1,178 @@
+---
+id: ai-fine-tuning-vs-prompting
+title: "Fine-tuning vs Prompting: Decision Criteria"
+category: Coding
+status: draft
+source_trust_level: B
+verification_status: conceptual
+created_at: 2026-05-09
+updated_at: 2026-05-09
+tags: [ai, llm, fine-tuning, lora, vibe-coding]
+tech_stack: { language: "TS / Python", applicable_to: ["Backend"] }
+applied_in: []
+aliases: [fine-tuning, LoRA, RAG vs FT, distillation, prompt engineering]
+---
+
+# Fine-tuning vs Prompting
+
+> **Almost always start with prompting (+ RAG).** Fine-tuning is for narrow domains, consistent style, and latency / cost optimization. LoRA is the cheap way to do it. **New knowledge = RAG; new style / format = fine-tune.**
+
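+## 📖 Core concepts
+- Prompt: zero-shot / few-shot.
+- RAG: injects external knowledge at inference time.
+- Fine-tune (full): updates all weights; expensive.
+- LoRA / QLoRA: trains only a small number of adapter parameters; cheap.
+- Distillation: a small model learns to imitate a large model.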
+
+## 💻 Code patterns
+
+### Decision tree
+```
+Need new knowledge (facts)?
+  YES → RAG
+  NO → next
+
+Need consistent style / format / tone?
+  YES → fine-tune (LoRA)
+  NO → next
+
+Need to cut latency / cost?
+  YES → fine-tune a small model + distillation
+  NO → prompt only
+```
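
The tree above can also be written as a small function, which is handy when the decision has to be recorded or unit-tested. This is only a sketch: the `Needs` dataclass and `recommend` function are hypothetical names introduced here, and the checks run in the same top-to-bottom order as the tree.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    new_facts: bool = False              # model must know facts it was not trained on
    consistent_style: bool = False       # fixed style / format / tone
    lower_latency_or_cost: bool = False  # inference must get cheaper or faster


def recommend(needs: Needs) -> str:
    """Walk the decision tree top to bottom; the first YES wins."""
    if needs.new_facts:
        return "RAG"
    if needs.consistent_style:
        return "fine-tune (LoRA)"
    if needs.lower_latency_or_cost:
        return "fine-tune a small model + distillation"
    return "prompt only"


print(recommend(Needs(consistent_style=True)))
```

Because the first matching branch wins, a project that needs both new facts and a consistent tone gets "RAG" first; the tree is meant to be walked again once that step is in place.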
+
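+### Prompt → verify that it is sufficient
+```ts
+// Eval set of 100 test cases. loadEvalSet, evaluate, and promptModel are
+// placeholders for your own eval harness.
+const dataset = loadEvalSet();
+const score = await evaluate(promptModel, dataset);
+console.log('Pass:', score, '%'); // below 80% → fine-tune candidate
+```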
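
One possible shape for the `evaluate` step, sketched in Python. It assumes the simplest scoring rule, exact match against an expected answer; real eval sets often need fuzzier checks. `pass_rate`, `is_fine_tune_candidate`, and the stub model are hypothetical names for illustration, and the 80% threshold is the one used in the snippet above.

```python
from typing import Callable

EvalCase = tuple[str, str]  # (input, expected output)


def pass_rate(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Percentage of cases where the model output matches the expected text."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(1 for prompt, expected in cases if model(prompt).strip() == expected)
    return 100.0 * passed / len(cases)


def is_fine_tune_candidate(score: float, threshold: float = 80.0) -> bool:
    """Below the threshold, prompting alone is not meeting the bar."""
    return score < threshold


# Stub model standing in for a real prompt-based call (assumption).
cases = [("a", "A"), ("b", "B"), ("c", "x"), ("d", "D")]
score = pass_rate(str.upper, cases)
print(f"Pass: {score:.1f}% -> fine-tune candidate: {is_fine_tune_candidate(score)}")
```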
+
+### LoRA fine-tune (Hugging Face PEFT)
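+```python
+from datasets import load_dataset
+from peft import LoraConfig, get_peft_model
+from transformers import AutoModelForCausalLM, TrainingArguments
+from trl import SFTTrainer
+
+base = AutoModelForCausalLM.from_pretrained('meta-llama/Llama-3.1-8B-Instruct')
+
+# Train low-rank adapters on the attention query / value projections only.
+lora = LoraConfig(
+    r=16, lora_alpha=32, target_modules=['q_proj', 'v_proj'],
+    lora_dropout=0.05, bias='none', task_type='CAUSAL_LM',
+)
+model = get_peft_model(base, lora)
+
+# Assumed file name: chat-format JSONL with one record per line.
+dataset = load_dataset('json', data_files='train.jsonl', split='train')
+
+# Argument placement depends on the TRL version: newer releases move the
+# sequence-length setting from SFTTrainer to SFTConfig.
+trainer = SFTTrainer(
+    model=model,
+    train_dataset=dataset,
+    args=TrainingArguments(output_dir='./out', num_train_epochs=3, learning_rate=2e-4, per_device_train_batch_size=4),
+    max_seq_length=2048,
+)
+trainer.train()
+trainer.save_model('./lora-out')  # saves the adapter weights, not a full model copy
+```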
+
+→ 1,000-10,000 examples are usually enough: one GPU and a few hours.
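
To get a feel for what that dataset range means for a run, the sketch below converts example counts into optimizer steps using the batch size (4) and epoch count (3) from the training arguments above. It assumes a single GPU and no gradient accumulation unless specified. Wall-clock time is deliberately not estimated, since it depends on hardware and sequence length.

```python
import math


def optimizer_steps(num_examples: int, batch_size: int, epochs: int, grad_accum: int = 1) -> int:
    """Approximate optimizer steps for a single-GPU run."""
    steps_per_epoch = math.ceil(num_examples / (batch_size * grad_accum))
    return steps_per_epoch * epochs


for n in (1_000, 10_000):
    print(n, "examples ->", optimizer_steps(n, batch_size=4, epochs=3), "steps")
```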
+
+### OpenAI fine-tune (managed)
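+```ts
+import fs from 'node:fs';
+import OpenAI from 'openai';
+
+const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
+
+// 1. Format the training data as JSONL, one record per line:
+// {"messages":[{"role":"system","content":"..."},{"role":"user","content":"..."},{"role":"assistant","content":"..."}]}
+
+// 2. Upload the file
+const file = await openai.files.create({
+  file: fs.createReadStream('train.jsonl'),
+  purpose: 'fine-tune',
+});
+
+// 3. Create the fine-tuning job
+const job = await openai.fineTuning.jobs.create({
+  training_file: file.id,
+  model: 'gpt-4o-mini-2024-07-18',
+  hyperparameters: { n_epochs: 3 },
+});
+
+// 4. Wait for completion. waitForJob is a hypothetical helper that polls
+//    openai.fineTuning.jobs.retrieve(job.id) until the job reaches a terminal status.
+const completed = await waitForJob(job.id);
+const model = completed.fine_tuned_model;
+if (!model) throw new Error(`Fine-tuning job ended as ${completed.status}`);
+
+// 5. Use the fine-tuned model (messages is the usual chat message array)
+await openai.chat.completions.create({ model, messages });
+```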
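
The snippet above calls `waitForJob` without defining it. A minimal Python sketch of such a polling helper follows. It takes the retrieval function as a parameter so it can be exercised without network access. The status strings are the ones the OpenAI fine-tuning API reports for jobs; the poll interval and timeout defaults are arbitrary assumptions.

```python
import time
from typing import Any, Callable

TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(
    retrieve: Callable[[str], Any],
    job_id: str,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 6 * 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll until the job reaches a terminal status; return it on success, raise otherwise."""
    waited = 0.0
    while True:
        job = retrieve(job_id)
        if job.status == "succeeded":
            return job
        if job.status in TERMINAL_STATUSES:
            raise RuntimeError(f"fine-tuning job {job_id} ended as {job.status}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"fine-tuning job {job_id} is still {job.status}")
        sleep(poll_seconds)
        waited += poll_seconds
```

With the OpenAI Python SDK, `retrieve` would typically be `client.fine_tuning.jobs.retrieve`, and the returned job's `fine_tuned_model` field holds the model name to use for inference.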
+
+### Data (most important)
+```jsonl
+{"messages":[{"role":"system","content":"You are a customer support bot for Acme."},{"role":"user","content":"How do I reset my password?"},{"role":"assistant","content":"To reset: 1. Go to /forgot-password. 2. Enter your email. 3. Check inbox. We never email plain passwords."}]}
+```
+
+```
+Scale:
+- 50-100 examples = a starting point (small tasks)
+- 500-1000 = good results
+- 10000+ = large tasks (classification, etc.)
+```
+
+Quality > quantity. Consistency across examples is critical.
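
Since data quality matters most, it is worth checking the training file before uploading it. The sketch below validates chat-format JSONL: every non-empty line must be a complete JSON record with a `messages` list whose last entry is the assistant answer. The accepted role set is an assumption based on the example format in this note; `validate_chat_jsonl` is a hypothetical helper name.

```python
import json

VALID_ROLES = {"system", "user", "assistant"}  # assumption: roles used in this note


def validate_chat_jsonl(text: str) -> list[str]:
    """Return a list of problems; an empty list means every line looks usable."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {lineno}: missing 'messages' list")
            continue
        roles = [m.get("role") for m in messages if isinstance(m, dict)]
        roles_ok = all(isinstance(r, str) and r in VALID_ROLES for r in roles)
        if len(roles) != len(messages) or not roles_ok:
            problems.append(f"line {lineno}: malformed message or unexpected role")
        elif roles[-1] != "assistant":
            problems.append(f"line {lineno}: last message must be the assistant answer")
    return problems
```

A record that is pretty-printed across several lines fails the first check, which is the most common formatting mistake in hand-written JSONL.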
+
+### Evaluation (compare before and after fine-tuning)
+```ts
+const before = await evaluate(baseModel, evalSet);
+const after = await evaluate(fineTunedModel, evalSet);
+console.log('Before:', before, 'After:', after);
+```
+
+→ If the score does not improve, do not adopt the fine-tuned model.
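
A single aggregate score can hide regressions, so a per-case comparison on the same eval set is often more informative. The sketch below assumes each run yields one pass/fail flag per eval case, in the same order. The `min_gain_points` threshold is a hypothetical knob, not a value from this note.

```python
def compare_runs(before: list[bool], after: list[bool], min_gain_points: float = 2.0) -> dict:
    """Compare per-case pass/fail results from the same eval set."""
    if len(before) != len(after) or not before:
        raise ValueError("both runs must cover the same non-empty eval set")
    n = len(before)
    fixed = sum(1 for b, a in zip(before, after) if a and not b)   # newly passing
    broken = sum(1 for b, a in zip(before, after) if b and not a)  # regressions
    gain = 100.0 * (sum(after) - sum(before)) / n                  # percentage points
    return {"fixed": fixed, "broken": broken, "gain_points": gain, "adopt": gain >= min_gain_points}


result = compare_runs(
    before=[True, False, False, True, False],
    after=[True, True, False, True, True],
)
print(result)
```

Looking at `broken` alongside the net gain shows whether the fine-tuned model traded old successes for new ones.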
+
+### Distillation (large → small)
+```
+GPT-4o (large) generates the answers → fine-tune GPT-4o-mini (small) on that data
+→ the small model reaches similar accuracy on the target task, at roughly 10x lower cost / latency
+```
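
The data-generation half of that flow can be sketched as follows. The teacher is passed in as a plain function so the example stays independent of any provider SDK. `build_distillation_jsonl` is a hypothetical helper, and the output uses the same chat-format JSONL as the fine-tuning examples in this note.

```python
import json
from typing import Callable, Iterable


def build_distillation_jsonl(
    teacher: Callable[[str], str],
    prompts: Iterable[str],
    system_prompt: str,
) -> str:
    """Turn teacher answers into chat-format JSONL for fine-tuning the student."""
    lines = []
    for prompt in prompts:
        record = {
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": teacher(prompt)},
            ]
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    # One record per line, each terminated by a newline.
    return "".join(line + "\n" for line in lines)


# Stub teacher standing in for a call to the large model (assumption).
print(build_distillation_jsonl(lambda p: p.upper(), ["hello"], "Be brief."), end="")
```

Using the same system prompt here as in production keeps the student's training distribution aligned with how it will actually be called.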
+
+### When NOT to fine-tune
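+- Adding facts / knowledge → use RAG.
+- Requirements change often → editing a prompt is faster than retraining.
+- Few-shot prompting is already sufficient.
+- Too little data (<50 examples).
+- The eval score does not improve.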
+
+### Cost comparison (rough)
+```
+Prompt:             $0 dev cost, $$ per token (large prompt = expensive)
+RAG:                $$ infra (vector DB) + $ inference
+Fine-tune:          $$$ one-time training (~$50-500) + $ inference (cheaper than a large model)
+LoRA (self-hosted): $ GPU (~$10-50)
+```
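
Whether the one-time training cost pays off depends on request volume. The sketch below computes the break-even point. The $200 training cost sits inside the rough range listed above, while the per-request prices are made-up placeholders to be replaced with real numbers.

```python
import math


def break_even_requests(
    training_cost: float,
    cost_per_request_before: float,
    cost_per_request_after: float,
) -> float:
    """Requests needed before per-request savings repay the training cost."""
    saving = cost_per_request_before - cost_per_request_after
    if saving <= 0:
        return math.inf  # the fine-tuned model is not cheaper per request
    return training_cost / saving


# Hypothetical prices: $0.010 per request with a long prompt on a large model,
# $0.002 per request with a short prompt on a fine-tuned small model.
n = break_even_requests(200.0, 0.010, 0.002)
print(f"break-even after about {round(n):,} requests")
```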
+
+## 🤔 Decision criteria
+| Goal | Recommendation |
+|---|---|
+| New facts / knowledge | RAG |
+| Consistent style / tone | Fine-tune |
+| Specific format (JSON) | Prompt + structured output |
+| Lower latency | Fine-tune a small model + distill |
+| Lower cost | Distill or run locally |
+| Fast prototype | Prompt only |
+
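+## ❌ Anti-patterns
+- **Trying fine-tuning first**: an expensive detour when prompt + RAG would have been enough.
+- **Training on bad data**: garbage in, garbage out.
+- **Launching without an eval**: you do not know how well it performs.
+- **Too little data (10 examples)**: the model overfits.
+- **Same data for train and test**: the score is inflated and meaningless.
+- **System prompt differs from the training data**: production behavior diverges from what was trained.
+- **Cloud + provider lock-in**: switching later is hard.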
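
The train / test leakage anti-pattern can be avoided mechanically. One way, sketched below, is to assign each example to a split from a hash of its content, so duplicates always land on the same side and the assignment is stable across runs. The function names and the 10% default test fraction are assumptions for illustration.

```python
import hashlib


def split_bucket(example_text: str, test_fraction: float = 0.1) -> str:
    """Assign an example to 'train' or 'test' from a hash of its content."""
    digest = hashlib.sha256(example_text.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1).
    position = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if position < test_fraction else "train"


def split(examples: list[str], test_fraction: float = 0.1) -> tuple[list[str], list[str]]:
    """Split examples into (train, test) with no example appearing in both."""
    train, test = [], []
    for text in examples:
        (test if split_bucket(text, test_fraction) == "test" else train).append(text)
    return train, test


examples = [f"example {i}" for i in range(200)] + ["example 0"]  # one duplicate
train, test = split(examples, test_fraction=0.2)
print(len(train), len(test), "overlap:", len(set(train) & set(test)))
```

This only guards against exact duplicates; near-duplicates, such as the same question with different whitespace, need normalization before hashing.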
+
+## 🤖 Hints for LLM use
+- Prompt + RAG solves roughly 80% of cases.
+- Fine-tuning is the last card to play, and only once data and an eval are in place.
+- LoRA is cheap, so it is worth trying.
+
+## 🔗 Related documents
+- [[AI_Prompt_Engineering_Patterns]]
+- [[AI_RAG_Pattern_Basics]]
+- [[AI_LLM_Eval_Patterns]]
+- [[AI_Local_LLM_Inference]]