id: wiki-2026-0508-instruction-tuning
title: Instruction Tuning
category: 10_Wiki/Topics
status: duplicate
canonical_id: wiki-2026-0508-fine-tuning
duplicate_of: Fine-tuning
aliases: instruction tuning, IFT, FLAN, Alpaca, ShareGPT
source_trust_level: A
confidence_score: 0.96
verification_status: redirected
tags: duplicate, instruction-tuning, sft
last_reinforced: 2026-05-10
github_commit: pending

Instruction Tuning

This document is a specialization of Fine-tuning. It redirects to the canonical document.

Key summary (instruction-specific)

  • SFT on (instruction, response) pairs.
  • Examples: FLAN (Wei et al., 2021), Alpaca, ShareGPT, Dolly.
  • A prerequisite for RLHF / DPO.
  • LIMA (1,000 high-quality > 100k noisy): emphasis on data quality.
  • Modern: multi-turn + tool use + reasoning data.
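
As a rough illustration of the first point, the sketch below turns one (instruction, response) pair into token ids and training labels. The prompt template, the `ToyTokenizer` class, and the `build_sft_example` helper are all hypothetical and exist only for this example; a real pipeline would use the model's own tokenizer and chat template. Masking the prompt tokens so that loss is computed only on the response is one common choice, not the only one.

```python
from dataclasses import dataclass

# -100 is the default ignore_index of PyTorch's CrossEntropyLoss,
# so positions labelled with it contribute nothing to the loss.
IGNORE_INDEX = -100

# Hypothetical prompt template, chosen only for illustration.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


@dataclass
class SFTExample:
    input_ids: list[int]
    labels: list[int]


class ToyTokenizer:
    """Whitespace tokenizer that assigns ids on first sight (illustration only)."""

    def __init__(self) -> None:
        self.vocab: dict[str, int] = {"<eos>": 0}

    def encode(self, text: str) -> list[int]:
        # New tokens get the next free id; known tokens reuse their id.
        return [self.vocab.setdefault(tok, len(self.vocab)) for tok in text.split()]


def build_sft_example(tokenizer: ToyTokenizer, instruction: str, response: str) -> SFTExample:
    prompt_ids = tokenizer.encode(PROMPT_TEMPLATE.format(instruction=instruction))
    response_ids = tokenizer.encode(response) + [tokenizer.vocab["<eos>"]]
    input_ids = prompt_ids + response_ids
    # Assumption: mask the prompt so only response tokens are supervised.
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return SFTExample(input_ids=input_ids, labels=labels)


if __name__ == "__main__":
    example = build_sft_example(ToyTokenizer(), "Translate to French: cat", "chat")
    print(example.input_ids)
    print(example.labels)
```

The same masking idea extends to multi-turn data by masking every non-assistant turn, though the details depend on the chat template in use.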

🔗 Graph

🕓 Change history

Date        Change
2026-05-08  Phase 1
2026-05-10  Deduplicated: redirects to the canonical document