"An LLM re-checks its own answer: generate → verify → revise." CoVe (Chain-of-Verification, Dhuliawala et al. 2023), self-consistency, Self-Refine, and Reflexion all belong to this family. As of 2026, reasoning models (Claude Opus 4.7 with thinking, o3) have internalized self-verification, yet an explicit verify pass still adds accuracy where accuracy is critical.
Key points
Forms
Self-consistency (Wang 2022): sample N answers → take the majority vote.
Chain of Verification (CoVe): baseline answer → plan verification questions → answer them independently → final verified answer.
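The majority-vote step of self-consistency can be sketched as follows. This is a minimal illustration, not the paper's reference code: `self_consistency`, `make_fake_sampler`, and the canned outputs are hypothetical names and data, and `sample` stands in for any stochastic model call (temperature above zero).

```python
from collections import Counter
from typing import Callable


def self_consistency(sample: Callable[[str], str], prompt: str, n: int = 5) -> tuple[str, float]:
    """Sample n answers and return the most common one with its vote share."""
    # Normalize whitespace so "42" and "42 " count as the same answer.
    answers = [sample(prompt).strip() for _ in range(n)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n


def make_fake_sampler(outputs):
    """Hypothetical stand-in for a model: replays canned outputs in order."""
    it = iter(outputs)
    return lambda prompt: next(it)


sampler = make_fake_sampler(["42", "41", "42 ", "42", "40"])
answer, share = self_consistency(sampler, "What is 6 * 7?", n=5)
print(answer, share)
```

A low vote share is a useful signal in its own right: it suggests the samples disagree and the answer may deserve a heavier verification pass.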
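# Self-refine loop. `llm` is any prompt -> text helper and `task` is the problem statement; both are defined elsewhere.
draft = llm(f"Solve: {task}")
for _ in range(3):  # hard cap on revision rounds
    critique = llm(f"Critique:\n{draft}\nList concrete issues; 'NONE' if perfect.")
    if "NONE" in critique[:20]:  # critic found nothing to fix
        break
    draft = llm(f"Revise based on critique:\n{critique}\n\nDraft:\n{draft}")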
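# Reasoning model with an extended thinking budget. `hard_problem` is defined elsewhere.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
msg = client.messages.create(
    model="claude-opus-4-7",
    thinking={"type": "enabled", "budget_tokens": 16000},
    messages=[{"role": "user", "content": hard_problem}],
    max_tokens=20000,  # must be larger than budget_tokens
)
# Internal verification already happens within the thinking phase.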
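The CoVe pipeline listed under Forms can be sketched in the same style. This is a rough outline under assumptions rather than the authors' implementation: the prompt wording, `chain_of_verification`, and `fake_llm` are all hypothetical, and `llm` is any prompt-to-text callable.

```python
from typing import Callable


def chain_of_verification(llm: Callable[[str], str], question: str, max_questions: int = 3) -> str:
    """Baseline -> plan verification questions -> answer them -> final answer."""
    baseline = llm(f"Answer the question.\n\nQuestion: {question}")
    plan = llm(
        f"List up to {max_questions} short fact-check questions, one per line, "
        f"for this answer.\n\nQuestion: {question}\nAnswer: {baseline}"
    )
    checks = [q.strip() for q in plan.splitlines() if q.strip()][:max_questions]
    # Each check is answered without showing the baseline, so its errors are not copied forward.
    notes = "\n".join(f"Q: {q}\nA: {llm(f'Answer concisely: {q}')}" for q in checks)
    return llm(
        "Revise the draft so it agrees with the verification notes.\n\n"
        f"Question: {question}\nDraft: {baseline}\nVerification notes:\n{notes}"
    )


calls = []


def fake_llm(prompt: str) -> str:
    """Hypothetical stub model that records prompts and returns canned text."""
    calls.append(prompt)
    if prompt.startswith("Answer the question"):
        return "draft"
    if prompt.startswith("List up to"):
        return "Q1?\n\nQ2?\n"
    if prompt.startswith("Answer concisely"):
        return "fact"
    return "final"


result = chain_of_verification(fake_llm, "Who wrote the CoVe paper?")
print(result, len(calls))
```

Cost grows with the number of verification questions (one extra call each), which is why the sketch caps them with `max_questions`.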
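When to use: high-stakes accuracy, where a hallucination is costly, and where you can afford extra tokens because latency is a secondary concern.
When not to use: latency-critical paths (chat UI), or tasks with no ground truth to verify against (open-ended creative work).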
❌ Anti-patterns
Unbounded self-verify loop: a maximum-iteration cap is mandatory.
The same model verifying its own output: generator and verifier share blind spots and biases → verify with a different model.
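Both anti-patterns can be avoided in one loop, sketched below under assumptions: `refine_with_cross_check`, `fake_generator`, and `fake_verifier` are hypothetical names, and `generator` and `verifier` are meant to be two different models wrapped as prompt-to-text callables.

```python
from typing import Callable

LLM = Callable[[str], str]


def refine_with_cross_check(generator: LLM, verifier: LLM, task: str, max_iters: int = 3) -> tuple[str, bool]:
    """Revise a draft until a separate verifier approves it or the cap is hit."""
    draft = generator(f"Solve: {task}")
    for _ in range(max_iters):  # hard cap: the loop can never run forever
        critique = verifier(
            f"Critique:\n{draft}\nList concrete issues; reply 'NONE' if there are none."
        )
        if critique.strip().upper().startswith("NONE"):
            return draft, True
        draft = generator(f"Revise based on critique:\n{critique}\n\nDraft:\n{draft}")
    return draft, False  # cap reached; the last draft was never approved


def fake_generator(prompt: str) -> str:
    """Hypothetical stub: first draft is 'v1', every revision is 'v2'."""
    return "v2" if prompt.startswith("Revise") else "v1"


def fake_verifier(prompt: str) -> str:
    """Hypothetical stub: rejects 'v1', approves 'v2'."""
    return "NONE" if "v2" in prompt else "Off-by-one in step 2."


final, approved = refine_with_cross_check(fake_generator, fake_verifier, "toy task")
print(final, approved)
```

Returning the `approved` flag alongside the draft lets the caller distinguish a verified answer from one that merely ran out of iterations, so unapproved results can be escalated or flagged.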