2nd/10_Wiki/Topics/Coding/AI_Self_Reflection_Deep.md
2026-05-10 22:08:15 +09:00


id: ai-self-reflection-deep
title: Self-Reflection: the agent evaluates its own answer
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: ai, agents, reflection, vibe-coding
tech_stack: language: TS / Python; applicable_to: AI
applied_in:
aliases: self-reflection, critique, ReAct, Reflexion, chain-of-thought, self-consistency

Self-Reflection

After producing an output, the agent reviews it itself ("Is this any good?") and runs a critique → revise loop. Related techniques: Reflexion, self-consistency, verifiers. Cost goes up, quality goes up.

📖 Core concepts

  • A one-shot answer is likely to contain bugs.
  • The critic is the model itself.
  • Iterating (revising) raises quality.
  • The cost (N× LLM calls) has to pay for itself.

💻 Code patterns

Self-critique

async function withReflection(task: string, maxRounds = 3) {
  let solution = await llm.complete({ prompt: task });
  
  for (let i = 0; i < maxRounds; i++) {
    const critique = await llm.complete({
      system: 'Find issues with this answer. If perfect, say "OK".',
      prompt: `Task: ${task}\n\nAnswer: ${solution}`,
    });
    
    if (/^ok\W*$/i.test(critique.trim())) return solution;  // accept only a bare "OK", not "Okay, but..."
    
    solution = await llm.complete({
      system: 'Improve based on feedback.',
      prompt: `Task: ${task}\n\nAnswer: ${solution}\n\nIssues: ${critique}`,
    });
  }
  
  return solution;
}

→ Each round costs 2 LLM calls (critique + revise).

Reflexion (memory + reflection)

class Reflexion:
    def __init__(self):
        self.memory = []  # past reflections
    
    def solve(self, task):
        for attempt in range(3):
            solution = self.llm(task, context=self.memory)
            
            # Test
            ok = self.test(solution)
            if ok:
                return solution
            
            # Reflect on failure
            reflection = self.llm(f'Why did this fail? task={task}, solution={solution}')
            self.memory.append(reflection)
        
        return solution

→ The next attempt draws on the stored reflections.
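The class above leaves `self.llm` and `self.test` undefined. A runnable sketch of the same loop, assuming the model, the test, and the reflection step are injected as plain callables (`generate`, `run_test`, and `reflect` are hypothetical names, and the `fake_*` functions are toy stand-ins, not real model calls):

```python
from typing import Callable, List, Optional, Tuple

def reflexion_solve(
    task: str,
    generate: Callable[[str, List[str]], str],    # hypothetical: (task, memory) -> solution
    run_test: Callable[[str], Tuple[bool, str]],  # hypothetical: solution -> (ok, error)
    reflect: Callable[[str, str, str], str],      # hypothetical: (task, solution, error) -> lesson
    max_attempts: int = 3,
) -> Tuple[Optional[str], List[str]]:
    """Retry a task, feeding each failure's reflection into the next attempt."""
    memory: List[str] = []
    for _ in range(max_attempts):
        solution = generate(task, memory)
        ok, error = run_test(solution)
        if ok:
            return solution, memory
        memory.append(reflect(task, solution, error))
    return None, memory  # every attempt failed

# Toy stand-ins: the "model" only gets it right once it has a reflection to read.
def fake_generate(task: str, memory: List[str]) -> str:
    return "return a + b" if memory else "return a - b"

def fake_test(solution: str) -> Tuple[bool, str]:
    ok = solution == "return a + b"
    return ok, ("" if ok else "expected 3, got -1")

def fake_reflect(task: str, solution: str, error: str) -> str:
    return f"Attempt {solution!r} failed: {error}"

solution, memory = reflexion_solve("add two numbers", fake_generate, fake_test, fake_reflect)
```

Returning `None` on exhaustion, rather than the last failing solution, is a design choice of this sketch: the caller can tell a verified answer from an unverified one.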

Self-consistency (voting)

async function selfConsistency(task: string, n: number = 5) {
  const answers = await Promise.all(
    Array(n).fill(0).map(() => llm.complete({ prompt: task, temperature: 0.7 }))
  );
  
  // Majority vote
  const counts = new Map<string, number>();
  for (const a of answers) {
    counts.set(a, (counts.get(a) ?? 0) + 1);
  }
  
  return [...counts.entries()].sort((a, b) => b[1] - a[1])[0][0];
}

→ 5 answers plus a vote. Reduces hallucination, at 5x the cost.
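Exact string matching, as in the vote above, splits votes between answers that differ only in formatting. A minimal sketch of a normalized vote follows; the `normalize` rule is an assumption suited to short answers, and free-form text would need a task-specific answer extractor instead.

```python
from collections import Counter
from typing import List, Tuple

def normalize(answer: str) -> str:
    # Assumption: answers are short, so case, surrounding whitespace,
    # and trailing punctuation carry no meaning.
    return answer.strip().lower().rstrip(".!")

def majority_vote(answers: List[str]) -> Tuple[str, float]:
    """Return the most common normalized answer and its share of the votes."""
    counts = Counter(normalize(a) for a in answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)

winner, agreement = majority_vote(["42", " 42.", "41", "42", "Forty-two"])
```

The agreement ratio doubles as a rough confidence signal: a low share suggests the samples disagree and the winner should not be trusted blindly.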

Chain of thought (CoT)

const prompt = `
Question: ${q}

Think step by step:
1. ...
2. ...
3. ...

Answer: ...
`;

→ Implicit reflection: the reasoning becomes visible.

Tree of Thought (ToT)

Q: Game of 24 (reach 24 using the four arithmetic operations)

Tree:
- Branch 1: 8 + 16 = 24? Where does the 8 come from?
  - Sub: 6 + 2 = 8 → 8 + 16 = 24 ✓
- Branch 2: 12 × 2 = 24
  - ...

→ At each step, explore the candidate paths, evaluate them, and keep the best one.
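The tree structure can be sketched without a model. In the version below the "propose" step enumerates every pair and operation and the "evaluate" step is an exact check, so it is an exhaustive depth-first search rather than ToT proper, where an LLM would propose and score branches and prune the weak ones. All function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from typing import List, Optional, Tuple

Node = Tuple[Fraction, str]  # (value, expression that produced it)

def expand(a: Node, b: Node) -> List[Node]:
    """Propose every child reachable by combining two numbers."""
    (x, ex), (y, ey) = a, b
    children = [
        (x + y, f"({ex} + {ey})"),
        (x * y, f"({ex} * {ey})"),
        (x - y, f"({ex} - {ey})"),
        (y - x, f"({ey} - {ex})"),
    ]
    if y != 0:
        children.append((x / y, f"({ex} / {ey})"))
    if x != 0:
        children.append((y / x, f"({ey} / {ex})"))
    return children

def search(nodes: List[Node], target: int = 24) -> Optional[str]:
    """Depth-first search: merge two numbers, recurse on the smaller state."""
    if len(nodes) == 1:
        return nodes[0][1] if nodes[0][0] == target else None
    for i, j in combinations(range(len(nodes)), 2):
        rest = [n for k, n in enumerate(nodes) if k not in (i, j)]
        for child in expand(nodes[i], nodes[j]):
            found = search(rest + [child], target)
            if found is not None:
                return found
    return None

def solve24(numbers: List[int]) -> Optional[str]:
    # Fractions keep division exact, so 8 / 3 * 9 compares equal to 24.
    return search([(Fraction(n), str(n)) for n in numbers])

expression = solve24([1, 2, 3, 4])
no_solution = solve24([1, 1, 1, 1])
```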

Verifier (separate model)

async function verify(solution: string, task: string) {
  const r = await verifierLLM.complete({
    system: 'Rate solution 0-10. Return JSON: {score, reason}.',
    prompt: `Task: ${task}\nSolution: ${solution}`,
  });
  return JSON.parse(r.text);
}

// 5 candidates → the verifier picks the best one
const candidates = await Promise.all([...]);
const scores = await Promise.all(candidates.map(c => verify(c, task)));
const best = candidates[indexOfMax(scores.map(s => s.score))];
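The snippet above relies on an `indexOfMax` helper that is not shown. The selection step can be sketched as follows, with `score` as a hypothetical callable standing in for the verifier model; the toy scorer (answer length) is only there to make the example runnable.

```python
from typing import Callable, List, Tuple

def pick_best(candidates: List[str], score: Callable[[str], float]) -> Tuple[str, float]:
    """Score every candidate once and return the highest-scoring one with its score."""
    if not candidates:
        raise ValueError("need at least one candidate")
    scores = [score(c) for c in candidates]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return candidates[best_index], scores[best_index]

# Toy verifier: scores by length. A real one would call a separate model.
best, best_score = pick_best(["a", "abc", "ab"], score=len)
```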

Reward model (RLHF-style)

# A trained model assigns the scores
reward_model = load_model('reward')

candidates = [llm(task) for _ in range(8)]
scores = reward_model.predict(candidates)
best = candidates[scores.argmax()]

→ OpenAI o1 / DeepSeek R1 follow the RL-trained approach.

Test execution (code)

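async function withTest(prompt: string, maxAttempts = 3) {
  for (let i = 0; i < maxAttempts; i++) {
    const code = await llm.complete({ prompt });
    
    // runInSandbox is assumed to return { ok, error }
    const result = await runInSandbox(code);
    if (result.ok) return code;
    
    // Feed the failing attempt and its error back into the next prompt
    prompt = `${prompt}\n\nPrevious attempt:\n${code}\n\nError: ${result.error}\nFix it.`;
  }
  // Fail loudly instead of silently returning undefined
  throw new Error(`No passing solution after ${maxAttempts} attempts`);
}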

→ Tests are ground truth: a stronger signal than an LLM critique.

LLM-as-judge

async function judge(answerA: string, answerB: string, task: string) {
  const r = await llm.complete({
    system: 'Compare. Return "A", "B", or "tie".',
    prompt: `Task: ${task}\nA: ${answerA}\nB: ${answerB}`,
  });
  return r.text;
}

→ This is what eval frameworks use.

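Self-refine (iterative improvement)

Loop:
  Generate → Feedback → Refine

Stop:
- N rounds reached
- "OK" feedback
- Quality plateau

async function selfRefine(task: string, maxRounds = 5) {
  let answer = await llm.complete({ prompt: task });
  let prevQuality = -1;
  
  // Bounded loop: stops after maxRounds even if the rating keeps creeping up
  for (let round = 0; round < maxRounds; round++) {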
    const feedback = await llm.complete({
      prompt: `Task: ${task}\nAnswer: ${answer}\n\nFeedback (concrete improvements):`,
    });
    
    const refined = await llm.complete({
      prompt: `Task: ${task}\nAnswer: ${answer}\nFeedback: ${feedback}\n\nRefined:`,
    });
    
    const quality = await rate(refined);
    if (quality <= prevQuality) break;  // plateau
    
    answer = refined;
    prevQuality = quality;
  }
  
  return answer;
}

CRITIC (with tool)

# Self-critique combined with tool-based fact verification
def critique_with_tool(answer):
    claims = extract_claims(answer)
    for c in claims:
        result = web_search(c)
        if not consistent(c, result):
            return False, c
    return True, None

→ Detects "my answer disagrees with what the web says".

Confidence calibration

async function answerWithConfidence(q: string) {
  const r = await llm.complete({
    system: 'Return JSON: {"answer": string, "confidence": number between 0 and 1 (calibrated)}.',
    prompt: q,
  });
  return JSON.parse(r.text);
  // { answer: '...', confidence: 0.7 }
}

// Low confidence → search / ask the user
const r = await answerWithConfidence(q);
if (r.confidence < 0.5) {
  // Search / clarify
}

→ Makes what the model does not know visible.
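Model output does not always parse as JSON, so the routing step benefits from a defensive parser. A minimal sketch, assuming the model was asked for `{"answer": ..., "confidence": ...}`; the function names and the 0.5 threshold are illustrative, and malformed output is treated as zero confidence.

```python
import json
from typing import Tuple

def parse_confident_answer(raw: str) -> Tuple[str, float]:
    """Parse the JSON reply; anything malformed counts as zero confidence."""
    try:
        data = json.loads(raw)
        answer = str(data["answer"])
        confidence = float(data["confidence"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return raw, 0.0
    # Clamp to [0, 1] in case the model returns something like 7 or -1.
    return answer, min(max(confidence, 0.0), 1.0)

def route(raw: str, threshold: float = 0.5) -> str:
    """Decide whether to answer directly or fall back to search / a clarifying question."""
    _, confidence = parse_confident_answer(raw)
    return "answer" if confidence >= threshold else "search_or_ask_user"

high = route('{"answer": "Paris", "confidence": 0.9}')
low = route('{"answer": "maybe Lyon", "confidence": 0.2}')
broken = route("I think it is Paris")
```

Self-reported confidence is only useful if it correlates with correctness, so the threshold is worth checking against a labeled sample before relying on it.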

Cost

1 shot:              $0.05
Critique:            $0.10 (2x)
Reflexion, 3 rounds: $0.30 (6x)
Self-consistency, 5: $0.25 (5x)
ToT, 3-deep:         $0.50+ (10x)

→ Does the quality gain justify the cost?
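The figures above follow from multiplying a per-call price by the number of calls. A small sketch of that arithmetic, assuming every call costs the same $0.05; in practice later calls carry longer prompts, so this is a lower bound. The call counts are read off the multipliers in the table.

```python
def estimate_cost(calls: int, cost_per_call: float = 0.05) -> float:
    """Flat-rate estimate: number of LLM calls times a fixed per-call price."""
    return round(calls * cost_per_call, 2)

# Call counts implied by the multipliers in the table above.
CALLS = {
    "1 shot": 1,
    "critique": 2,
    "reflexion, 3 rounds": 6,     # 3 x (attempt + reflection)
    "self-consistency, n=5": 5,
    "ToT, 3-deep": 10,            # at least; depends on the branching factor
}

costs = {name: estimate_cost(n) for name, n in CALLS.items()}
```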

Latency

Sequential (critique + revise): each round is 2 LLM calls, about 5 sec.
Parallel (self-consistency): 1 round.

→ For user-facing paths, prefer parallel.
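The difference can be simulated with `asyncio`. In this sketch `fake_llm` is a stand-in that just sleeps for a fixed time (an assumption, not a real client): sequential rounds add their latencies, while parallel samples overlap.

```python
import asyncio
import time

CALL_SECONDS = 0.05  # stand-in for one LLM round trip

async def fake_llm(prompt: str) -> str:
    await asyncio.sleep(CALL_SECONDS)
    return f"answer to {prompt}"

async def sequential(rounds: int) -> float:
    """Critique then revise, one call after another; returns elapsed seconds."""
    start = time.perf_counter()
    for i in range(rounds):
        await fake_llm(f"critique {i}")
        await fake_llm(f"revise {i}")
    return time.perf_counter() - start

async def parallel(n: int) -> float:
    """n independent samples issued at once; returns elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_llm(f"sample {i}") for i in range(n)))
    return time.perf_counter() - start

seq_time = asyncio.run(sequential(rounds=2))  # 4 calls back to back
par_time = asyncio.run(parallel(n=5))         # 5 calls overlapping
```

Parallel sampling costs more calls than two critique rounds here, yet finishes in roughly the time of a single call, which is why it suits user-facing paths.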

When to use it?

✓ Math / accuracy-critical tasks
✓ Code generation
✓ Research / multi-step
✓ Critical (legal, medical)
✗ Casual chat
✗ Speed-sensitive
✗ Cost-sensitive

Evaluation

# Does reflection actually help?
without = [llm(q) for q in tasks]
with_ = [reflection_loop(q) for q in tasks]

acc_without = evaluate(without)
acc_with = evaluate(with_)
print(f'Improvement: {acc_with - acc_without:.2%}')
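A single accuracy delta can hide churn, since reflection may break answers that were already correct. A sketch of a paired comparison, assuming exact-match scoring against gold labels (the `compare` function and the toy data are illustrative):

```python
from typing import Dict, List

def compare(gold: List[str], without: List[str], with_: List[str]) -> Dict[str, float]:
    """Paired comparison on the same tasks, using exact match against gold."""
    n = len(gold)
    base = [w == g for w, g in zip(without, gold)]
    refl = [w == g for w, g in zip(with_, gold)]
    return {
        "acc_without": sum(base) / n,
        "acc_with": sum(refl) / n,
        "fixed": sum((not b) and r for b, r in zip(base, refl)),   # wrong -> right
        "broken": sum(b and (not r) for b, r in zip(base, refl)),  # right -> wrong
    }

# Toy data: reflection fixes one answer and breaks another.
report = compare(
    gold=["a", "b", "c", "d"],
    without=["a", "x", "c", "y"],
    with_=["a", "b", "z", "y"],
)
```

In the toy data the two accuracies are equal, yet one answer was fixed and one was broken, which the plain difference would not show.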

Pitfalls

- The critique hallucinates (invents fake issues)
- Says "OK" too early (sycophancy)
- Infinite loop (revisions that change nothing that matters)
- Cost explosion
- Higher latency
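One guard against loops of immaterial changes is to stop as soon as a revision barely differs from its input. A minimal sketch using the standard library's `difflib`; the 0.98 similarity threshold and the `revise` callable are assumptions, and character-level similarity is only a proxy for "nothing that matters changed".

```python
from difflib import SequenceMatcher
from typing import Callable, Tuple

def is_trivial_change(before: str, after: str, threshold: float = 0.98) -> bool:
    """True when the revision is (almost) identical to its input."""
    return SequenceMatcher(None, before, after).ratio() >= threshold

def refine_until_stable(
    answer: str,
    revise: Callable[[str], str],  # hypothetical: one critique + revise step
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Revise until the text stops changing or the round budget runs out."""
    for round_number in range(1, max_rounds + 1):
        revised = revise(answer)
        if is_trivial_change(answer, revised):
            return revised, round_number
        answer = revised
    return answer, max_rounds

# Toy reviser: appends "!" once, then leaves the text alone.
final, rounds = refine_until_stable("ok", lambda s: s if s.endswith("!") else s + "!")
```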

OpenAI o1 / DeepSeek R1

Chain-of-thought trained with RL.
Automatic reflection / verification.

→ Reflection happens inside a single LLM call.

Anthropic's recommendation

"Strong base model + simple prompt > complex reflection on weak model".

→ The value of explicit reflection may shrink as the model gets larger.

Multi-step verify

Q: "Is this medical claim accurate?"

Step 1: Identify claims.
Step 2: Search recent literature.
Step 3: Check consistency.
Step 4: Verdict + citation.

→ Explicit steps work better than implicit ones.
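The four steps can be sketched as a pipeline that keeps the sources next to each claim, so the verdict can cite them. The three callables are hypothetical injection points for an LLM and a search tool, and the lambdas below are toy stand-ins with made-up data.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ClaimCheck:
    claim: str
    sources: List[str]
    supported: bool

def verify_answer(
    answer: str,
    extract_claims: Callable[[str], List[str]],       # step 1 (hypothetical)
    search: Callable[[str], List[str]],               # step 2 (hypothetical)
    is_consistent: Callable[[str, List[str]], bool],  # step 3 (hypothetical)
) -> List[ClaimCheck]:
    """Run steps 1-3 per claim and keep the evidence alongside each result."""
    checks = []
    for claim in extract_claims(answer):
        sources = search(claim)
        checks.append(ClaimCheck(claim, sources, is_consistent(claim, sources)))
    return checks

def verdict(checks: List[ClaimCheck]) -> str:
    # Step 4: accept only if every claim is supported by at least one source.
    all_supported = bool(checks) and all(c.supported and c.sources for c in checks)
    return "supported" if all_supported else "needs review"

# Toy stand-ins: split on sentences, "find" a source only for claims about water.
checks = verify_answer(
    "Water boils at 100 C at sea level. The moon is made of cheese.",
    extract_claims=lambda a: [s.strip() for s in a.split(".") if s.strip()],
    search=lambda claim: ["doc-1"] if "Water" in claim else [],
    is_consistent=lambda claim, sources: bool(sources),
)
result = verdict(checks)
```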

🤔 Decision criteria

Situation            Recommendation
Accuracy-critical    Reflection + verifier
Math / code          Test execution
Research             CoT + critique
User-facing chat     1 shot + retry on flag
Voting               Self-consistency
Search / decompose   ToT
Modern LLM (o1/R1)   Built-in reflection

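Anti-patterns

  • Reflection on every task: wasted cost.
  • Infinite loop: no stop condition.
  • Trusting the critique 100%: the critic is fallible too.
  • Ignoring cost / latency: unusable in production.
  • No eval: no way to tell whether anything improved.
  • Weak model + reflection: a strong model's one-shot answer is better.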

🤖 Hints for LLM use

  • Reflection = quality ↑ + cost ↑.
  • Test (code) > LLM critique.
  • Self-consistency is the answer when voting is what you need.
  • Strong models (o1) reflect on their own.

🔗 Related documents