Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization
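Large-scale cleanup of 10_Wiki/Topics:
- Removed 227 error-capture and unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirect)
- Normalized link names: fixed broken links, pointed links directly at redirect targets, and mapped concepts (~2,400 changes)
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections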

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:52:15 +09:00

---
id: wiki-2026-0508-refinement
title: Refinement
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Iterative-Refinement, Self-Refine, Type-Refinement]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [llm, iteration, types, design]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: anthropic-sdk
---
# Refinement
## One-liner
> **"The first attempt is a draft; each refinement moves it toward a product."** Refinement is a family of patterns that polish a draft through a critique → revise loop to raise quality. LLM self-refine, type narrowing, and design iteration all share this core idea.
## Core
### LLM Self-Refine (Madaan et al. 2023 → mainstream by 2026)
- **Generate → Critique → Refine** loop. A single model plays all three roles.
- Effect: roughly +5 to 20% accuracy on math, code, and dialog tasks without extra training.
- **Reflexion** (Shinn 2023) is verbal RL: at the end of each episode, the self-critique is stored in episodic memory.
- 2026 standard: the native "extended thinking" mode in Claude Opus 4.7 / GPT-5 absorbs the refine step internally, so an external loop is reserved for high-stakes domains (legal, medical).
### Type Refinement (TypeScript / Flow / Python typing)
- **Narrowing**: narrows a value of a union type to a specific subtype (`typeof`, `instanceof`, discriminated union).
- **Refinement type**: a type with a predicate attached, e.g. `{x: number | x > 0}`. Used in Liquid Haskell and F*.
- 2026 TS 5.x: the `satisfies` operator plus control-flow analysis are strong enough to eliminate most manual casts.
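The refinement-type idea `{x: number | x > 0}` can be approximated in Python, with the caveat that mainstream Python type checkers do not verify predicates. The sketch below is an assumption-laden illustration, not how Liquid Haskell or F* work: `Positive`, `refine_positive`, and `safe_log_ratio` are hypothetical names, and the predicate is checked once at runtime, after which the `NewType` tag carries the result through the static types.

```python
import math
from typing import NewType

# Hypothetical refinement of float: values for which x > 0 has been checked.
Positive = NewType("Positive", float)


def refine_positive(x: float) -> Positive:
    """Check the predicate x > 0 at runtime, then tag the value as Positive."""
    if not x > 0:
        raise ValueError(f"expected x > 0, got {x!r}")
    return Positive(x)


def safe_log_ratio(a: Positive, b: Positive) -> float:
    # A type checker only accepts arguments tagged Positive, so the range
    # check lives in one place (refine_positive) instead of at every call site.
    return math.log(a / b)
```

At runtime `Positive` is a plain `float`; the guarantee holds only as long as every `Positive` value is created through `refine_positive`.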
### Design / Spec Refinement
- **Stepwise refinement** (Wirth 1971): moves from an abstract spec to a concrete implementation step by step.
- **BDD** (Given-When-Then) is its modern incarnation.
- AI-aided: spec → Claude → multiple impl candidates → human picks → refine.
### Applications
1. Self-refine on RAG answers to reduce hallucination.
2. Compile error → refine loop for code generation.
3. Progressive type narrowing in TS APIs.
4. Iterative tightening of product specs between PM and AI.
## 💻 Patterns
### Self-Refine loop (Anthropic SDK)
```python
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-opus-4-7"


def self_refine(task: str, max_iter: int = 3) -> str:
    # Generate: produce the initial draft.
    answer = client.messages.create(
        model=MODEL, max_tokens=2048,
        messages=[{"role": "user", "content": task}],
    ).content[0].text
    for _ in range(max_iter):
        # Critique: the same model reviews the current draft.
        critique = client.messages.create(
            model=MODEL, max_tokens=1024,
            system="You are a strict critic. List concrete flaws or reply 'NO_ISSUES'.",
            messages=[{"role": "user", "content": f"Task: {task}\n\nDraft:\n{answer}"}],
        ).content[0].text
        if "NO_ISSUES" in critique:
            return answer  # explicit termination signal
        # Refine: revise the draft using the critique.
        answer = client.messages.create(
            model=MODEL, max_tokens=2048,
            messages=[{"role": "user",
                       "content": f"Task: {task}\nDraft: {answer}\nCritique: {critique}\nRevise."}],
        ).content[0].text
    return answer  # iteration budget used up; return the latest revision
```
### Reflexion-style episodic memory
```python
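from typing import Any, Callable


class Reflexion:
    # `llm` (prompt -> text) and `environment` (attempt -> feedback object with
    # a boolean `.success`) are caller-supplied placeholders, not library APIs.
    def __init__(self, llm: Callable[[str], str], environment: Callable[[str], Any]):
        self.llm = llm
        self.environment = environment
        self.memory: list[str] = []  # accumulated lessons

    def step(self, task: str) -> str:
        # Only the five most recent lessons are fed back into the prompt.
        ctx = "\n".join(f"- {m}" for m in self.memory[-5:])
        attempt = self.llm(f"Task: {task}\nPast lessons:\n{ctx}\nAct.")
        feedback = self.environment(attempt)
        if not feedback.success:
            # Store a verbal self-critique so later episodes can use it.
            lesson = self.llm(f"Why did this fail? Task: {task}\nAttempt: {attempt}\nFeedback: {feedback}")
            self.memory.append(lesson)
        return attempt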
```
### TypeScript discriminated union narrowing
```typescript
type Result<T> =
  | { kind: 'ok'; value: T }
  | { kind: 'err'; error: Error };

function unwrap<T>(r: Result<T>): T {
  if (r.kind === 'err') throw r.error; // past this check, r is narrowed to the 'ok' variant
  return r.value; // typed as T, no cast
}
```
### Python TypeGuard refinement
```python
from typing import TypeGuard


def is_str_list(x: list[object]) -> TypeGuard[list[str]]:
    return all(isinstance(i, str) for i in x)


def join(items: list[object]) -> str:
    if is_str_list(items):
        return ", ".join(items)  # narrowed to list[str]
    raise TypeError("not str list")
```
### Compile-error refine loop (code generation)
```python
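def codegen_refine(spec: str, llm, run_pytest, max_iter: int = 5) -> str:
    # `llm` (prompt -> code) and `run_pytest` (code -> (ok, error_text)) are
    # caller-supplied placeholders, not library functions.
    code = llm(f"Write Python for: {spec}")
    for _ in range(max_iter):
        ok, err = run_pytest(code)
        if ok:
            return code
        # Feed the failure output back so the next attempt can fix it.
        code = llm(f"Spec: {spec}\nCode: {code}\nFailing: {err}\nFix.")
    raise RuntimeError("refinement budget exhausted")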
```
### Best-of-N + judge (ensemble refinement)
```python
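def best_of_n(prompt: str, llm, n: int = 5) -> str:
    # `llm(prompt, temperature=...)` is a caller-supplied placeholder.
    candidates = [llm(prompt, temperature=1.0) for _ in range(n)]
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(candidates))
    ranking = llm(f"Pick the best of these:\n{numbered}\nReturn only the index 0..{n-1}", temperature=0.0)
    digits = "".join(ch for ch in ranking if ch.isdigit())
    idx = int(digits) if digits else 0  # fall back to the first candidate
    return candidates[min(idx, n - 1)]  # clamp an out-of-range answer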
```
## Decision criteria
| Situation | Approach |
|---|---|
| Single-shot good enough (chat) | No refine (extra cost not justified) |
| High-stakes (legal/medical) | External self-refine + human review |
| Code with tests | Compile/test-driven refine |
| Long agentic task | Reflexion (episodic memory) |
| Reasoning math | Extended thinking (native), which already refines internally |
| TS API design | Narrowing + `satisfies` |
**Default**: prefer native extended thinking; if that falls short, add 1 to 2 iterations of external self-refine.
## 🔗 Graph
- Parent: [[Iteration]]
- Variants: [[Reflexion]] · [[Self-Consistency]] · [[Best-of-N]]
- Applications: [[RAG]] · [[Code-Generation]]
- Adjacent: [[TypeScript 타입 시스템 (TypeScript Type System)|Type-System]] · [[Stepwise-Refinement]] · [[BDD]]
## 🤖 LLM usage
**When**: high-stakes output, agentic loops, code with verifiable feedback.
**When not**: latency-sensitive UX and simple chat, where the extra latency and cost are not worth it.
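The cost side of this trade-off can be made concrete by counting model calls in a generate/critique/revise loop shaped like the `self_refine` pattern earlier on this page. The helper below is a back-of-the-envelope sketch; `self_refine_call_bounds` is a hypothetical name, and it assumes one call per draft, critique, and revision with no retries.

```python
def self_refine_call_bounds(max_iter: int) -> tuple[int, int]:
    """Return (best_case, worst_case) model calls for one self-refine run."""
    if max_iter < 1:
        return (1, 1)  # the loop never runs: only the initial draft
    best = 2  # initial draft + one critique that reports no issues
    worst = 1 + 2 * max_iter  # draft, then a critique and a revision per iteration
    return (best, worst)
```

With `max_iter=3` the loop costs between 2 and 7 sequential calls, so the worst case adds roughly 7× the latency of a single-shot answer.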
## ❌ Anti-patterns
- **Infinite refine loop**: no hard cap on max_iter → cost explosion.
- **Same-model critique only**: when critic = generator, both share the same blind spots. Mix models (Opus critic, Sonnet generator).
- **Refine without termination signal**: without an explicit stop such as "NO_ISSUES", the tweaking never ends.
- **Narrowing via type assertion**: using `as` in TS is not refinement; it is an unsafe cast.
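The first three anti-patterns can be avoided together with a loop that takes the generator and the critic as separate callables, enforces a hard iteration cap, and stops on an explicit token. This is a minimal sketch under stated assumptions: `guarded_refine`, `generate`, `critique`, and `STOP_TOKEN` are hypothetical names, and the two callables stand in for calls to two different models.

```python
from typing import Callable

STOP_TOKEN = "NO_ISSUES"  # explicit termination signal the critic is asked to emit


def guarded_refine(
    task: str,
    generate: Callable[[str], str],  # e.g. a wrapper around the generator model
    critique: Callable[[str], str],  # e.g. a wrapper around a different critic model
    max_iter: int = 2,               # hard cap: at most max_iter revisions
) -> str:
    draft = generate(task)
    for _ in range(max_iter):
        notes = critique(f"Task: {task}\n\nDraft:\n{draft}")
        if notes.strip() == STOP_TOKEN:
            break  # the critic found nothing to fix
        draft = generate(f"Task: {task}\nDraft: {draft}\nCritique: {notes}\nRevise.")
    return draft
```

Passing the models in as callables also makes the loop testable with stubs, without any network access.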
## 🧪 Verification / duplicates
- Verified (Madaan 2023 Self-Refine, Shinn 2023 Reflexion, TS handbook narrowing).
- Trust level: A.
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — full rewrite covering LLM self-refine, type narrowing, design iteration |