---
id: wiki-2026-0508-iterative-prompting
title: Iterative Prompting
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [iterative prompting, refinement, self-refine, chain-of-verification, agent loop]
duplicate_of: none
source_trust_level: A
confidence_score: 0.92
verification_status: applied
tags: [llm, prompt-engineering, iterative, self-refine, cove, agent]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: LangChain / OpenAI / Anthropic
---

# Iterative Prompting

## One-line summary

> **"Not a single prompt: multiple rounds raise quality."** Covers self-refine, CoVe (Chain-of-Verification), self-consistency, debate, and agent loops. Modern reasoning models (o1, R1) run this kind of iteration internally. Trade-off: higher cost and higher latency.

## Core

### Patterns

- **Self-Refine** (Madaan 2023): generate → critique → refine.
- **CoVe** (Dhuliawala 2023): draft → plan verification questions → answer them independently → refine.
- **Self-Consistency** (Wang 2022): sample multiple times → majority vote.
- **Debate** (Du 2023): multiple LLMs argue and revise.
- **ReAct** (Yao 2022): reason + act loop.
- **Reflexion** (Shinn 2023): RL-style retries driven by verbal feedback.

### Applications

1. Math reasoning.
2. Code generation.
3. Long-form writing.
4. Fact verification.
5. Agent tasks.

## 💻 Patterns

All snippets assume an `llm` object exposing `generate(prompt, **kwargs) -> str`. Helpers that are called but not defined here (`extract_answer`, `execute`, `react_with_memory`, `evaluate`, `format_candidates`, `score`, `similarity`) are placeholders you supply yourself.

### Self-refine (Madaan)

```python
def self_refine(prompt, llm, max_iter=3):
    """Generate, critique, refine; stop when the critic says DONE or max_iter is hit."""
    output = llm.generate(prompt)
    for _ in range(max_iter):
        critique = llm.generate(
            f"Critique this output:\n{output}\n\n"
            "List specific issues. If perfect, say 'DONE'."
        )
        if 'DONE' in critique:
            return output
        output = llm.generate(
            f"Original: {prompt}\n"
            f"Previous output: {output}\n"
            f"Critique: {critique}\n\n"
            "Improved output:"
        )
    return output
```

### Chain-of-Verification (CoVe)

```python
def cove(question, llm):
    # 1. Initial answer
    initial = llm.generate(f'Answer: {question}')

    # 2. Plan verification questions
    verify_qs = llm.generate(
        f"Generate verification questions for:\n{initial}\n\n"
        "List as numbered questions."
    ).split('\n')

    # 3. Answer each independently (avoid bias from the initial answer)
    verifications = [llm.generate(f'Answer: {q}') for q in verify_qs if q.strip()]

    # 4. Refine
    return llm.generate(
        f"Original: {initial}\n"
        f"Verifications: {verifications}\n\n"
        "Refined answer accounting for any inconsistencies:"
    )
```

### Self-consistency

```python
from collections import Counter

def self_consistency(question, llm, n=10):
    """Sample N times, then take a majority vote."""
    answers = [llm.generate(question, temperature=0.7) for _ in range(n)]
    # extract_answer: placeholder that pulls the final answer out of a completion
    extracted = [extract_answer(a) for a in answers]
    return Counter(extracted).most_common(1)[0][0]
```

### Multi-agent debate

```python
def debate(question, agents, n_rounds=3):
    # Each agent needs a unique .name and a .generate(prompt) method
    answers = {a.name: a.generate(question) for a in agents}
    for _ in range(n_rounds):
        for a in agents:
            others = '\n'.join(f'{n}: {ans}' for n, ans in answers.items() if n != a.name)
            answers[a.name] = a.generate(
                f"Question: {question}\n"
                f"Other agents:\n{others}\n\n"
                "Refine your answer (or stick if confident):"
            )
    return answers
```

### ReAct (reason + act)

```python
def react(task, llm, tools, max_steps=10):
    history = [f'Task: {task}']
    for _ in range(max_steps):
        thought = llm.generate('\n'.join(history) + '\nThought:')
        history.append(f'Thought: {thought}')
        if 'final answer' in thought.lower():
            return thought
        action = llm.generate('\n'.join(history) + '\nAction:')
        # execute: placeholder that dispatches the action to one of the tools
        observation = execute(action, tools)
        history.append(f'Action: {action}\nObservation: {observation}')
    return None  # step budget exhausted without a final answer
```

### Reflexion (verbal RL)

```python
def reflexion(task, llm, tools, max_attempts=5):
    memory = []
    result = None  # stays None only if max_attempts is 0
    for attempt in range(max_attempts):
        # react_with_memory and evaluate are placeholders (agent run + success check)
        result = react_with_memory(task, llm, tools, memory)
        if evaluate(result):
            return result
        # Reflect on the failed attempt and carry the lesson into the next one
        reflection = llm.generate(
            f"Task: {task}\n"
            f"Attempt {attempt}: {result}\n\n"
            "What went wrong? What should I try differently?"
        )
        memory.append(reflection)
    return result
```

### Iterative refinement (writing)

```python
def iterative_writing(topic, llm, draft_iterations=3):
    draft = llm.generate(f'Outline: {topic}')
    for _ in range(draft_iterations):
        feedback = llm.generate(
            "Critique:\n"
            "- Clarity (1-10)\n"
            "- Argument strength (1-10)\n"
            "- Specific issues\n\n"
            f"{draft}"
        )
        draft = llm.generate(
            f"Revise based on feedback:\n{feedback}\n\n"
            f"Original:\n{draft}"
        )
    return draft
```

### Self-correction (math)

```python
def self_correct_math(problem, llm):
    answer = llm.generate(f'Solve step by step: {problem}')
    # Verify. An explicit sentinel avoids matching "error" inside "no errors found".
    check = llm.generate(
        f"Check this solution:\n{answer}\n\n"
        "Verify each step. If there is an error, point it out. "
        "If every step is correct, reply exactly 'VERIFIED'."
    )
    if 'VERIFIED' not in check:
        answer = llm.generate(
            f"Original: {answer}\n"
            f"Check: {check}\n\n"
            "Corrected solution:"
        )
    return answer
```

### Best-of-N + judge

```python
import re

def best_of_n(prompt, llm, judge_llm, n=8):
    candidates = [llm.generate(prompt, temperature=0.7) for _ in range(n)]
    # format_candidates: placeholder that renders the candidates with their indices
    judge_prompt = (
        "Pick best answer.\n"
        f"Candidates:\n{format_candidates(candidates)}\n\n"
        "Output just the index."
    )
    # Judges often add extra text: take the first integer, fall back to index 0
    match = re.search(r'\d+', judge_llm.generate(judge_prompt))
    best_idx = int(match.group()) if match else 0
    return candidates[best_idx] if best_idx < len(candidates) else candidates[0]
```

### Tree-of-Thoughts (ToT)

```python
def tree_of_thoughts(problem, llm, branching=3, depth=4):
    """At each step, generate several thoughts and keep the best paths."""
    paths = [[]]
    for _ in range(depth):
        new_paths = []
        for path in paths:
            thoughts = [llm.generate(f'{problem}\nPath: {path}\nNext thought:')
                        for _ in range(branching)]
            for t in thoughts:
                new_paths.append(path + [t])
        # Score and prune; score is a placeholder returning a number per path
        scored = [(score(p, problem, llm), p) for p in new_paths]
        paths = [p for _, p in sorted(scored, key=lambda x: -x[0])[:branching]]
    return paths[0]
```

### Cost monitoring

```python
def cost_aware_iterate(prompt, llm, budget_tokens):
    # Assumes the client exposes token usage of the last call as llm.last_usage
    used = 0
    output = llm.generate(prompt)
    used += llm.last_usage.total_tokens
    while used < budget_tokens:
        critique = llm.generate(f'Critique: {output}')
        used += llm.last_usage.total_tokens
        # Stop when the critic is satisfied or 90% of the budget is spent
        if 'DONE' in critique or used > budget_tokens * 0.9:
            return output
        output = llm.generate(f'Refine: {output} {critique}')
        used += llm.last_usage.total_tokens
    return output
```

### Stop criterion (auto)

```python
def auto_stop_iterate(prompt, llm, max_iter=5):
    prev = llm.generate(prompt)
    for _ in range(max_iter):
        new = llm.generate(f'Improve: {prev}')
        # similarity: placeholder returning a score in [0, 1]
        if similarity(prev, new) > 0.95:
            return new  # converged
        prev = new
    return prev
```

## Decision criteria

| Situation | Pattern |
|---|---|
| Math / reasoning | Self-consistency / CoVe |
| Code | Self-refine + execute check |
| Writing | Iterative refinement |
| Open-ended | Best-of-N + judge |
| Agent task | ReAct / Reflexion |
| Complex search | Tree-of-Thoughts |

**Defaults**: for reasoning, self-consistency (cheap) plus CoVe (verification). For agents, ReAct plus Reflexion. In every case, set a cost-aware budget cap.

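
The reasoning default above (self-consistency under a budget cap) can be sketched roughly as follows. This is a minimal illustration, not a reference implementation. The names `budgeted_self_consistency`, `max_calls`, and the `generate` callable are hypothetical and introduced here only for the example. The `extract_answer` shown is one assumed way to pull an answer out of a completion (it takes the last number), and the canned replies stand in for a sampled LLM.

```python
import re
from collections import Counter
from typing import Callable, Optional


def extract_answer(text: str) -> Optional[str]:
    """Hypothetical helper: treat the last number in a completion as its answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text)
    return numbers[-1] if numbers else None


def budgeted_self_consistency(
    question: str,
    generate: Callable[[str], str],
    n: int = 10,
    max_calls: int = 5,
) -> Optional[str]:
    """Sample at most min(n, max_calls) completions and return the majority answer.

    `generate` is an assumed stand-in for a sampled LLM call (temperature > 0).
    Completions with no extractable answer are skipped rather than counted.
    """
    votes: Counter = Counter()
    for _ in range(min(n, max_calls)):
        answer = extract_answer(generate(question))
        if answer is not None:
            votes[answer] += 1
    return votes.most_common(1)[0][0] if votes else None


# Usage with canned replies in place of a real model.
_canned = iter(["The answer is 42", "I get 41", "So 42", "no idea", "Result: 42"])
result = budgeted_self_consistency("What is six times seven?", lambda q: next(_canned))
print(result)
```

Capping by call count keeps the sketch simple. A token-based cap, as in the cost monitoring pattern, would need usage data from the client.
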
## 🔗 Graph

- Parent: [[Prompt_Engineering|Prompt-Engineering]]
- Variants: [[Self-Refine]] · [[Chain-of-Verification]] · [[ReAct]] · [[Reflexion]]
- Applications: [[Hallucination-in-LLMs]] · [[GRPO]] · [[Foundation-Models]]
- Adjacent: [[Best-of-N_Sampling]] · [[Self-Consistency]]

## 🤖 LLM usage

**When to use**: reasoning tasks, high-stakes outputs, agents.

**When not to use**: simple completions (the extra rounds waste cost).

## ❌ Anti-patterns

- **No stop criterion**: the loop can run forever.
- **No cost budget**: bill shock.
- **Same prompt every iteration**: no progress.
- **No diverse sampling** (self-consistency): every sample returns the same answer.
- **Skipping the judge**: best-of-N becomes useless.

## 🧪 Verification / duplicates

- Verified (Madaan Self-Refine 2023, Dhuliawala CoVe 2023, Wang Self-Consistency 2022, Yao ReAct/ToT, Shinn Reflexion 2023).
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: patterns + self-refine / CoVe / ReAct / Reflexion / ToT code |