---
id: wiki-2026-0508-dynamic-few-shot
title: Dynamic Few-Shot Selection
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [dynamic few-shot, in-context learning, ICL retrieval, RAG few-shot, kNN-prompting]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [prompt-engineering, few-shot, in-context-learning, rag, vector-search, llm, retrieval]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: LangChain / LlamaIndex / Faiss / Chroma
---

# Dynamic Few-Shot

## One-line summary

> **"Don't reuse static examples; retrieve the examples most similar to the query."** A RAG-style example pool: for each input, inject the N most relevant examples. Modern practice: hybrid retrieval (BM25 + dense) + diversity rerank + LLM-as-judge selection.

## Key points

### Motivation

- **Static**: every prompt carries the same examples.
- **Dynamic**: every query gets its best-matching examples.
- **Result**: higher accuracy and more efficient token use.

### Selection strategies

#### Similarity-based

- Cosine similarity on embeddings.
- Top-K nearest neighbours.

#### Diversity (MMR)

- Reduces redundancy.
- Gives broader coverage.

#### LLM-as-judge

- Retrieve N candidates first, then let an LLM pick the best K.
- Expensive but high-quality.

#### Skill / category-based

- Classify the query type, then draw from type-specific examples.

#### Iterative refinement

- Update the examples each round based on output quality.

### Retrieval methods

- **Dense** (embedding): semantic similarity.
- **BM25 / TF-IDF**: keyword match.
- **Hybrid**: fuse the two.
- **Cross-encoder rerank**: expensive but accurate.

### Applications

1. **NER / Classification**: examples of a similar task type.
2. **Code generation**: similar API usage.
3. **Translation**: domain-specific phrases.
4. **Reasoning**: similar patterns (math).
5. **Customer service**: similar past issues.
6. **Schema-aware Text2SQL**: similar query patterns.

### Modern best practices

1. **Quality > quantity**: 3-5 good examples > 50.
2. **Diverse**: avoid examples clustered in the same domain.
3. **Recency**: prefer newer patterns.
4. **Format consistency**: use the same template for every example.
5. **Avoid leakage**: never include test-set examples.

### Evolution in modern AI

- **In-Context Learning**: zero-/few-shot ability emerged with GPT-3.
- **Long context**: 100K+ token context windows fit hundreds of examples.
- **Many-shot ICL**: 1000+ examples (Anthropic 2024).
- **Adaptive ICL**: choose the optimal number of examples per query.

## 💻 Patterns

### Basic dynamic few-shot (LangChain)

```python
# Import paths for recent LangChain releases; older releases expose the same
# classes under langchain.vectorstores / langchain.embeddings / langchain.prompts.
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

# Example pool
examples = [
    {'question': '...', 'answer': '...'},
    # ... 100+ examples
]

# Vector store: embed question + answer, keep the original dict as metadata
vectordb = Chroma.from_texts(
    [f"{e['question']} {e['answer']}" for e in examples],
    embedding=OpenAIEmbeddings(),
    metadatas=examples,
)

# Template used to render each selected example
example_prompt = PromptTemplate(
    input_variables=['question', 'answer'],
    template='Q: {question}\nA: {answer}',
)

def dynamic_prompt(query, k=3):
    """Build a few-shot prompt from the k examples most similar to the query."""
    relevant = vectordb.similarity_search(query, k=k)
    selected = [doc.metadata for doc in relevant]
    fp = FewShotPromptTemplate(
        examples=selected,
        example_prompt=example_prompt,
        prefix='Answer following the format below.\n\n',
        suffix='\n\nQ: {input}\nA:',
        input_variables=['input'],
    )
    return fp.format(input=query)
```

### MMR (diversity)

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def mmr_select(query_emb, candidates, lambda_=0.7, k=5):
    """Maximal Marginal Relevance: trade off relevance against redundancy."""
    candidates = list(candidates)  # work on a copy; the caller's list is untouched
    selected = []
    while candidates and len(selected) < k:
        scores = []
        for c in candidates:
            relevance = cosine(query_emb, c['emb'])
            # Redundancy = similarity to the closest already-selected example
            redundancy = max((cosine(c['emb'], s['emb']) for s in selected), default=0.0)
            scores.append(lambda_ * relevance - (1 - lambda_) * redundancy)
        best_idx = scores.index(max(scores))
        selected.append(candidates.pop(best_idx))
    return selected
```

### LLM-as-judge selection

```python
import re

def parse_indices(reply, n):
    """Extract unique, in-range integer indices from the LLM reply."""
    seen = []
    for tok in re.findall(r'\d+', reply):
        i = int(tok)
        if i < n and i not in seen:
            seen.append(i)
    return seen

def llm_judge_select(query, k=5, pool_size=20):
    """Retrieve a large pool first, then let an LLM pick the best k."""
    # 1. Retrieve the top pool_size candidates
    pool = vectordb.similarity_search(query, k=pool_size)
    # 2. Ask the LLM to select the best k
    formatted = '\n\n'.join(f'[{i}] {p.page_content}' for i, p in enumerate(pool))
    prompt = f"""Given the query: "{query}"

Select the {k} MOST USEFUL examples for in-context learning.
Consider: relevance, format, diversity, and pedagogical clarity.

Examples:
{formatted}

Reply with ONLY the indices, comma-separated. e.g., 0, 3, 5, 7, 12"""
    # `llm` is an assumed client whose generate() returns the reply text
    indices = parse_indices(llm.generate(prompt), len(pool))[:k]
    return [pool[i] for i in indices]
```

### Hybrid search (BM25 + dense)

```python
from rank_bm25 import BM25Okapi
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

class HybridRetriever:
    # embed_all(texts) and embed(text) are assumed helpers that return dense vectors
    def __init__(self, examples):
        self.examples = examples
        self.bm25 = BM25Okapi([e['text'].split() for e in examples])
        self.embeddings = np.asarray(embed_all([e['text'] for e in examples]))

    def search(self, query, k=10, alpha=0.5):
        # BM25 keyword scores, scaled so the best match is close to 1
        bm25_scores = self.bm25.get_scores(query.split())
        bm25_norm = bm25_scores / (bm25_scores.max() + 1e-6)
        # Dense cosine similarity
        q_emb = embed(query)
        dense_scores = cosine_similarity([q_emb], self.embeddings)[0]
        # Weighted fusion: alpha weights dense, (1 - alpha) weights BM25
        scores = alpha * dense_scores + (1 - alpha) * bm25_norm
        top_k = np.argsort(scores)[-k:][::-1]
        return [self.examples[i] for i in top_k]
```

### Cross-encoder rerank

```python
import numpy as np
from sentence_transformers import CrossEncoder

reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

def rerank(query, candidates, k=5):
    """Score each (query, example) pair jointly and keep the top k, best first."""
    pairs = [[query, c['text']] for c in candidates]
    scores = np.asarray(reranker.predict(pairs))
    sorted_idx = scores.argsort()[-k:][::-1]
    return [candidates[i] for i in sorted_idx]
```

### Skill-aware few-shot

```python
def skill_aware_few_shot(query):
    # classify_skill, examples_by_skill and vector_search are assumed helpers
    skill = classify_skill(query)  # LLM classifier
    # Each skill has its own example pool
    skill_examples = examples_by_skill[skill]
    relevant = vector_search(query, skill_examples, k=3)
    return relevant
```

### Token budget management

```python
def fit_in_context(examples, max_tokens=4000, query_tokens=500):
    """Keep as many ranked examples as fit in the context window."""
    available = max_tokens - query_tokens
    selected = []
    used = 0
    for ex in examples:  # already ranked, best first
        ex_tokens = count_tokens(ex)  # count_tokens: assumed tokenizer helper
        if used + ex_tokens > available:
            break
        selected.append(ex)
        used += ex_tokens
    return selected
```

### Long-context many-shot (modern)

```python
def many_shot_icl(query, n_examples=100):
    """Pack 100+ examples into a long context (Anthropic 2024)."""
    # Simple approach: just retrieve more
    relevant = vectordb.similarity_search(query, k=n_examples)
    # rerank() expects dicts with a 'text' key, so convert the Documents first
    candidates = [{'text': d.page_content, **d.metadata} for d in relevant]
    # Rerank so the highest-quality examples come first (all n_examples are kept)
    reranked = rerank(query, candidates, k=n_examples)
    return format_many_shot(reranked, query)  # format_many_shot: assumed helper
```

### Iterative refinement

```python
def iterative_few_shot(query, max_iter=3):
    # initial_select, format_prompt, self_critique and retrieve_for_weakness
    # are assumed helpers
    examples = initial_select(query, k=5)
    result = None
    for i in range(max_iter):
        result = llm.generate(format_prompt(examples, query))
        critique = self_critique(result, query)
        if critique.is_satisfactory:
            return result
        # Use the critique to retrieve better examples
        examples = retrieve_for_weakness(query, critique, k=5)
    return result
```

### Eval (offline)

```python
def eval_few_shot_strategy(strategy, eval_set):
    """Accuracy of a selection strategy; strategy(query) returns example dicts."""
    correct = 0
    for ex in eval_set:
        # Leave out the current example so its own answer cannot leak into the
        # prompt (assumes pool examples store the query under 'question')
        examples = [e for e in strategy(ex['query']) if e.get('question') != ex['query']]
        prompt = format_prompt(examples, ex['query'])
        pred = llm.generate(prompt)
        if pred.strip() == ex['answer'].strip():
            correct += 1
    return correct / len(eval_set) if eval_set else 0.0
```

## Decision criteria

| Situation | Strategy |
|---|---|
| Diverse query | Vector + MMR |
| High accuracy | LLM-as-judge select |
| Real-time / cost | Vector top-K only |
| Long context | Many-shot 100+ |
| Skill variety | Classifier + skill-specific |
| Critical | Hybrid + cross-encoder rerank |

**Default**: Hybrid retrieve + MMR + token budget. For critical use cases, add a cross-encoder rerank.
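
The default combination above (hybrid scoring, then MMR, then a token budget) can be sketched end to end as below. This is a minimal, dependency-free illustration rather than a reference implementation: the function name `select_examples`, the precomputed `bm25` field (assumed to be already normalised to [0, 1]), and the whitespace word count used in place of a real tokenizer are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_examples(query_emb, candidates, k=3, alpha=0.5, lambda_=0.7, max_tokens=50):
    """Fuse scores, pick up to k examples with MMR, then trim to a token budget.

    Each candidate is a dict with 'text', 'emb' (dense vector) and
    'bm25' (hypothetical field: a keyword score already normalised to [0, 1]).
    """
    # 1. Hybrid relevance: weighted sum of dense and keyword scores.
    relevance = [
        alpha * cosine(query_emb, c["emb"]) + (1 - alpha) * c["bm25"]
        for c in candidates
    ]

    # 2. MMR: greedily take the candidate with the best trade-off between
    #    relevance and similarity to what has already been picked.
    remaining = list(range(len(candidates)))
    picked = []

    def mmr(i):
        redundancy = max(
            (cosine(candidates[i]["emb"], candidates[j]["emb"]) for j in picked),
            default=0.0,
        )
        return lambda_ * relevance[i] - (1 - lambda_) * redundancy

    while remaining and len(picked) < k:
        best = max(remaining, key=mmr)
        picked.append(best)
        remaining.remove(best)

    # 3. Token budget: keep examples in MMR order until the budget is spent.
    #    Whitespace word count is a crude stand-in for a real tokenizer.
    selected, used = [], 0
    for i in picked:
        cost = len(candidates[i]["text"].split())
        if used + cost > max_tokens:
            break
        selected.append(candidates[i])
        used += cost
    return selected


pool = [
    {"text": "Q: refund policy? A: 30 days", "emb": [1.0, 0.0], "bm25": 1.0},
    {"text": "Q: refund rules? A: 30 days", "emb": [1.0, 0.0], "bm25": 0.9},
    {"text": "Q: shipping time? A: 3 days", "emb": [0.0, 1.0], "bm25": 0.4},
]
chosen = select_examples([0.8, 0.6], pool, k=2)
```

In this toy pool the second refund example is a near-duplicate of the first, so MMR skips it in favour of the shipping example even though its raw relevance is higher. Raising `lambda_` toward 1.0 shifts the selection back toward pure relevance.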

## 🔗 Graph

- Parent: [[Prompt_Engineering|Prompt-Engineering]] · [[In-Context-Learning]] · [[RAG]]
- Variants: [[kNN-Prompting]]
- Applications: [[Faiss]] · [[BM25]]
- Adjacent: [[Transformer_Architecture_and_LLM_Foundations|BERT]] · [[CLIP]] · [[Sentence-Transformers]] · [[Best-of-N_Sampling]] · [[Be-Detailed]]

## 🤖 LLM usage

**When**: in-context learning; RAG-augmented prompts; task-specific accuracy boosts.

**When not**: tasks the model already handles zero-shot; single-template tasks.

## ❌ Anti-patterns

- **No diversity**: redundant, near-identical examples.
- **Test data leakage**: inflates evaluation results.
- **Inconsistent format**: confuses the model.
- **Always max examples**: wastes tokens.
- **Stale static pool**: the example pool is never updated.

## 🧪 Verification / duplicates

- Verified (Liu 2022 What Makes Good In-Context Examples, Anthropic many-shot 2024).
- Trust level A.
- Related: [[Transformer_Architecture_and_LLM_Foundations|BERT]] · [[Sentence-Transformers]] · [[Best-of-N_Sampling]] · [[Be-Detailed]] · [[ChatGPT_Emoticon_Prompt_Engineering]].

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: strategy + LangChain / MMR / LLM-judge / hybrid / many-shot code |