---
id: wiki-2026-0508-llamaindex
title: LlamaIndex
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [GPT Index, LlamaIndex Framework]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [rag, llamaindex, retrieval, indexing, agents]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: { language: python/ts, framework: llama-index }
---

# LlamaIndex

## One-line summary

> **"LangChain is about chains, LlamaIndex is about indexes."** Data → Index → Query Engine: a framework specialized for RAG.

## Core

### Core abstractions

- **Documents / Nodes**: source documents → chunked nodes
- **Index**: VectorStoreIndex, SummaryIndex, KnowledgeGraphIndex, TreeIndex
- **Query Engine**: retriever + synthesizer + (optional) postprocessor
- **Agents**: ReAct, OpenAI tool, function calling
- **Workflows**: event-driven multi-step flows (the counterpart to LangGraph)

### vs LangChain

| | LlamaIndex | LangChain |
|---|---|---|
| Strength | Data indexing for RAG | Agent / chain orchestration |
| Abstraction | Index-centric | Chain/Runnable-centric |
| Eval | LlamaIndex Eval (faithfulness, relevancy) | LangSmith |
| Recommended for | RAG-heavy workloads | Diverse tools/agents |

### Applications

1. Internal docs Q&A
2. Code RAG (indexing an entire repo)
3. Multi-doc summarization
4. Knowledge graph + RAG hybrid
5. Agentic RAG (self-directed query rewriting)

## 💻 Patterns

### Pattern 1: Vector index basic

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.anthropic import Anthropic

# Load every file under ./data and embed it into an in-memory vector index
docs = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(docs)

# Retrieve the 5 most similar chunks and let the LLM synthesize the answer
qe = index.as_query_engine(llm=Anthropic(model="claude-opus-4-7"), similarity_top_k=5)
print(qe.query("Summarize the company vacation policy"))
```

### Pattern 2: Persistent vector store (Chroma)

```python
import chromadb
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

# Embeddings are written to ./chroma and survive process restarts
client = chromadb.PersistentClient(path="./chroma")
collection = client.get_or_create_collection("docs")
vs = ChromaVectorStore(chroma_collection=collection)
storage = StorageContext.from_defaults(vector_store=vs)

# `docs` comes from Pattern 1
index = VectorStoreIndex.from_documents(docs, storage_context=storage)
# Later runs can reload without re-embedding: VectorStoreIndex.from_vector_store(vs)
```

### Pattern 3: Hybrid retrieval (vector + BM25)

```python
from llama_index.retrievers.bm25 import BM25Retriever
from llama_index.core.retrievers import QueryFusionRetriever

vec_r = index.as_retriever(similarity_top_k=5)
# BM25 needs the raw nodes. With an external vector store such as Chroma the
# docstore may be empty; in that case pass the nodes you parsed yourself.
bm25_r = BM25Retriever.from_defaults(
    nodes=list(index.docstore.docs.values()), similarity_top_k=5
)
# num_queries=1 disables LLM query generation; only the original query is used
fusion = QueryFusionRetriever([vec_r, bm25_r], num_queries=1, mode="reciprocal_rerank")
```

### Pattern 4: Re-ranker (Cohere)

```python
from llama_index.postprocessor.cohere_rerank import CohereRerank

# Expects a Cohere API key (the api_key argument or the COHERE_API_KEY env var)
reranker = CohereRerank(top_n=3)
qe = index.as_query_engine(node_postprocessors=[reranker], similarity_top_k=20)
# 20 candidates → rerank → top 3
```

### Pattern 5: Sub-question for multi-doc

```python
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.tools import QueryEngineTool

# qe_a and qe_b are assumed to be query engines built over separate indexes
tools = [
    QueryEngineTool.from_defaults(query_engine=qe_a, name="finance"),
    QueryEngineTool.from_defaults(query_engine=qe_b, name="hr"),
]
# The question is split into sub-questions, each routed to one tool
sub = SubQuestionQueryEngine.from_defaults(query_engine_tools=tools)
sub.query("How did headcount change relative to last year's labor costs?")
```

### Pattern 6: Eval (faithfulness)
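```python
from llama_index.core.evaluation import FaithfulnessEvaluator
from llama_index.llms.anthropic import Anthropic

ev = FaithfulnessEvaluator(llm=Anthropic(model="claude-opus-4-7"))
resp = qe.query("What is X?")
result = ev.evaluate_response(response=resp)
assert result.passing  # answer grounded in retrieved context?
```

### Pattern 7: Agent with tools

```python
from llama_index.core.agent import ReActAgent
from llama_index.llms.anthropic import Anthropic

# search_tool and calc_tool are assumed to be defined elsewhere.
# from_tools is the older agent API; newer releases may expose agents under
# llama_index.core.agent.workflow instead, so check your installed version.
agent = ReActAgent.from_tools([search_tool, calc_tool], llm=Anthropic(...))
agent.chat("What is this year's growth rate compared with last year's revenue?")
```

### Pattern 8: Workflow (event-driven)

```python
from llama_index.core.workflow import Event, StartEvent, StopEvent, Workflow, step

class GenEvent(Event):
    nodes: list

class RagFlow(Workflow):
    # A workflow starts at the step that accepts StartEvent
    @step
    async def retrieve(self, ev: StartEvent) -> GenEvent:
        # `retriever` is assumed to exist, e.g. index.as_retriever()
        return GenEvent(nodes=await retriever.aretrieve(ev.query))

    # ...and finishes when a step returns StopEvent
    @step
    async def generate(self, ev: GenEvent) -> StopEvent:
        # `synthesize` is a placeholder for your own response synthesis
        return StopEvent(result=synthesize(ev.nodes))

# Usage: result = await RagFlow(timeout=60).run(query="What is X?")
```

## Decision criteria

| Situation | Tool |
|---|---|
| RAG is the main workload | LlamaIndex |
| Complex agent / tool orchestration | LangChain / LangGraph |
| Simple production RAG | LlamaIndex + Chroma/Qdrant |
| Multi-doc synthesis | SubQuestionQueryEngine |
| Pushing accuracy | hybrid + reranker |

**Default**: VectorStoreIndex + Chroma + Cohere rerank + faithfulness eval.

## 🔗 Graph

- Parent: [[RAG]]
- Variant: [[LangChain]]
- Application: [[Embedding]]
- Adjacent: [[LLM_Ops_and_Tuning]], [[Prompt_Engineering]]

## 🤖 LLM usage

**When**: doc Q&A, code RAG, multi-doc summarization.
**When not**: a single prompt is enough (RAG is overkill), or only real-time chat is needed (the indexing cost is not justified).

## ❌ Anti-patterns

- Always leaving chunk size at the default → lower recall
- No re-ranking → noise in the top k
- Shipping to prod without eval → silent quality drop
- Putting every doc in one index → without namespace separation, permissions and quality get muddled
- Using only VectorStoreIndex with no BM25 mixed in → weak on keyword queries

## 🧪 Verification / duplicates

- Verified (LlamaIndex docs, ChromaDB, Cohere rerank). Trust level A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: vs LangChain, hybrid + rerank + eval patterns |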
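
Supplementary note on Pattern 3: the `reciprocal_rerank` mode merges the vector and BM25 result lists by rank rather than by raw score, which avoids comparing two incompatible score scales. The sketch below illustrates that idea in plain Python. It is not LlamaIndex's internal code; the function name, the doc IDs, and the constant `k=60` (a commonly used default for reciprocal rank fusion) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc IDs: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; a doc missing from a list simply adds nothing
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc ID for determinism
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a vector retriever and a BM25 retriever
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists (`doc_a`, `doc_c`) rise above documents found by only one retriever, which is why hybrid retrieval tends to help on keyword-heavy queries.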