---
id: wiki-2026-0508-reranking-hybrid-search
title: "Reranking & Hybrid Search"
category: 10_Wiki/Topics
status: needs_review
canonical_id: self
aliases: [P-Reinforce-AUTO-RRHS-001]
duplicate_of: none
source_trust_level: A
confidence_score: 1.0
tags: [auto-reinforced, reranking, hybrid-search, semantic-search, lexical-search, bm25]
raw_sources: []
last_reinforced: 2026-05-04
github_commit: pending
inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
tech_stack:
  language: unspecified
  framework: unspecified
---

# [[Reranking & Hybrid Search|Reranking & Hybrid Search]]

## 📌 One-Line Insight (The Karpathy Summary)

> "Filtering and recombining search: a two-stage verification system that combines plain semantic similarity (dense) with exact keyword matching (sparse), then re-examines the candidates closely so the model receives the strongest possible evidence."

## 📖 Structured Knowledge (Synthesized Content)

A technique for maximizing retrieval accuracy in a RAG system by combining two or more retrieval methods and then reordering the results.

1. **Hybrid Search**:
   * **Dense Retrieval (embedding search)**: Captures context and meaning to find semantically similar information. (Example: "financial crisis" and "economic depression")
   * **Sparse Retrieval (keyword search)**: Uses BM25 or a similar method to match exact terms. (Example: searching for product names or proper nouns)
   * **Reciprocal Rank Fusion (RRF)**: Merges the two result lists by rank rather than by raw score. Each document's fused score is the sum of 1/(k + rank) over the lists in which it appears, and the fused ordering gives the final candidate set.
2. **Reranking**:
   * **Why it is needed**: First-stage retrieval (vector search) is optimized for quickly finding candidates among millions of items, so its precision can be somewhat low.
   * **How it works**: For the few dozen candidates returned by first-stage retrieval, a much heavier and more precise Cross-Encoder model recomputes each candidate's relevance to the query and reorders them.
3. **Effect**:
   * Raises both the probability that the true answer appears in the top-K results (recall) and the probability that the top-K contains only correct answers (precision). Hybrid retrieval mainly widens the candidate pool, while reranking mainly improves the ordering within it.

## ⚠️ Contradictions & Updates

* **Latency**: The reranking stage requires additional model computation, so end-to-end response time can increase by several hundred milliseconds or more.
* **Cost**: Using a high-performance reranker model increases API call costs or GPU resource consumption.

## 🔗 Knowledge Links (Graph)

* **Parent system**: [[Retrieval-Augmented Generation (RAG)|Retrieval-Augmented Generation (RAG)]]
* **Related technologies**: [[Vector Databases & Search|Vector Databases & Search]], [[Embedding Models & MRL|Embedding Models & MRL]]
* **Key tools**: Cohere Rerank, BGE-Reranker, Voyage Rerank

---
*Last updated: 2026-05-04*

## 🤖 LLM Usage Hints (How to Use This Knowledge)

**When to use this knowledge:**
- *(TODO)*

**When not to use it:**
- *(TODO)*

## 🧪 Validation Status (Validation)

- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 automatic normalization. Body text needs verification.)*

## 🧬 Duplicate Check (Duplicate Check)

- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (automatic normalization)
- **Handling reason:** Phase 1 normalization: old template updated and missing fields filled in.
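
**Illustrative sketch (Reciprocal Rank Fusion):** The RRF step described under Synthesized Content can be sketched in a few lines of Python. This is a minimal illustration, not this project's implementation: the function name `reciprocal_rank_fusion`, the document IDs, and the constant `k = 60` (a commonly used default) are all assumptions introduced here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of document IDs into one ranking.

    rankings: list of lists, each ordered from most to least relevant.
    k: smoothing constant (assumed default; tune per corpus).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document earns 1 / (k + rank) from every list it appears in.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical first-stage results (IDs are placeholders).
dense_hits = ["d3", "d1", "d7"]   # from embedding search
sparse_hits = ["d1", "d9", "d3"]  # from BM25 keyword search

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # -> ['d1', 'd3', 'd9', 'd7']
```

Documents that appear in both lists (`d1`, `d3`) rise above documents found by only one retriever, which is the behavior hybrid search relies on. Because only ranks are used, the dense and sparse scores never need to be put on a common scale.
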
## 🕓 Change History (Changelog)

| Date | Change | Handling | Trust level |
|------|--------|----------|-------------|
| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |

## 💻 Code Patterns (Code Patterns)

**Pattern 1:** *(TODO: structural skeleton reflecting this project's conventions)*

```text
# TODO
```

## 🤔 Decision Criteria (Decision Criteria)

**When to choose option A:**
- *(TODO)*

**When to choose option B:**
- *(TODO)*

**Default:**
> *(TODO)*

## ❌ Anti-Patterns (Anti-Patterns)

- **[Anti-pattern]:** *(TODO: what not to do + why + what to do instead)*
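
**Illustrative sketch (second-stage reranking):** The reranking step described under Synthesized Content, in which a small candidate set is rescored against the query and reordered, might look like the following. This is a sketch under stated assumptions: `rerank` and `toy_overlap_score` are hypothetical names, and the token-overlap scorer is only a stand-in so the example runs without a model. A real pipeline would pass a Cross-Encoder's scoring function as `score_fn`.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Score every (query, candidate) pair and keep the top_n by score."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    # Stable sort: candidates with equal scores keep their first-stage order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def toy_overlap_score(query: str, doc: str) -> float:
    """Placeholder scorer (assumption): share of query tokens found in doc.

    A Cross-Encoder would instead read the query and document together
    and return a learned relevance score.
    """
    q_tokens = set(query.lower().split())
    d_tokens = set(doc.lower().split())
    return len(q_tokens & d_tokens) / len(q_tokens) if q_tokens else 0.0


# Hypothetical first-stage candidates, in the order the retriever returned them.
candidates = [
    "central banks cut rates during the crisis",
    "the 2008 financial crisis and bank failures",
    "recipes for a quick weeknight dinner",
]

top = rerank("financial crisis bank", candidates, toy_overlap_score, top_n=2)
for doc, score in top:
    print(f"{score:.2f}  {doc}")
```

Keeping the scorer behind a plain function argument makes the latency and cost trade-off explicit: the expensive call runs once per candidate, so the size of the first-stage candidate set directly bounds the extra work.
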