---
id: wiki-2026-0508-information-retrieval
title: Information Retrieval
category: Computer_Science_and_Theory
status: needs_review
canonical_id: self
aliases: [information_retrieval]
duplicate_of: none
source_trust_level: A
confidence_score: 1.0
tags: [information_retrieval, search_engine, ranking, ir_metrics, evaluation]
raw_sources:
  - "E:/Wiki/2nd/10_Wiki/Topics/AI_and_ML/Information-Retrieval-IR.md"
  - "E:/Wiki/2nd/10_Wiki/Topics/Computer_Science_and_Theory/Information Retrieval (IR).md"
  - "E:/Wiki/2nd/10_Wiki/Topics/Computer_Science_and_Theory/Information Retrieval Evaluation Metrics.md"
last_reinforced: 2026-05-08
github_commit: pending
inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
tech_stack:
  language: unspecified
  framework: unspecified
---

# Information Retrieval (IR)

## 📌 One-Line Insight (The Karpathy Summary)

> "The core computer-science discipline of accurately and quickly finding, within a vast data collection, the documents that best match a user's specific information need, and sorting them into the best possible ranking."

## 📖 Core Concept

### 1. The Core IR Process

An information retrieval system works on unstructured data (mostly text) and goes through the following stages.

* **Indexing**: Organizes the documents to be searched into a structure such as an [[Inverted Index|inverted index]] so that they can be found efficiently [18].
* **Query Processing**: Converts a natural-language question into a form the system can work with, through steps such as tokenization and stemming [19].
* **Ranking**: Orders the retrieved documents by how closely they match the user's intent [20].

### 2. Evolution of Retrieval Models

* **Boolean Model**: A basic model that only checks whether keywords are present or absent (AND, OR, NOT) [23].
* **Vector Space Model**: Represents documents as points in a vector space and computes relevance with measures such as cosine similarity [24].
* **BM25 (Probabilistic Model)**: A statistical model that takes in-document term frequency and document length into account; it is a strong baseline for modern search engines [25].
* **Neural IR**: A modern approach (semantic search) that uses deep learning and transformer models to capture contextual meaning [26].

### 3. Evaluation Metrics (IR Metrics)

System performance is measured in terms of "accuracy" and "ranking quality".

* **Set-based metrics**: [[Precision & Recall|Precision]], [[Precision & Recall|Recall]], F1-Score [18-20].
* **Rank-based metrics**: [[nDCG|nDCG]] (applies position-based weights), [[MAP|MAP]] (the mean of average precision), [[ERR|ERR]] (based on user satisfaction) [23-25].
* **RAG-specific metrics**: [[Context Precision & Recall|Context Precision]], [[Context Precision & Recall|Context Recall]] (measure the quality of the retrieved context in a RAG system) [28, 29].

## ⚖️ Trade-offs and Considerations (Trade-offs)

* **Precision vs Recall**: Trying to find every relevant document (Recall↑) lets in more noise (Precision↓), while trying to return only exact matches (Precision↑) can miss important information (Recall↓) [33].
* **Speed vs Quality**: A sophisticated ranking algorithm raises quality but slows response time. Multi-stage ranking is used to address this [34].
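
The set-based and rank-based metrics listed above, and the precision/recall trade-off, can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the document IDs, the graded relevance labels, and the function names are all hypothetical, and nDCG is computed with linear gain and a log2 position discount (an exponential gain of `2**rel - 1` is another common convention).

```python
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def dcg_at_k(gains, k):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(ranked, graded, k):
    """DCG of the ranking divided by the DCG of an ideal ordering."""
    gains = [graded.get(doc, 0) for doc in ranked]
    ideal = sorted(graded.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return 0.0 if ideal_dcg == 0 else dcg_at_k(gains, k) / ideal_dcg

# Hypothetical ranking returned by a system, and hypothetical relevance labels
# (3 = highly relevant, 1 = marginally relevant; unlisted documents count as 0).
ranked = ["d1", "d2", "d3", "d4", "d5", "d6"]
graded = {"d1": 3, "d3": 2, "d6": 1}
relevant = set(graded)

for k in (1, 3, 6):
    p = precision_at_k(ranked, relevant, k)
    r = recall_at_k(ranked, relevant, k)
    print(f"k={k}  P={p:.2f}  R={r:.2f}  F1={f1(p, r):.2f}  "
          f"nDCG={ndcg_at_k(ranked, graded, k):.2f}")
```

With this toy data, widening the cutoff from k=1 to k=6 raises recall from 1/3 to 1.0 while precision drops from 1.0 to 0.5, which is the trade-off described above. nDCG stays below 1.0 at k=6 because the relevant documents `d3` and `d6` sit lower in the ranking than they would in an ideal ordering.
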

## 🔗 Knowledge Links (Graph)

- **Concepts**: [[Natural Language Processing (NLP)|NLP]], [[Semantic Search|semantic search]], [[Hybrid Search|hybrid search]]
- **Evaluation**: [[LLM-as-judge]], [[A/B Testing]], [[Judgment List]]
- **Infrastructure**: [[Vector Database]], [[Elasticsearch]], [[Ollama_Local_LLM_Setup_Guide|Ollama]]

---
*Last updated: 2026-05-08*

## 📖 Structured Knowledge (Synthesized Content)

**Extracted patterns:**
> *(TODO)*

**Details:**
- *(TODO)*

## 🤖 LLM Usage Hints (How to Use This Knowledge)

**When to use this knowledge:**
- *(TODO)*

**When not to use it:**
- *(TODO)*

## 🧪 Validation Status (Validation)

- **Information status:** needs_review
- **Source trust level:** A
- **Reason for review:** *(P-Reinforce Phase 1 automatic normalization. The body text needs verification.)*

## 🧬 Duplicate Check (Duplicate Check)

- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (automatic normalization)
- **Reason:** Phase 1 normalization: old template and missing fields filled in.

## ⚠️ Contradictions and Updates (Contradictions & Updates)

- **Conflicts with past data:** none
- **Policy changes:** none

## 🕓 Changelog (Changelog)

| Date | Change | Handling | Trust level |
|------|--------|----------|-------------|
| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |

## 💻 Code Patterns (Code Patterns)

**Pattern 1:** *(TODO: structural skeleton reflecting this project's conventions)*

```text
# TODO
```

## 🤔 Decision Criteria (Decision Criteria)

**When to choose option A:**
- *(TODO)*

**When to choose option B:**
- *(TODO)*

**Default:**
> *(TODO)*

## ❌ Anti-Patterns (Anti-Patterns)

- **[Anti-pattern]:** *(TODO: what not to do + why + what to do instead)*