Reranking

In one line

"Retrieval is for recall, reranking is for precision." Reranking takes the top-k candidates from first-stage retrieval (BM25 or dense) and re-scores them with a more expensive cross-encoder or LLM. It is arguably the single biggest lever on RAG quality in 2026 (Cohere Rerank 4, BGE-Reranker-v2.5, Voyage rerank-3).

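Core concepts

Why it is needed

Bi-encoder (dense retrieval): encodes the query and the document separately, then compares them by cosine similarity. Fast, because document embeddings are cached, but the query-document interaction is shallow.
Cross-encoder: encodes [query, doc] jointly and outputs a scalar score. Deep token-level attention is typically credited with a +10–30% NDCG gain.
Trade-off: a cross-encoder needs one forward pass per (query, doc) pair, so scoring all N documents for every query is too slow. Retrieve the top 100 in the first stage, then rerank down to the top 5.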
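
The cheap first-stage half of this trade-off can be sketched in a few lines. This is a minimal illustration, assuming toy 2-dimensional vectors in place of real model embeddings; `cosine` and `cosine_top_k` are hypothetical helper names, not part of any library. The document vectors are computed once and reused across queries, which is what keeps the bi-encoder stage fast.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def cosine_top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int) -> list[int]:
    """Return the indices of the k documents most similar to the query."""
    scored = [(cosine(query_vec, d), i) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]

# Toy stand-ins for cached document embeddings (assumption: real ones
# would come from a dense encoder and live in a vector index).
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
candidates = cosine_top_k([1.0, 0.1], doc_vecs, k=2)
```

Only the documents in `candidates` would then be passed to the cross-encoder, so the expensive model runs k times per query rather than N times.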

Architectures

Cross-encoder (BERT-based): [CLS] q [SEP] d [SEP] → linear → score. Examples: BGE-Reranker-v2.5, Cohere Rerank 4, Voyage rerank-3.
ColBERT / late interaction: token-level document embeddings are precomputed, and each query token is scored by its maximum similarity (MaxSim) over the document tokens. Roughly 80% of cross-encoder quality at retrieval speed.
LLM-as-reranker: prompt GPT-5 or Claude to produce a listwise ranking, the paradigm of RankGPT and RankZephyr. Highest quality, but also the most expensive.
RRF (Reciprocal Rank Fusion): cheap fusion of multiple rankers, score(d) = Σ_i 1/(k + rank_i(d)), where rank_i(d) is the 1-based rank of d in ranker i.
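
The late-interaction score described for ColBERT can be written out directly. The following is a minimal sketch on toy token vectors, assuming unit-length embeddings so that the dot product equals cosine similarity; `maxsim_score` is a hypothetical helper, not the ColBERT library's API.

```python
def dot(a: list[float], b: list[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def maxsim_score(query_tokens: list[list[float]], doc_tokens: list[list[float]]) -> float:
    """Sum, over query tokens, of the best match among the document tokens."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

# Toy token embeddings (assumption: real ones come from a ColBERT encoder).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]  # covers both query tokens
doc_b = [[1.0, 0.0], [0.8, 0.6]]              # covers only the first well

score_a = maxsim_score(query_tokens, doc_a)
score_b = maxsim_score(query_tokens, doc_b)
```

Because the document token vectors do not depend on the query, they can be indexed ahead of time; only the cheap max-and-sum step runs at query time.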

Hybrid Search Stack (2026 standard)

1. BM25 (sparse) + Dense (e.g., BGE-M3), run in parallel.
2. RRF fuse → top-100.
3. Cross-encoder rerank → top-10.
4. (Optional) LLM rerank → top-3 for high-stakes queries.

Applications

Higher answer accuracy in RAG.
E-commerce search relevance.
Legal/medical document discovery (precision-critical).
Code search (semantic + lexical hybrid).

💻 Patterns

Cross-encoder rerank (sentence-transformers)

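```python
from sentence_transformers import CrossEncoder

# bge-reranker-v2-m3 loads as a standard cross-encoder. The gemma2-based
# BAAI/bge-reranker-v2.5-gemma2-lightweight is an LLM-style reranker that is
# normally run through the FlagEmbedding package instead.
reranker = CrossEncoder("BAAI/bge-reranker-v2-m3")

def rerank(query: str, candidates: list[str], top_k: int = 5) -> list[tuple[str, float]]:
    """Score every (query, doc) pair jointly and return the top_k best."""
    pairs = [[query, doc] for doc in candidates]
    scores = reranker.predict(pairs)  # numpy array, one score per pair
    ranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
    return [(doc, float(score)) for doc, score in ranked[:top_k]]
```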

Cohere Rerank API

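```python
import cohere

co = cohere.Client()  # expects the API key in the environment (CO_API_KEY)

def cohere_rerank(query: str, docs: list[str], top_n: int = 5) -> list[tuple[str, float]]:
    """Rerank docs with the hosted Cohere endpoint; returns (doc, score) pairs."""
    resp = co.rerank(
        model="rerank-v4.0",  # model name as given in this note; check Cohere's model list
        query=query, documents=docs, top_n=top_n,
    )
    # r.index points back into the original docs list.
    return [(docs[r.index], r.relevance_score) for r in resp.results]
```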

Reciprocal Rank Fusion

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """rankings: list of ranked doc-id lists from different retrievers."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        # enumerate() is 0-based, so rank + 1 is the 1-based rank in the formula.
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```
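
A worked example makes the fusion concrete. Assume two hypothetical rankings, BM25 = [A, B, C] and dense = [C, A, D], with k = 60 and 1-based ranks; the document ids are illustrative only.

```latex
\begin{aligned}
\mathrm{score}(A) &= \tfrac{1}{60+1} + \tfrac{1}{60+2} \approx 0.03252 \\
\mathrm{score}(C) &= \tfrac{1}{60+3} + \tfrac{1}{60+1} \approx 0.03227 \\
\mathrm{score}(B) &= \tfrac{1}{60+2} \approx 0.01613 \\
\mathrm{score}(D) &= \tfrac{1}{60+3} \approx 0.01587
\end{aligned}
```

The fused order is A, C, B, D. Documents that appear in both lists outrank those that appear in only one, and with k = 60 the gap between adjacent ranks is small, so agreement between retrievers matters more than the exact position in either list.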

Hybrid retrieve + rerank pipeline

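```python
# bm25, dense_index and load_doc are placeholders for your own retrievers and
# document store; each search is assumed to return a ranked list of doc ids.
def hybrid_rag(query: str, k_first: int = 100, k_final: int = 5):
    bm25_hits = bm25.search(query, top_k=k_first)
    dense_hits = dense_index.search(query, top_k=k_first)
    fused = rrf([bm25_hits, dense_hits])[:k_first]  # fuse, then cap the candidate set
    docs = [load_doc(d) for d in fused]             # doc ids → document text
    return rerank(query, docs, top_k=k_final)       # cross-encoder rerank defined above
```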

LLM-as-reranker (listwise)

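```python
import re
import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

def llm_rerank(query: str, docs: list[str]) -> list[int]:
    numbered = "\n".join(f"[{i}] {d[:300]}" for i, d in enumerate(docs))
    resp = client.messages.create(
        model="claude-opus-4-7", max_tokens=200,
        messages=[{"role": "user", "content":
            f"Query: {query}\nDocs:\n{numbered}\nReturn comma-separated indices best→worst."}],
    ).content[0].text
    # Tolerate extra text around the numbers; drop out-of-range and repeated indices.
    order = [int(x) for x in re.findall(r"\d+", resp)]
    return list(dict.fromkeys(i for i in order if i < len(docs)))
```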

ColBERT late-interaction (RAGatouille)

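```python
from ragatouille import RAGPretrainedModel

# Placeholder corpus; replace with your own passages.
docs = ["Reranking re-scores retrieved candidates.", "BM25 is a sparse retriever."]

rag = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")
rag.index(collection=docs, index_name="my-index")
results = rag.search(query="foo", k=10)  # list of dicts with content, score, rank
```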

Decision criteria

| Situation | Approach |
| --- | --- |
| Cost-sensitive RAG | BM25 + dense → RRF (no rerank) |
| Quality > latency | Hybrid + cross-encoder rerank |
| Highest quality | + LLM rerank top-20 → top-3 |
| Very large corpus (>10M docs) | ColBERT for second stage |
| Multilingual | BGE-Reranker-v2.5 / Cohere rerank-v4 |

Default: BM25 + BGE-M3 dense → RRF top-100 → BGE-Reranker-v2.5 top-5.

🔗 Graph

Parent: Information Retrieval · RAG
Variants: ColBERT · RRF
Applications: Semantic Search · Hybrid Search
Adjacent: BM25 · Dense-Retrieval · Embeddings

🤖 LLM usage

When to use: high-stakes RAG (legal/medical/finance), small candidate sets, listwise ranking. When not to use: latency budget under 100 ms, large k (cost), simple FAQ chat (overkill).

❌ Anti-patterns

Rerank without first-stage filter: O(N) scoring over the full corpus leads to a cost explosion.
Cross-encoder for indexing: a cross-encoder produces no document embeddings to cache, so every (query, doc) pair is recomputed for each query.
Pointwise LLM rerank: a separate call per document is more expensive than listwise ranking and gives less consistent results.
Ignoring score calibration: a cross-encoder score is not a probability, so any cutoff threshold needs dataset-specific tuning.
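
The calibration point can be illustrated with a small threshold search. This is a minimal sketch, assuming a hypothetical labeled dev set of reranker scores and binary relevance labels; `best_threshold` is an illustrative helper, and maximizing F1 is only one possible tuning objective.

```python
def best_threshold(scores: list[float], labels: list[int]) -> tuple[float, float]:
    """Pick the score cutoff that maximizes F1 on a labeled dev set."""
    best_t, best_f1 = float("inf"), -1.0
    for t in sorted(set(scores)):  # every observed score is a candidate cutoff
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Hypothetical dev-set scores (raw reranker outputs) and relevance labels.
dev_scores = [2.1, 0.3, -1.0, 1.5, -0.2]
dev_labels = [1, 0, 0, 1, 1]
threshold, f1 = best_threshold(dev_scores, dev_labels)
```

A cutoff found this way applies only to the model and dataset it was tuned on; switching rerankers or domains means repeating the search.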

🧪 Verification / duplicates

Verified (Cohere docs, BGE paper, ColBERT v2.5, RankGPT/RankZephyr).
Confidence: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full rewrite as canonical for cross-encoder/ColBERT/RRF/LLM rerank |
