"Every answer has already been reformulated by someone." Research is a disciplined loop of question → literature → synthesis → novel contribution. As of 2026, AI-aided synthesis tools (Claude Opus 4.7 deep research, GPT-5 with browsing, Elicit, Consensus, undermind.ai) cut weeks of work down to hours.
AI synthesis: Claude Opus 4.7 deep-research mode, GPT-5 deep research, Elicit (extracts data per paper), Consensus (claim-level), undermind.ai (deep retrieval).
import re

import httpx
from anthropic import Anthropic

client = Anthropic()


def synthesize(question: str, papers: list[dict]) -> str:
    """papers: [{title, abstract, doi, year}]"""
    # Number each paper so the model can cite it as [index].
    corpus = "\n\n".join(
        f"[{i}] {p['title']} ({p['year']}, doi:{p['doi']})\n{p['abstract']}"
        for i, p in enumerate(papers)
    )
    msg = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=4096,
        system=(
            "Synthesize evidence. Cite EVERY claim with [index]. "
            "If evidence is weak/contradictory, say so explicitly. "
            "Never fabricate citations."
        ),
        messages=[{"role": "user", "content": f"Q: {question}\n\nPapers:\n{corpus}"}],
    )
    return msg.content[0].text


def verify_dois(text: str, papers: list[dict]) -> list[str]:
    """Hallucination check: every cited DOI must exist in our set."""
    # \S+ also swallows trailing punctuation such as the ")" in "(2020, doi:10.1/x)",
    # so strip it before comparing against the known DOIs.
    cited = [d.rstrip(").,;]") for d in re.findall(r"doi:(10\.\d+/\S+)", text)]
    valid = {p["doi"] for p in papers}
    return [d for d in cited if d not in valid]  # offenders
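The system prompt above asks for `[index]` citations, while `verify_dois` only checks DOIs. A minimal sketch of a complementary index audit follows. The function name `check_index_citations` and the naive sentence splitter are assumptions made for illustration, not part of the original workflow.

```python
import re


def check_index_citations(text: str, n_papers: int) -> dict:
    """Audit [i]-style citations in a synthesis draft.

    Indices are 0-based, matching enumerate(papers) in the corpus builder.
    """
    # Indices that point outside the supplied paper list (possible fabrication).
    cited = {int(m) for m in re.findall(r"\[(\d+)\]", text)}
    out_of_range = sorted(i for i in cited if i >= n_papers)

    # Naive sentence split on ., ! or ? followed by whitespace;
    # a first-pass heuristic, not a real sentence tokenizer.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    uncited = [s for s in sentences if not re.search(r"\[\d+\]", s)]

    return {"out_of_range": out_of_range, "uncited": uncited}
```

Sentences flagged as uncited are candidates for manual review rather than proof of a problem, since connective or summary sentences often carry no citation on purpose.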
EXTRACT_PROMPT = """Extract from this paper as JSON:
{
  "claim": "main thesis in one sentence",
  "method": "how they tested it",
  "evidence": "key result with numbers",
  "n": "sample size",
  "limitations": ["limit1", "limit2"],
  "novelty": "what this adds vs prior work"
}
If a field is unknown, use null. Don't invent."""
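A model reply to the extraction prompt still has to be parsed before it can go into a table. The sketch below assumes the reply may wrap the JSON in prose or a code fence, and that missing fields should be treated like `null`. The names `EXPECTED_FIELDS` and `parse_extraction` are hypothetical.

```python
import json

# The six keys requested by the extraction prompt.
EXPECTED_FIELDS = ("claim", "method", "evidence", "n", "limitations", "novelty")


def parse_extraction(raw: str) -> dict:
    """Parse a model reply into the six expected fields; absent fields become None."""
    # Take the span between the first "{" and the last "}" so surrounding
    # prose or code fences do not break json.loads.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(raw[start : end + 1])

    # Keep only the requested keys; unexpected extras are dropped.
    return {field: data.get(field) for field in EXPECTED_FIELDS}
```

Keeping `None` for unknown fields, rather than substituting defaults, preserves the difference between "the paper reports nothing" and "the paper reports an empty list".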
Steelman opposite (debias)
def steelman(claim: str) -> str:
    return client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": (
                f"Claim: {claim}\n\nWrite the strongest argument AGAINST this, "
                "citing actual contrary evidence. Be a hostile reviewer."
            ),
        }],
    ).content[0].text
Zettelkasten note (atomic)
---
id: 2026-05-10-1432
tags: [retrieval, rag]
source: [[Lewis-2020-RAG]]
---
# Dense retrieval beats BM25 only when query-doc lexical overlap is low
In Lewis 2020 (Table 3), DPR > BM25 on NaturalQuestions (+6 EM)
but BM25 ≥ DPR on TriviaQA where queries copy doc tokens.
→ Hybrid search is robust: pick BM25 for lexical, dense for paraphrase.
Connects to: [[Hybrid Search]] · [[BM25]] · [[Dense-Retrieval]]
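The `[[...]]` links are what turn atomic notes into a graph. As a rough sketch, assuming notes are held in memory as a mapping from note id to text (the helper names `extract_links` and `build_backlinks` are hypothetical), a backlink index could be built like this:

```python
import re
from collections import defaultdict

# Matches [[Target]] and [[Target|alias]]; only the target is captured.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_links(note_text: str) -> list[str]:
    """Return wikilink targets in order of first appearance, without duplicates."""
    seen: list[str] = []
    for target in WIKILINK.findall(note_text):
        target = target.strip()
        if target not in seen:
            seen.append(target)
    return seen


def build_backlinks(notes: dict[str, str]) -> dict[str, list[str]]:
    """Map each link target to the ids of the notes that point at it."""
    backlinks: dict[str, list[str]] = defaultdict(list)
    for note_id, text in notes.items():
        for target in extract_links(text):
            backlinks[target].append(note_id)
    return dict(backlinks)
```

Targets with a single backlink are often worth a second look, since an atomic note that connects to nothing else tends to get lost.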
When to use: literature scan, abstract screening, structured extraction, synthesis draft, steelmanning.
When not to use: the final assertion of a novelty claim (an LLM is not ground truth), quantitative meta-analysis (use proper stats software), or any citation that has not been verified.
❌ Anti-patterns
Cite-without-verify: fake DOIs fabricated by the AI.
Single-source synthesis: treating one paper as the truth and ignoring replication.
Recency bias: reading only the latest preprints and staying ignorant of foundational work.
No gap analysis: a literature dump only, with no "what's missing", so the contribution is unclear.
Hypothesis fishing: starting from the data and building a post-hoc theory (HARKing).
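Two of these anti-patterns can be screened mechanically before a draft is read closely. The sketch below assumes 0-based `[index]` citations in the draft and integer-like `year` fields in the paper records. The function name `audit_synthesis`, the flag strings, and the 10-year `foundational_age` threshold are illustrative assumptions, not established cutoffs.

```python
import re


def audit_synthesis(
    text: str,
    papers: list[dict],
    current_year: int,
    foundational_age: int = 10,  # assumed threshold; tune per field
) -> list[str]:
    """Return coarse warning flags for recency bias and single-source synthesis."""
    flags: list[str] = []

    # Recency bias: the corpus contains no paper older than the threshold.
    years = [int(p["year"]) for p in papers]
    if years and min(years) > current_year - foundational_age:
        flags.append("recency_bias")

    # Single-source synthesis: several papers supplied, but only one is ever cited.
    cited = set(re.findall(r"\[(\d+)\]", text))
    if len(papers) > 1 and len(cited) == 1:
        flags.append("single_source")

    return flags
```

An empty list means only that these two heuristics found nothing. Gap analysis and HARKing still need a human reviewer.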
🧪 Verification / duplicates
Verified (PRISMA 2020 statement, Semantic Scholar API docs, Claude Opus 4.7 deep research, Elicit methodology).