Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization
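Large-scale cleanup of 10_Wiki/Topics:
- Removed 227 error-capture / unfinished stub documents
- Merged 43 clusters of cross-folder duplicates (63 files → redirect)
- Normalized link names: fixed broken links, pointed redirects directly at their targets, mapped concepts (~2,400 changes)
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections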

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:52:15 +09:00


id: wiki-2026-0508-secondary-research
title: Secondary Research
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Desk Research, Literature Review, Existing-Data Analysis
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: research, methodology, literature-review, knowledge-synthesis
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language agnostic, framework research-methods

Secondary Research

In one line

"Secondary research = analysis of existing published or previously collected data." It is the opposite of primary research, which collects new raw data. It covers literature reviews, meta-analyses, analysis of industry reports, and dataset reuse. As of 2026, LLM-assisted secondary research is the dominant workflow: work that took a single researcher weeks can shrink to hours.

Key points

Versus primary research

  • Primary: you collect the data yourself through surveys, interviews, experiments, or observation. High control, high cost.
  • Secondary: you reuse what is already published, such as books, papers, government statistics, industry reports, and internal docs. Cheap and fast, but with little control over how the data was collected.

Source taxonomy

  • Academic: peer-reviewed papers (PubMed, arXiv, Google Scholar, Semantic Scholar, OpenAlex).
  • Government: census, BLS, OECD, World Bank, KOSIS.
  • Industry: Gartner, Forrester, IDC, McKinsey, CB Insights, Statista.
  • Internal: company analytics, post-mortems, design docs.
  • Community: HN, Reddit, GitHub, blog posts (lower trust, higher recency).

Applications

  1. Lit review: the background section of a new paper.
  2. Market analysis: TAM/SAM/SOM estimates for a startup.
  3. Competitor research: input to product strategy.
  4. Meta-analysis: pooling effect sizes across multiple studies.
  5. Due diligence: background for an investment or acquisition.
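
Item 4 can be made concrete with a minimal sketch of fixed-effect (inverse-variance) pooling, one common way to combine effect sizes. The function name `pool_fixed_effect` and the input numbers are illustrative assumptions, not something this note prescribes; a real meta-analysis would also check heterogeneity and may call for a random-effects model.

```python
import math

def pool_fixed_effect(effects, variances):
    """Pool per-study effect sizes using inverse-variance (fixed-effect) weights.

    Returns (pooled_effect, standard_error, (ci_low, ci_high)) with a 95% CI.
    """
    if not effects or len(effects) != len(variances):
        raise ValueError("need at least one study and one variance per effect")
    # More precise studies (smaller variance) get larger weights.
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)
    return pooled, se, (pooled - 1.96 * se, pooled + 1.96 * se)

# Hypothetical studies: two effect sizes with equal variance.
pooled, se, ci = pool_fixed_effect([0.2, 0.4], [0.01, 0.01])
```

With equal variances the pooled estimate is just the mean of the two effects; unequal variances pull it toward the more precise study.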

💻 Patterns

Pattern 1: LLM-assisted lit review

import asyncio

import anthropic

client = anthropic.AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment

async def summarize_paper(abstract: str, question: str) -> str:
    """Ask the model whether one abstract is relevant and, if so, summarize it."""
    msg = await client.messages.create(
        model="claude-opus-4-7",
        max_tokens=512,
        system="You are a careful research assistant. Cite verbatim.",
        messages=[{
            "role": "user",
            "content": (
                f"Question: {question}\n\nAbstract:\n{abstract}\n\n"
                "Is this relevant? If yes, extract key findings + methodology in 3 bullets. "
                "If no, reply with exactly: NOT RELEVANT"
            ),
        }],
    )
    return msg.content[0].text

async def lit_review(question: str, abstracts: list[str]) -> list[str]:
    # Summarize all abstracts concurrently, then drop the ones flagged as irrelevant.
    results = await asyncio.gather(*[summarize_paper(a, question) for a in abstracts])
    return [r for r in results if "not relevant" not in r.lower()]

Pattern 2: arXiv / Semantic Scholar fetch

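import requests

def search_semantic_scholar(query: str, limit: int = 20) -> list[dict]:
    """Keyword search against the Semantic Scholar Graph API."""
    r = requests.get(
        "https://api.semanticscholar.org/graph/v1/paper/search",
        params={
            "query": query,
            "limit": limit,
            "fields": "title,abstract,year,authors,citationCount,openAccessPdf",
        },
        timeout=30,
    )
    r.raise_for_status()  # surface rate limits (HTTP 429) and other errors
    # Fall back to an empty list if the response carries no "data" key.
    return r.json().get("data", [])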
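
The pattern title also names arXiv, so here is a comparable sketch for the public arXiv API, which returns an Atom XML feed rather than JSON. The function names `parse_arxiv_feed` and `search_arxiv` and the output field names are assumptions chosen to mirror the Semantic Scholar example. It uses only the standard library, and parsing is split from fetching so the parser can be exercised offline.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace used by the arXiv feed

def parse_arxiv_feed(xml_data) -> list[dict]:
    """Turn an arXiv Atom feed (str or bytes) into a list of paper dicts."""
    root = ET.fromstring(xml_data)
    papers = []
    for entry in root.iter(f"{ATOM}entry"):
        published = entry.findtext(f"{ATOM}published", default="")
        papers.append({
            "id": entry.findtext(f"{ATOM}id", default="").strip(),
            # Titles and abstracts contain hard line breaks; collapse whitespace.
            "title": " ".join(entry.findtext(f"{ATOM}title", default="").split()),
            "abstract": " ".join(entry.findtext(f"{ATOM}summary", default="").split()),
            "year": int(published[:4]) if published[:4].isdigit() else None,
        })
    return papers

def search_arxiv(query: str, limit: int = 20) -> list[dict]:
    """Query the arXiv API across all fields and parse the resulting feed."""
    params = urllib.parse.urlencode(
        {"search_query": f"all:{query}", "start": 0, "max_results": limit}
    )
    url = f"http://export.arxiv.org/api/query?{params}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return parse_arxiv_feed(resp.read())
```

arXiv results are preprints, so downstream scoring should treat them as non-peer-reviewed.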

Pattern 3: Citation graph traversal

import requests

def expand_citations(seed_papers, depth=2):
    """Breadth-first walk over reference lists; returns every paperId reached."""
    frontier = list(seed_papers)
    seen = {p["paperId"] for p in seed_papers}
    for _ in range(depth):
        next_frontier = []
        for paper in frontier:
            r = requests.get(
                f"https://api.semanticscholar.org/graph/v1/paper/{paper['paperId']}/references",
                params={"fields": "title,abstract,year,citationCount"},
                timeout=30,
            )
            r.raise_for_status()
            # "data" can be missing or null, so fall back to an empty list.
            for ref in r.json().get("data") or []:
                cited = ref.get("citedPaper") or {}
                pid = cited.get("paperId")
                if pid and pid not in seen:
                    seen.add(pid)
                    next_frontier.append(cited)
        frontier = next_frontier
    return list(seen)

Pattern 4: Source-trust scoring

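def trust_score(source: dict, current_year: int = 2026) -> float:
    """Heuristic 0-1 score from source type, age, and citation count."""
    base = {
        "peer-reviewed": 0.9,
        "preprint": 0.7,
        "government": 0.85,
        "industry-paid": 0.6,
        "blog": 0.4,
        "social": 0.2,
    }.get(source["type"], 0.3)  # unknown source types default to 0.3
    age_yrs = max(0, current_year - source["year"])  # clamp future-dated sources
    decay = max(0.5, 1 - 0.05 * age_yrs)  # lose 5% per year, floored at 0.5
    citations = min(1.0, source.get("citations", 0) / 100)  # saturates at 100 citations
    return base * decay * (0.6 + 0.4 * citations)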
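
As a worked example of the scoring function above, take a hypothetical peer-reviewed paper published in 2021 with 50 citations (illustrative values, not a real source):

```latex
\begin{aligned}
\text{base} &= 0.9 \\
\text{decay} &= \max\bigl(0.5,\ 1 - 0.05 \times (2026 - 2021)\bigr) = 0.75 \\
\text{citations} &= \min(1.0,\ 50 / 100) = 0.5 \\
\text{score} &= 0.9 \times 0.75 \times (0.6 + 0.4 \times 0.5) = 0.54
\end{aligned}
```

Even a peer-reviewed source lands near the middle of the scale once age and a modest citation count are factored in, which is the intended effect of the decay and citation terms.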

Decision criteria

| Situation | Approach |
| --- | --- |
| Quick overview of a new topic | LLM survey + 5-10 review papers |
| Medical / safety claim | Cochrane / systematic reviews only |
| Market size estimation | Triangulate 3+ sources (Gartner + government + internal) |
| Historical trend | Government / longitudinal data |
| Cutting-edge tech | arXiv (acknowledge it is not peer-reviewed) |

Default: diversify sources and do not trust any single one. Triangulate across at least 3.
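
The triangulation rule can be sketched as a small helper that refuses fewer than three sources and reports how far the estimates disagree. The name `triangulate`, the `max_spread` threshold of 0.5, and the figures in the usage lines are all assumptions for illustration, not values from this note.

```python
import statistics

def triangulate(estimates: dict[str, float], max_spread: float = 0.5) -> dict:
    """Combine independent estimates of one quantity and flag weak agreement.

    `estimates` maps a source label to its estimate. The relative spread is
    (max - min) / median; `max_spread` is an assumed tolerance.
    """
    if len(estimates) < 3:
        raise ValueError("triangulation needs at least 3 independent sources")
    values = list(estimates.values())
    median = statistics.median(values)
    spread = (max(values) - min(values)) / median if median else float("inf")
    return {
        "median": median,
        "relative_spread": spread,
        "consistent": spread <= max_spread,
    }

# Hypothetical market-size estimates (in millions) from three source types.
result = triangulate({"analyst_report": 120.0, "government": 100.0, "internal": 90.0})
```

The median is used rather than the mean so that one outlying report does not dominate the combined figure.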

🔗 Graph

🤖 LLM usage

When to use: relevance filtering of abstracts, cross-paper synthesis, drafting a lit review. When not to: as a source of citations. LLMs can hallucinate references, so always verify each one against the original source.
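
One cheap, offline part of that verification is checking that anything an LLM summary puts in quotation marks really appears in the source text. The sketch below assumes straight double quotes and a hypothetical minimum quote length of 10 characters; the function name `unverified_quotes` is illustrative. It does not confirm that a cited paper exists, only that quoted wording is verbatim.

```python
import re

def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so line breaks do not cause false misses."""
    return re.sub(r"\s+", " ", text).strip().lower()

def unverified_quotes(summary: str, source_text: str, min_len: int = 10) -> list[str]:
    """Return quoted spans in `summary` that are not found verbatim in `source_text`."""
    # Pair up double quotes in order, then ignore very short quoted fragments.
    quotes = [q for q in re.findall(r'"([^"]*)"', summary) if len(q) >= min_len]
    haystack = _normalize(source_text)
    return [q for q in quotes if _normalize(q) not in haystack]

source = "We find that retrieval improves accuracy by a large margin on long documents."
summary = (
    'The paper says "retrieval improves accuracy by a large margin" '
    'and "models never hallucinate citations".'
)
suspect = unverified_quotes(summary, source)
```

Any quote returned in `suspect` should be checked by hand or dropped before it reaches a draft.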

Anti-patterns

  • Single-source bias: drawing a conclusion from just one paper or one industry report.
  • Citation laundering: copy-pasting LLM-generated citations without verifying them.
  • Stale data: treating a 2-year-old report as current in a fast-moving field (LLMs, crypto).

🧪 Verification / duplicates

  • Verified against Cooper, Research Synthesis and Meta-Analysis, 5th ed., and the PRISMA 2020 guidelines.
  • Trust level: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: organized the primary-vs-secondary comparison, source taxonomy, LLM lit-review pipeline, citation graph, and trust scoring |