---
id: wiki-2026-0508-secondary-research
title: Secondary Research
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Desk Research, Literature Review, Existing-Data Analysis]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [research, methodology, literature-review, knowledge-synthesis]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: agnostic
  framework: research-methods
---

# Secondary Research

## One-line summary

> **"Secondary research = analysis of existing published or collected data."** It is the opposite of primary research, which collects new raw data. It covers literature reviews, meta-analyses, industry report analysis, and dataset reuse. As of 2026, LLM-assisted secondary research is the dominant workflow, cutting a single researcher's work from weeks to hours.

## Core concepts

### vs primary research

- **Primary**: data you collect yourself: surveys, interviews, experiments, observation. High control, high cost.
- **Secondary**: reuse of already-published material: books, papers, government statistics, industry reports, internal docs. Cheap and fast, but with little control over how the data was collected.

### Source taxonomy

- **Academic**: peer-reviewed papers and preprints (PubMed, arXiv, Google Scholar, Semantic Scholar, OpenAlex).
- **Government**: census, BLS, OECD, World Bank, KOSIS.
- **Industry**: Gartner, Forrester, IDC, McKinsey, CB Insights, Statista.
- **Internal**: company analytics, post-mortems, design docs.
- **Community**: HN, Reddit, GitHub, blog posts (lower trust, higher recency).

### Applications

1. **Lit review**: the background section of a new paper.
2. **Market analysis**: TAM/SAM/SOM estimation for a startup.
3. **Competitor research**: input to product strategy.
4. **Meta-analysis**: combining effect sizes across multiple studies.
5. **Due diligence**: background for an investment or acquisition.
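The meta-analysis application listed above, combining effect sizes across studies, can be illustrated with a minimal sketch of fixed-effect inverse-variance pooling. The function name `pooled_effect` and the sample numbers are hypothetical and not taken from any real study. A fixed-effect model assumes all studies estimate one shared true effect, so a random-effects model may be more appropriate when studies differ substantially.

```python
import math


def pooled_effect(effects, std_errors):
    """Fixed-effect inverse-variance pooling.

    Returns (pooled estimate, standard error of the pooled estimate).
    """
    if not effects or len(effects) != len(std_errors):
        raise ValueError("need one standard error per effect size")
    # Each study is weighted by the inverse of its variance (1 / SE^2),
    # so more precise studies contribute more to the pooled estimate.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total_weight
    return estimate, math.sqrt(1.0 / total_weight)


if __name__ == "__main__":
    # Hypothetical effect sizes and standard errors, for illustration only.
    est, se = pooled_effect([0.20, 0.40, 0.30], [0.10, 0.15, 0.20])
    print(f"pooled effect = {est:.3f}, SE = {se:.3f}")
```

The pooled standard error is always smaller than the smallest input standard error, which is the main payoff of combining studies rather than citing one.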
## 💻 Patterns

### Pattern 1: LLM-assisted lit review

```python
import asyncio

import anthropic

client = anthropic.AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment

# Sentinel the model is asked to return, so filtering does not depend on free-form wording.
NOT_RELEVANT = "NOT RELEVANT"


async def summarize_paper(abstract: str, question: str) -> str:
    msg = await client.messages.create(
        model="claude-opus-4-7",
        max_tokens=512,
        system="You are a careful research assistant. Cite verbatim.",
        messages=[{
            "role": "user",
            "content": (
                f"Question: {question}\n\nAbstract:\n{abstract}\n\n"
                "Is this relevant? If yes, extract key findings + methodology in 3 bullets. "
                f"If not, reply with exactly: {NOT_RELEVANT}"
            ),
        }],
    )
    return msg.content[0].text


async def lit_review(question: str, abstracts: list[str]) -> list[str]:
    # Summarize all abstracts concurrently, then drop the ones flagged as irrelevant.
    results = await asyncio.gather(*(summarize_paper(a, question) for a in abstracts))
    return [r for r in results if NOT_RELEVANT.lower() not in r.lower()]
```

### Pattern 2: arXiv / Semantic Scholar fetch

```python
import requests


def search_semantic_scholar(query: str, limit: int = 20) -> list[dict]:
    r = requests.get(
        "https://api.semanticscholar.org/graph/v1/paper/search",
        params={
            "query": query,
            "limit": limit,
            "fields": "title,abstract,year,authors,citationCount,openAccessPdf",
        },
        timeout=30,
    )
    r.raise_for_status()  # surface rate limits (HTTP 429) instead of failing on a missing key
    return r.json().get("data", [])
```

### Pattern 3: Citation graph traversal

```python
import requests


def expand_citations(seed_papers, depth=2):
    """Breadth-first walk over references; returns every paper ID seen, seeds included."""
    frontier = list(seed_papers)
    seen = {p["paperId"] for p in seed_papers}
    for _ in range(depth):
        next_frontier = []
        for paper in frontier:
            r = requests.get(
                f"https://api.semanticscholar.org/graph/v1/paper/{paper['paperId']}/references",
                params={"fields": "title,abstract,year,citationCount"},
                timeout=30,
            )
            r.raise_for_status()
            for ref in r.json().get("data") or []:
                cited = ref.get("citedPaper") or {}
                pid = cited.get("paperId")  # may be null for references not in the graph
                if pid and pid not in seen:
                    seen.add(pid)
                    next_frontier.append(cited)
        frontier = next_frontier
    return list(seen)
```

### Pattern 4: Source-trust scoring

```python
from datetime import date


def trust_score(source: dict) -> float:
    # Base trust by source type; unknown types fall back to 0.3.
    base = {
        "peer-reviewed": 0.9,
        "preprint": 0.7,
        "government": 0.85,
        "industry-paid": 0.6,
        "blog": 0.4,
        "social": 0.2,
    }.get(source["type"], 0.3)
    # Age decay: 5% per year, floored at 0.5; future-dated sources count as age 0.
    age_yrs = max(0, date.today().year - source["year"])
    decay = max(0.5, 1 - 0.05 * age_yrs)
    # Citation factor saturates at 100 citations.
    citations = min(1.0,
                    source.get("citations", 0) / 100)
    return base * decay * (0.6 + 0.4 * citations)
```

## Decision criteria

| Situation | Approach |
|---|---|
| Quick overview of a new topic | LLM survey + 5-10 review papers |
| Medical / safety claim | Cochrane / systematic review only |
| Market size estimation | Triangulate 3+ sources (Gartner + government + internal) |
| Historical trend | Government/longitudinal data |
| Cutting-edge tech | arXiv (acknowledge non-peer-reviewed) |

**Default**: diversify sources. Never trust a single source; triangulate across ≥3.

## 🔗 Graph

- Parent: [[Research Methodology]]
- Application: [[Literature Review]]
- Adjacent: [[Knowledge Synthesis]]

## 🤖 LLM usage

**When to use**: relevance filtering of abstracts, cross-paper synthesis, drafting a lit review.

**When not to**: as a source of citations. LLMs hallucinate citations, so always verify each one against the original source.

## ❌ Anti-patterns

- **Single-source bias**: drawing a conclusion from a single paper or a single industry report.
- **Citation laundering**: copy-pasting LLM-generated citations without verifying them.
- **Stale data**: treating a 2-year-old report as current in a fast-moving field (LLM, crypto).

## 🧪 Verification / duplicates

- Verified (Cooper *Research Synthesis and Meta-Analysis* 5th ed; PRISMA 2020 guidelines).
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: organized the primary-vs-secondary comparison, source taxonomy, LLM lit-review pipeline, citation graph, and trust scoring |