f8b21af4be
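Major cleanup of 10_Wiki/Topics: - removed 227 error-capture / incomplete stub documents - merged 43 cross-folder duplicate clusters (63 files → redirect) - normalized link names: fixed broken links, pointed redirects straight at their targets, and mapped concepts (~2,400 changes) - created 6 new category MOCs - removed 10,058 unresolved related-keyword links from Graph sections Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>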
159 lines
5.8 KiB
Markdown
---
id: wiki-2026-0508-secondary-research
title: Secondary Research
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Desk Research, Literature Review, Existing-Data Analysis]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [research, methodology, literature-review, knowledge-synthesis]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: agnostic
  framework: research-methods
---

# Secondary Research

## In one line

> **"Secondary research = analysis of existing published or collected data."** It is the counterpart of primary research, which collects new raw data. It covers literature reviews, meta-analyses, analysis of industry reports, and dataset reuse. As of 2026, LLM-assisted secondary research is the dominant mode: work that used to take a single researcher weeks now takes hours.

## Core

### vs. primary research

- **Primary**: data you collect yourself (surveys, interviews, experiments, observation). High control, high cost.
- **Secondary**: reuse of already-published material (books, papers, government statistics, industry reports, internal docs). Cheap and fast, but low control.

### Source taxonomy

- **Academic**: peer-reviewed papers and preprints (PubMed, arXiv, Google Scholar, Semantic Scholar, OpenAlex).
- **Government**: census, BLS, OECD, World Bank, KOSIS.
- **Industry**: Gartner, Forrester, IDC, McKinsey, CB Insights, Statista.
- **Internal**: company analytics, post-mortems, design docs.
- **Community**: HN, Reddit, GitHub, blog posts (lower trust, higher recency).

### Applications

1. **Literature review**: the background section of a new paper.
2. **Market analysis**: estimating TAM/SAM/SOM for a startup.
3. **Competitor research**: input to product strategy.
4. **Meta-analysis**: pooling effect sizes across multiple studies.
5. **Due diligence**: background for an investment or acquisition.

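Meta-analysis is the most mechanical of these applications. Below is a minimal sketch of fixed-effect inverse-variance pooling, assuming each study reports an effect size and its standard error. The function name `pool_fixed_effect` and the sample numbers are illustrative and do not come from any real study.

```python
import math


def pool_fixed_effect(effects: list[float], std_errors: list[float]) -> tuple[float, float]:
    """Inverse-variance weighted mean effect and its standard error (fixed-effect model)."""
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("need one standard error per effect size")
    weights = [1.0 / se**2 for se in std_errors]  # more precise studies get more weight
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


# Hypothetical effect sizes (e.g. standardized mean differences) from three studies.
pooled, se = pool_fixed_effect([0.30, 0.50, 0.40], [0.10, 0.20, 0.10])
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled effect = {pooled:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

A fixed-effect model assumes every study estimates the same underlying effect. When studies differ in population or design, a random-effects model is the usual alternative.
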
## 💻 Patterns

### Pattern 1: LLM-assisted lit review

```python
import asyncio

import anthropic

client = anthropic.AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment


async def summarize_paper(abstract: str, question: str) -> str:
    """Ask the model whether one abstract is relevant and, if so, summarize it."""
    msg = await client.messages.create(
        model="claude-opus-4-7",
        max_tokens=512,
        system="You are a careful research assistant. Cite verbatim.",
        messages=[{
            "role": "user",
            "content": (
                f"Question: {question}\n\nAbstract:\n{abstract}\n\n"
                "Is this relevant? If yes, extract key findings + methodology in 3 bullets. "
                "If no, reply with exactly: NOT RELEVANT"
            ),
        }],
    )
    return msg.content[0].text


async def lit_review(question: str, abstracts: list[str]) -> list[str]:
    # Summarize all abstracts concurrently, then drop the ones flagged as irrelevant.
    results = await asyncio.gather(*[summarize_paper(a, question) for a in abstracts])
    return [r for r in results if "not relevant" not in r.lower()]
```

### Pattern 2: arXiv / Semantic Scholar fetch

```python
import requests


def search_semantic_scholar(query: str, limit: int = 20) -> list[dict]:
    """Search Semantic Scholar and return the matching paper records."""
    r = requests.get(
        "https://api.semanticscholar.org/graph/v1/paper/search",
        params={
            "query": query,
            "limit": limit,
            "fields": "title,abstract,year,authors,citationCount,openAccessPdf",
        },
        timeout=30,
    )
    r.raise_for_status()  # surface HTTP errors (e.g. rate limiting) instead of a KeyError
    return r.json().get("data", [])
```

### Pattern 3: Citation graph traversal

```python
import requests


def expand_citations(seed_papers: list[dict], depth: int = 2) -> list[str]:
    """Breadth-first walk over references; returns the IDs of every paper reached."""
    frontier = list(seed_papers)
    seen = {p["paperId"] for p in seed_papers}
    for _ in range(depth):
        next_frontier = []
        for paper in frontier:
            r = requests.get(
                f"https://api.semanticscholar.org/graph/v1/paper/{paper['paperId']}/references",
                params={"fields": "title,abstract,year,citationCount"},
                timeout=30,
            )
            r.raise_for_status()
            for ref in r.json().get("data", []):
                pid = ref["citedPaper"]["paperId"]
                # Skip references without a resolved ID and papers already visited.
                if pid and pid not in seen:
                    seen.add(pid)
                    next_frontier.append(ref["citedPaper"])
        frontier = next_frontier
    return list(seen)
```

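### Pattern 4: Source-trust scoring

```python
from datetime import date


def trust_score(source: dict) -> float:
    """Heuristic trust in [0, 1]: base weight by source type, decayed by age, scaled by citations."""
    base = {
        "peer-reviewed": 0.9,
        "preprint": 0.7,
        "government": 0.85,
        "industry-paid": 0.6,
        "blog": 0.4,
        "social": 0.2,
    }.get(source["type"], 0.3)  # unknown source types fall back to 0.3
    # Use the current year rather than a hard-coded 2026 so the score does not go stale.
    age_yrs = max(0, date.today().year - source["year"])
    decay = max(0.5, 1 - 0.05 * age_yrs)  # lose 5% per year, floored at 0.5
    citations = min(1.0, source.get("citations", 0) / 100)  # saturates at 100 citations
    return base * decay * (0.6 + 0.4 * citations)
```
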
## Decision criteria

| Situation | Approach |
|---|---|
| Quick overview of a new topic | LLM survey + 5-10 review papers |
| Medical / safety claim | Cochrane / systematic reviews only |
| Market size estimation | Triangulate 3+ sources (Gartner + government + internal) |
| Historical trend | Government / longitudinal data |
| Cutting-edge tech | arXiv (acknowledge that it is not peer-reviewed) |

**Default**: diversify sources. Do not trust any single source; triangulate across at least 3.

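The triangulation default can be made concrete. Below is a minimal sketch, assuming each source yields one numeric estimate of the same quantity (a market size, for instance): it takes the median as the working figure and flags the set when the sources disagree by more than a chosen tolerance. The name `triangulate`, the 25% tolerance, and the sample figures are all hypothetical.

```python
from statistics import median


def triangulate(estimates: dict[str, float], tolerance: float = 0.25) -> dict:
    """Combine >= 3 independent, positive estimates; flag them if they disagree too much."""
    if len(estimates) < 3:
        raise ValueError("triangulation needs at least 3 sources")
    values = list(estimates.values())
    mid = median(values)
    # Relative spread: distance between the extremes as a fraction of the median.
    spread = (max(values) - min(values)) / mid
    return {"estimate": mid, "spread": spread, "consistent": spread <= tolerance}


# Hypothetical market-size estimates in USD billions.
result = triangulate({"analyst_report": 12.0, "government_stats": 10.0, "internal_model": 11.0})
print(result)
```

When `consistent` comes back false, the disagreement itself is the finding: check whether the sources define the quantity the same way before averaging anything.
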
## 🔗 Graph

- Parent: [[Research Methodology]]
- Application: [[Literature Review]]
- Adjacent: [[Knowledge Synthesis]]

## 🤖 LLM usage

**When**: relevance-filtering abstracts, cross-paper synthesis, drafting a literature review.
**When not**: as a source of citations. LLMs hallucinate references, so always verify each one against the original source.

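One way to act on the verification rule is to check every LLM-cited title against records you fetched yourself. Below is a minimal offline sketch using the standard library's `difflib`. The name `verify_citations`, the 0.9 similarity threshold, and the sample titles are assumptions for illustration. A fuzzy title match only suggests the paper exists; it does not show that the paper supports the claim.

```python
from difflib import SequenceMatcher


def _normalize(title: str) -> str:
    # Lowercase and collapse whitespace so trivial formatting differences don't matter.
    return " ".join(title.lower().split())


def verify_citations(cited_titles: list[str], known_titles: list[str],
                     threshold: float = 0.9) -> dict[str, bool]:
    """Map each cited title to True if it closely matches a title in the fetched corpus."""
    known = [_normalize(t) for t in known_titles]
    verdicts = {}
    for cited in cited_titles:
        target = _normalize(cited)
        best = max((SequenceMatcher(None, target, k).ratio() for k in known), default=0.0)
        verdicts[cited] = best >= threshold
    return verdicts


# Titles returned by a search API vs. titles an LLM cited.
# The second cited title is a made-up example of a hallucinated reference.
fetched = ["Attention Is All You Need", "Deep Residual Learning for Image Recognition"]
cited = ["Attention is all you need", "A Unified Theory of Everything in Transformers"]
verdicts = verify_citations(cited, fetched)
unverified = [t for t, ok in verdicts.items() if not ok]
print(unverified)
```

Anything left in `unverified` should be looked up by hand or dropped from the draft.
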
## ❌ Anti-patterns

- **Single-source bias**: drawing a conclusion from just one paper or one industry report.
- **Citation laundering**: copy-pasting LLM-generated citations without verifying them.
- **Stale data**: treating a 2-year-old report as current in a fast-moving field (LLMs, crypto).

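The stale-data anti-pattern can be caught mechanically. Below is a minimal sketch, assuming each source record carries a `year` and a `field`. The per-field age limits in `MAX_AGE_YEARS` and the sample records are illustrative placeholders, not established thresholds.

```python
from datetime import date
from typing import Optional

# Hypothetical maximum acceptable age (in years) per field; tune to your domain.
MAX_AGE_YEARS = {"llm": 1, "crypto": 1, "default": 5}


def is_stale(source: dict, current_year: Optional[int] = None) -> bool:
    """True if the source is older than the age limit for its field."""
    year_now = current_year if current_year is not None else date.today().year
    limit = MAX_AGE_YEARS.get(source.get("field", "default"), MAX_AGE_YEARS["default"])
    return year_now - source["year"] > limit


# Hypothetical source records.
sources = [
    {"title": "LLM market report", "field": "llm", "year": 2024},
    {"title": "Census summary", "field": "demographics", "year": 2022},
]
stale = [s["title"] for s in sources if is_stale(s, current_year=2026)]
print(stale)
```

Passing `current_year` explicitly keeps the check reproducible; leaving it out falls back to today's date.
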
## 🧪 Verification / duplicates

- Verified (Cooper *Research Synthesis and Meta-Analysis* 5th ed; PRISMA 2020 guidelines).
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: organized the primary-vs-secondary comparison, source taxonomy, LLM lit-review pipeline, citation graph, and trust scoring |