---
id: wiki-2026-0508-knowledge-graph
title: Knowledge Graph
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [KG, Semantic Graph, Entity Graph]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [knowledge-graph, graph, semantic, retrieval]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: networkx, neo4j, rdflib
---

# Knowledge Graph

## One-line summary

> **"A graph of entity-relationship triples."** A knowledge graph stores structured knowledge as a collection of (head, relation, tail) triples. Since Google introduced its Knowledge Graph in 2012, the approach has become a backbone of search, RAG, and agents; in the 2026 LLM era it is used with GraphRAG and entity linking to mitigate hallucination.

## Core concepts

### Triple structure

- (subject, predicate, object): the form used by the RDF standard
- Entity IDs (e.g. Wikidata Q-ids) give each entity a unique reference
- Relations are typed (employs, locatedIn, instanceOf, …)
- Property graphs: edges carry attributes as well

### Schema vs schema-less

- Ontology-driven: OWL, schema.org → strict typing
- LPG (labeled property graph): Neo4j, flexible schema
- Emergent KG: triples extracted automatically from unstructured text by an LLM

### Applications

1. Search ranking (Google KG panels).
2. RAG with GraphRAG (Microsoft 2024).
3. Agent tool: entity disambiguation.
4. Recommendation (LinkedIn Economic Graph).
5. Drug discovery (Hetionet).

## 💻 Patterns

### Building a KG with NetworkX

```python
import networkx as nx

# MultiDiGraph allows several typed edges between the same pair of nodes.
G = nx.MultiDiGraph()
G.add_edge("Anthropic", "Claude", relation="created")
G.add_edge("Claude", "LLM", relation="instanceOf")
G.add_edge("Anthropic", "San Francisco", relation="locatedIn")

# query: what did Anthropic create?
for _, target, data in G.out_edges("Anthropic", data=True):
    if data["relation"] == "created":
        print(target)  # Claude
```

### Triple extraction with an LLM

```python
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
text = "Claude Opus 4.7 was released by Anthropic in 2026."
resp = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": f"Extract (subject, predicate, object) triples as JSON from: {text}",
    }],
)
print(resp.content[0].text)
# e.g. [["Claude Opus 4.7", "releasedBy", "Anthropic"], ...]
```

### Neo4j Cypher query

```cypher
// outgoing 2-hop paths starting at Anthropic (capped at 100 rows)
MATCH (a:Org {name:"Anthropic"})-[r1]->(x)-[r2]->(y)
RETURN a, r1, x, r2, y
LIMIT 100;
```

### RDF + SPARQL

```python
from rdflib import Graph

g = Graph()
g.parse("dbpedia.ttl", format="turtle")

# The class and predicate IRIs are assumptions (DBpedia ontology terms);
# substitute the ones your data uses.
q = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?company WHERE {
    ?company a dbo:Company ;
             dbo:foundingYear "2021"^^xsd:gYear .
}
"""
for row in g.query(q):
    print(row.company)
```

### GraphRAG retrieval

```python
def graph_rag_query(q: str, kg, llm):
    # `kg` and `llm` are placeholder interfaces, not a specific library's API.
    entities = llm.extract_entities(q)           # entities mentioned in the question
    subgraph = kg.k_hop_subgraph(entities, k=2)  # neighborhood within 2 hops
    context = subgraph.to_text()                 # serialize triples for the prompt
    return llm.answer(q, context=context)
```

### Embedding-based KG completion (TransE)

```python
import torch
import torch.nn as nn

class TransE(nn.Module):
    def __init__(self, n_ent, n_rel, dim=128):
        super().__init__()
        self.ent = nn.Embedding(n_ent, dim)  # entity embeddings
        self.rel = nn.Embedding(n_rel, dim)  # relation embeddings

    def score(self, h, r, t):
        # TransE models a true triple as h + r ≈ t, so a score closer to 0
        # (smaller distance) means a more plausible triple.
        return -torch.norm(self.ent(h) + self.rel(r) - self.ent(t), p=2, dim=-1)
```

## Decision criteria

| Situation | Approach |
|---|---|
| Small domain, fast prototype | NetworkX in-memory |
| Production, ACID | Neo4j |
| W3C standards | RDF + SPARQL |
| Billion-scale, distributed | JanusGraph, TigerGraph |
| LLM RAG | GraphRAG (Microsoft) |

**Default**: Neo4j + LLM extraction pipeline.

## 🔗 Graph

- Parent: [[Knowledge-Graph-Foundations]] · [[Graph-Theory]]
- Variants: [[GraphRAG]] · [[Entity-Linking]] · [[Ontology]]
- Applications: [[RAG-Architecture]] · [[Semantic-Search]] · [[Recommendation-Systems]]
- Adjacent: [[Vector-Database]] · [[Embeddings]]

## 🤖 LLM usage

**When**: factual grounding, multi-hop reasoning, or entity disambiguation is required.

**When not**: only pure semantic similarity is needed; a vector DB is simpler.

## ❌ Anti-patterns

- **Schema explosion**: defining a new relation for every entity → unmanageable.
- **Stale KG**: manual curation with no automatic update pipeline → obsolete after about 6 months.
- **No entity resolution**: "Anthropic" vs "anthropic Inc." vs "ANTHROPIC" → duplicate nodes.
- **Triple-only thinking**: ignoring the edge attributes a property graph provides.

## 🧪 Verification / duplicates

- Verified (Bollacker 2008 Freebase, Hogan 2021 KG survey ACM CSUR).
- Trust level A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: KG fundamentals, triples, GraphRAG, Neo4j patterns |
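
The **No entity resolution** anti-pattern can be reduced by normalizing names before nodes are created. The sketch below is a minimal, assumption-laden illustration: `LEGAL_SUFFIXES`, `canonical_key`, `resolve_entities`, and `merge_triples` are hypothetical names, and plain string normalization is only a first pass (real pipelines typically add alias tables or embedding-based matching).

```python
import re
from collections import defaultdict

# Hypothetical suffix list (an assumption); extend it for your domain.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}

def canonical_key(name: str) -> str:
    """Case-fold, replace punctuation with spaces, drop trailing legal suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.casefold()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def resolve_entities(names):
    """Group surface forms that normalize to the same canonical key."""
    groups = defaultdict(list)
    for name in names:
        groups[canonical_key(name)].append(name)
    return dict(groups)

def merge_triples(triples):
    """Rewrite subjects and objects to canonical keys and drop duplicate triples."""
    return sorted({(canonical_key(s), p, canonical_key(o)) for s, p, o in triples})

mentions = ["Anthropic", "anthropic Inc.", "ANTHROPIC"]
print(resolve_entities(mentions))

raw = [
    ("Anthropic", "created", "Claude"),
    ("ANTHROPIC", "created", "Claude"),
    ("anthropic Inc.", "locatedIn", "San Francisco"),
]
print(merge_triples(raw))
```

All three mentions collapse to a single key, so the two `created` triples merge into one. In practice the canonical key would map to a stable entity ID (such as a Wikidata Q-id) rather than serve as the node name itself.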