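koriweb 95cd8bb891 feat(wiki): code grounding for 23 documents + 39 MOC learning maps

- Code grounding: automatically injects the actual repo implementation locations
  (file:line) and commits into the 'Applied cases' section of technical topic documents (e.g. document chunking strategy→connectai/src/retrieval/chunker.ts).
  An idempotent marker (CODE-GROUNDING) lets a re-run update the section in place.
- MOC: generates a _MOC.md learning map (entry points + insight notes) in each of 39 cluster folders.
Tools: Datacollect/scripts/{code_grounding,moc_generator}.mjs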

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-08 18:56:11 +09:00

5.4 KiB



Knowledge Graph

In one line

"A graph of entity-relationship triples." A knowledge graph is a graph-database paradigm that stores structured knowledge as a collection of (head, relation, tail) triples. Since Google introduced its Knowledge Graph in 2012, the approach has become a backbone of search, RAG, and agents; in the 2026 LLM era it is used for hallucination mitigation through GraphRAG and entity linking.

Key points

Triple structure

(subject, predicate, object): the RDF standard form
entity ID (e.g. Wikidata Q-id) → unique reference
typed relations (employs, locatedIn, instanceOf, …)
property graph: edges also carry attributes
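
To make the triple structure concrete, here is a minimal sketch of an in-memory triple store with wildcard pattern matching. The `Triple` and `TripleStore` names are illustrative assumptions, not part of any library; a real system would more likely use an RDF store or a graph database.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TripleStore:
    """Tiny in-memory store (hypothetical) indexed by subject for pattern lookups."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()
        self._by_subject: dict[str, set[Triple]] = defaultdict(set)

    def add(self, s: str, p: str, o: str) -> None:
        t = Triple(s, p, o)
        self._triples.add(t)
        self._by_subject[s].add(t)

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> list[Triple]:
        """Return matching triples in sorted order; None acts as a wildcard."""
        candidates = self._by_subject.get(s, set()) if s is not None else self._triples
        return sorted(
            t for t in candidates
            if (p is None or t.predicate == p) and (o is None or t.obj == o)
        )


store = TripleStore()
store.add("Anthropic", "created", "Claude")
store.add("Claude", "instanceOf", "LLM")
store.add("Anthropic", "locatedIn", "San Francisco")

print(store.match(s="Anthropic"))     # every fact about Anthropic
print(store.match(p="instanceOf"))    # every instanceOf fact
```

Indexing by subject only keeps the sketch short; stores that answer arbitrary patterns efficiently usually maintain several permutation indexes.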

Schema vs Schema-less

ontology-driven: OWL, schema.org → strict typing
LPG (labeled property graph): Neo4j, flexible
emergent KG: extracted automatically from unstructured text with an LLM
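
The difference between strict typing and a schema-less graph can be sketched as a domain/range check on each incoming triple. The `SCHEMA` and `ENTITY_TYPES` tables below are made-up illustrations, not OWL or schema.org definitions.

```python
# Hypothetical mini-ontology: each relation declares the entity type it
# accepts as subject (domain) and as object (range).
SCHEMA = {
    "created": ("Org", "Product"),
    "locatedIn": ("Org", "Place"),
}

# Hypothetical entity typing, as an ontology-driven KG would maintain it.
ENTITY_TYPES = {
    "Anthropic": "Org",
    "Claude": "Product",
    "San Francisco": "Place",
}


def validate(subject: str, relation: str, obj: str, strict: bool = True) -> bool:
    """Return True if the triple is acceptable under the chosen mode."""
    if relation not in SCHEMA:
        # Strict mode rejects unknown relations; schema-less mode lets them through.
        return not strict
    domain, range_ = SCHEMA[relation]
    # Known relations are type-checked in both modes.
    return ENTITY_TYPES.get(subject) == domain and ENTITY_TYPES.get(obj) == range_


print(validate("Anthropic", "created", "Claude"))                  # well-typed triple
print(validate("Claude", "locatedIn", "Anthropic"))                # wrong domain and range
print(validate("Anthropic", "fundedBy", "Someone", strict=False))  # unknown relation, tolerated
```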

Applications

Search ranking (Google KG panels).
RAG with GraphRAG (Microsoft 2024).
Agent tool: entity disambiguation.
Recommendation (LinkedIn Economic Graph).
Drug discovery (Hetionet).

💻 Patterns

Building a KG with NetworkX

import networkx as nx

G = nx.MultiDiGraph()
G.add_edge("Anthropic", "Claude", relation="created")
G.add_edge("Claude", "LLM", relation="instanceOf")
G.add_edge("Anthropic", "San Francisco", relation="locatedIn")

# query: what did Anthropic create?
for _, target, data in G.out_edges("Anthropic", data=True):
    if data["relation"] == "created":
        print(target)  # Claude

Extracting triples with an LLM

from anthropic import Anthropic

client = Anthropic()
text = "Claude Opus 4.7 was released by Anthropic in 2026."

resp = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": f"Extract (subject, predicate, object) triples as JSON from: {text}"
    }]
)
# resp.content[0].text is expected to hold JSON such as
# [["Claude Opus 4.7","releasedBy","Anthropic"], ...]; the exact shape is not guaranteed
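
Model replies are free text, so the extracted triples usually need to be parsed and validated before they enter the graph. The sketch below assumes the reply contains a JSON array of 3-element string arrays, possibly surrounded by prose; `parse_triples` is a hypothetical helper, not part of the Anthropic SDK.

```python
import json
import re


def parse_triples(raw: str) -> list[tuple[str, str, str]]:
    """Parse the span from the first '[' to the last ']' and keep valid triples."""
    match = re.search(r"\[.*\]", raw, flags=re.DOTALL)
    if match is None:
        return []
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    triples = []
    for item in data:
        # Keep only 3-element lists of non-empty strings; drop everything else.
        if (isinstance(item, list) and len(item) == 3
                and all(isinstance(x, str) and x.strip() for x in item)):
            triples.append(tuple(x.strip() for x in item))
    return triples


reply = 'Here are the triples:\n[["Claude Opus 4.7", "releasedBy", "Anthropic"], ["bad"]]'
print(parse_triples(reply))
```

Malformed items are silently dropped here; a production pipeline would more likely log them or re-prompt the model.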

Neo4j Cypher query

// find all 2-hop neighbors of Anthropic
MATCH (a:Org {name:"Anthropic"})-[r1]->(x)-[r2]->(y)
RETURN a, r1, x, r2, y
LIMIT 100;

RDF + SPARQL

from rdflib import Graph

g = Graph()
g.parse("dbpedia.ttl", format="turtle")  # local Turtle dump

# Declare xsd explicitly so the query does not depend on default prefix bindings
q = """
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?company WHERE {
    ?company a <http://dbpedia.org/ontology/Company> ;
             <http://dbpedia.org/property/foundedYear> "2021"^^xsd:gYear .
}
"""
for row in g.query(q):
    print(row.company)

GraphRAG retrieval

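def graph_rag_query(q: str, kg, llm):
    # kg and llm are placeholder interfaces, not a specific library's API
    entities = llm.extract_entities(q)           # 1. link the question to KG entities
    subgraph = kg.k_hop_subgraph(entities, k=2)  # 2. pull the 2-hop neighborhood
    context = subgraph.to_text()                 # 3. serialize triples for the prompt
    return llm.answer(q, context=context)        # 4. answer grounded in the subgraph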
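
The retrieval step in the function above can be sketched without any external dependency. This version assumes the KG is a plain list of triples and follows edges in both directions; `k_hop_subgraph` and `to_text` are illustrative stand-ins for the placeholder methods, not Microsoft GraphRAG's actual implementation.

```python
from collections import defaultdict, deque

Triple = tuple[str, str, str]


def k_hop_subgraph(triples: list[Triple], seeds: list[str], k: int = 2) -> list[Triple]:
    """Collect triples reachable within k hops of any seed (edges followed both ways)."""
    adjacent: dict[str, list[Triple]] = defaultdict(list)
    for s, p, o in triples:
        adjacent[s].append((s, p, o))
        adjacent[o].append((s, p, o))

    seen_nodes = set(seeds)
    selected: list[Triple] = []
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        node, depth = queue.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for triple in adjacent.get(node, []):
            if triple not in selected:
                selected.append(triple)
            s, _, o = triple
            neighbor = o if s == node else s
            if neighbor not in seen_nodes:
                seen_nodes.add(neighbor)
                queue.append((neighbor, depth + 1))
    return selected


def to_text(triples: list[Triple]) -> str:
    """Serialize triples as one plain sentence per line for the prompt context."""
    return "\n".join(f"{s} {p} {o}." for s, p, o in triples)


kg = [
    ("Anthropic", "created", "Claude"),
    ("Claude", "instanceOf", "LLM"),
    ("Anthropic", "locatedIn", "San Francisco"),
    ("LLM", "subclassOf", "Model"),
]
print(to_text(k_hop_subgraph(kg, ["Anthropic"], k=2)))
```

With `k=2` the `LLM subclassOf Model` fact is left out because it sits three hops from the seed, which is how the hop limit bounds the context size.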

Embedding-based KG completion (TransE)

import torch
import torch.nn as nn

class TransE(nn.Module):
    def __init__(self, n_ent, n_rel, dim=128):
        super().__init__()
        self.ent = nn.Embedding(n_ent, dim)
        self.rel = nn.Embedding(n_rel, dim)
    def score(self, h, r, t):
        return -torch.norm(self.ent(h) + self.rel(r) - self.ent(t), p=2, dim=-1)
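
The score above is the negative L2 distance between h + r and t, so a higher score means a more plausible triple. TransE is usually trained with a margin ranking loss against corrupted triples; the dependency-free sketch below works through that arithmetic with hand-picked 2-D vectors, which are assumptions chosen only for illustration.

```python
import math

# Hypothetical 2-D embeddings chosen by hand so that Anthropic + created = Claude.
ENT = {
    "Anthropic": [1.0, 0.0],
    "Claude": [1.0, 1.0],
    "San Francisco": [3.0, -2.0],
}
REL = {"created": [0.0, 1.0]}


def score(h: str, r: str, t: str) -> float:
    """TransE score: negative L2 distance between h + r and t."""
    diff = [hv + rv - tv for hv, rv, tv in zip(ENT[h], REL[r], ENT[t])]
    return -math.sqrt(sum(d * d for d in diff))


def margin_loss(pos: tuple[str, str, str], neg: tuple[str, str, str],
                margin: float = 1.0) -> float:
    """Margin ranking loss: zero once the true triple beats the corrupted one by `margin`."""
    return max(0.0, margin - score(*pos) + score(*neg))


true_triple = ("Anthropic", "created", "Claude")
corrupted = ("Anthropic", "created", "San Francisco")  # tail swapped for a wrong entity
print(score(*true_triple), score(*corrupted), margin_loss(true_triple, corrupted))
```

Here the true triple already outscores the corrupted one by more than the margin, so the loss is zero and no gradient would flow for this pair.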

Decision criteria

Situation	Approach
작은 domain, fast prototype	NetworkX in-memory
production, ACID	Neo4j
W3C standards	RDF + SPARQL
billion-scale, distributed	JanusGraph, TigerGraph
LLM RAG	GraphRAG (Microsoft)

Default: Neo4j + LLM extraction pipeline.

🔗 Graph

Parent: Knowledge Graph · Graph_Theory
Variants: GraphRAG · Ontology
Applications: Semantic Search · Recommendation-Systems
Adjacent: Embeddings

🤖 LLM usage

When to use: factual grounding, multi-hop reasoning, or entity disambiguation is required. When not to use: only pure semantic similarity is needed; a vector DB is simpler.

❌ Anti-patterns

Schema explosion: defining a new relation for every entity → unmanageable.
Stale KG: manual curation with no automatic update pipeline → obsolete after 6 months.
No entity resolution: "Anthropic" vs "anthropic Inc." vs "ANTHROPIC" → duplicate nodes.
Triple-only thinking: ignoring the edge attributes of a property graph.
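
The entity-resolution anti-pattern can be avoided with even a simple normalization pass before nodes are created. The following sketch maps surface forms to a canonical key; the suffix list and helper names are assumptions, and real pipelines typically add alias tables or embedding-based matching on top.

```python
import re
import unicodedata

# Hypothetical list of corporate suffixes to drop during normalization.
SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def canonical_key(name: str) -> str:
    """Normalize a surface form: Unicode-fold, casefold, strip punctuation and suffixes."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    tokens = re.findall(r"\w+", folded)
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def resolve(names: list[str]) -> dict[str, str]:
    """Map every surface form to the first-seen name sharing its canonical key."""
    canonical: dict[str, str] = {}
    return {n: canonical.setdefault(canonical_key(n), n) for n in names}


print(resolve(["Anthropic", "anthropic Inc.", "ANTHROPIC", "Neo4j"]))
```

All three spellings of Anthropic collapse onto one node name, while unrelated names keep their own key.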

🧪 Verification / duplicates

Verified (Bollacker 2008 Freebase, Hogan 2021 KG survey ACM CSUR).
Trust level: A.

🕓 Changelog

Date	Change
2026-05-08	Phase 1
2026-05-10	Manual cleanup — KG fundamentals, triples, GraphRAG, Neo4j patterns

🛠️ Applied cases (Applied in summary)

🔎 Codebase evidence (auto-extracted from the E:\Wiki repo)

Actual implementation/usage locations:

connectai/src/features/projectChronicle/guardPrompt.ts:57 — [Omitted long matching line]

Related commits:

connectai d843364 feat: add premium matrix styling to knowledge graph, glowing nodes, and directional particle flow across synapses
connectai 279e671 feat: parse real workspace files for knowledge graph topology and add organic organic movement

Auto-generated by code_grounding.mjs · refreshed on re-run
