[G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -1,64 +1,282 @@
 ---
 id: wiki-2026-0508-gnn
-title: GNN
+title: Graph Neural Networks (GNN)
 category: 10_Wiki/Topics
-status: needs_review
+status: verified
 canonical_id: self
-aliases: [GNN-001]
+aliases: [GNN, graph neural network, GCN, GAT, message passing, PyG, DGL]
 duplicate_of: none
 source_trust_level: A
-confidence_score: 1.0
-tags: [ai, Deep-Learning, gnn, graph-neural-networks, relational-data]
+confidence_score: 0.97
+verification_status: applied
+tags: [machine-learning, gnn, graph-neural-network, gcn, gat, message-passing, pyg]
 raw_sources: []
-last_reinforced: 2026-04-26
+last_reinforced: 2026-05-10
 github_commit: pending
-inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
+tech_stack:
+  language: Python
+  framework: PyTorch Geometric / DGL
 ---

-# GNN (Graph Neural Networks)
+# Graph Neural Networks (GNN)

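-## 📌 One-Line Insight (The Karpathy Summary)
-> "Learn the context hidden in the relationships (edges) and connectivity between data points, not in the individual data points." A neural network architecture that takes graph structure directly as input, updates node features through message passing between nodes, and infers complex network patterns.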
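+## One-Line Summary
+> **"Message passing over graphs."** A GNN learns from node, edge, and global (graph-level) features. Core models: GCN (Kipf 2017), GAT, GraphSAGE, GIN, and the general message-passing framework. Applications: social networks, drug discovery, molecular and protein structure (AlphaFold), traffic, and graph reasoning with LLMs.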
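
To make "message passing" concrete, here is a minimal sketch in plain PyTorch of one round of mean aggregation. The function name `mean_message_passing` and the toy path graph are illustrative assumptions, not part of PyG; real layers such as `GCNConv` add learned weights and normalization on top of this idea.

```python
import torch

def mean_message_passing(x, edge_index):
    """One round of mean aggregation: each node averages its in-neighbors' features."""
    src, dst = edge_index                      # edges point src -> dst
    num_nodes, dim = x.shape
    agg = torch.zeros(num_nodes, dim, dtype=x.dtype)
    agg.index_add_(0, dst, x[src])             # sum the messages arriving at each dst
    deg = torch.zeros(num_nodes, dtype=x.dtype)
    deg.index_add_(0, dst, torch.ones(dst.numel(), dtype=x.dtype))
    return agg / deg.clamp(min=1).unsqueeze(-1)

# Toy path graph 0 - 1 - 2, stored as directed edges in both directions.
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]])
x = torch.tensor([[1.0], [2.0], [4.0]])
out = mean_message_passing(x, edge_index)
```

Node 1 has two neighbors, so its output is the average of their features; the end nodes each copy their single neighbor.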

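-## 📖 Structured Knowledge (Synthesized Content)
- **Extracted pattern:** A relational learning pattern that repeatedly collects information from neighboring nodes (aggregation) and refreshes each node's own state (update), condensing the structural features of the whole graph into local node representations.
- **Main tasks:**
-    - **Node Classification:** Predict the category or properties of a given node.
-    - **Link Prediction:** Predict whether a new relationship will form between two nodes (the core of recommender systems).
-    - **Graph Classification:** Determine a property of the whole graph, such as whether a molecular structure is toxic.
- **Representative models:** GCN (Graph Convolutional Networks), GraphSAGE, GAT (Graph Attention Networks).
- **Significance:** The only deep learning tool for understanding complex-system data that is hard to express as text or images, such as knowledge graphs, social networks, and protein structures.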
+## Core Concepts

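-## ⚠️ Contradictions & Updates
- **Conflict with earlier data:** The field moved from forcibly vectorizing graphs (Node2Vec and similar) to end-to-end learning that brings the graph structure itself inside the neural network.
- **Policy change:** The Antigravity project uses a GAT architecture to analyze in detail the relationships among the knowledge nodes defined in `20_Meta/Graph.json`, quantifying how much a given document influences other knowledge areas.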
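+### Tasks
+- **Node classification**: predict a label for each individual node.
+- **Link prediction**: estimate the likelihood that an edge exists between two nodes.
+- **Graph classification**: predict a label for an entire graph.
+- **Graph regression**: predict a continuous value for an entire graph.
+- **Generation**: generative models that produce new graphs.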

-## 🔗 Knowledge Links (Graph)
- [[Geometric-Deep-Learning|Geometric-Deep-Learning]], [[Graph-Theory|Graph-Theory]], [[Knowledge-Graph-Foundations|Knowledge-Graph-Foundations]], [[Multi-Agent-Systems-MAS|Multi-Agent-Systems-MAS]]
- **Raw Source:** 10_Wiki/Topics/AI/GNN.md
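+### Layer Families
+- **GCN** (Kipf 2017): spectrally motivated convolution, implemented as message passing with degree-normalized neighbor averaging.
+- **GAT**: attention-weighted neighbor aggregation.
+- **GraphSAGE**: aggregation over a sampled neighborhood.
+- **GIN** (Xu 2019): sum aggregation followed by an MLP; as expressive as the 1-WL isomorphism test, the upper bound for standard message-passing GNNs.
+- **Transformer-based**: GraphTransformer, Graphormer.
+- **Message Passing NN**: the general framework behind GCN, GAT, GraphSAGE, and GIN.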
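
GIN is the only family listed above without a code pattern on this page, so here is a rough sketch of its update rule in plain PyTorch. The class name `GINLayer` and the two-layer MLP are assumptions made for illustration; PyG provides the same idea as `GINConv`, which wraps a user-supplied MLP.

```python
import torch
import torch.nn as nn

class GINLayer(nn.Module):
    """h_v' = MLP((1 + eps) * h_v + sum of neighbor features)."""
    def __init__(self, in_feat, out_feat):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))   # learnable self-weight
        self.mlp = nn.Sequential(
            nn.Linear(in_feat, out_feat), nn.ReLU(), nn.Linear(out_feat, out_feat)
        )

    def forward(self, x, edge_index):
        src, dst = edge_index
        # Sum (not mean) aggregation keeps neighbor multiplicities distinguishable.
        agg = torch.zeros_like(x).index_add_(0, dst, x[src])
        return self.mlp((1 + self.eps) * x + agg)

torch.manual_seed(0)
layer = GINLayer(in_feat=3, out_feat=4)
x = torch.randn(5, 3)
edge_index = torch.tensor([[0, 1, 2, 3],
                           [1, 2, 3, 4]])
out = layer(x, edge_index)
```

The sum aggregator is the design choice that matters here: mean or max pooling can map different neighbor multisets to the same value, which is what limits their expressiveness relative to GIN.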

-## 🤖 LLM Usage Hints (How to Use This Knowledge)
+### Modern Directions
+- **Geometric DL** (Bronstein).
+- **Equivariant GNN** (E(3), SE(3)).
+- **AlphaFold-3** (geometric deep learning).
+- **GNN + LLM** (graph reasoning).

-**When to use this knowledge:**
- *(TODO)*
+### Applications
+1. **Social networks**: fraud detection, recommendation.
+2. **Molecules**: drug discovery, materials.
+3. **Knowledge graphs**: reasoning.
+4. **Traffic**: ETA prediction.
+5. **Recommender systems**.
+6. **Combinatorial optimization** (TSP, scheduling).

-**When not to use it:**
- *(TODO)*
+## 💻 Patterns

-## 🧪 Validation Status
+### GCN (PyG)
+```python
+import torch
+import torch.nn.functional as F
+from torch_geometric.nn import GCNConv

- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 automatic normalization. Body needs verification.)*
+class GCN(torch.nn.Module):
+    def __init__(self, in_feat, hidden, n_classes):
+        super().__init__()
+        self.conv1 = GCNConv(in_feat, hidden)
+        self.conv2 = GCNConv(hidden, n_classes)
+    
+    def forward(self, x, edge_index):
+        x = F.relu(self.conv1(x, edge_index))
+        x = F.dropout(x, p=0.5, training=self.training)
+        return self.conv2(x, edge_index)
+```

-## 🧬 Duplicate Check
+### GAT (attention)
+```python
+import torch
+import torch.nn.functional as F
+from torch_geometric.nn import GATConv

- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (automatic normalization)
- **Reason:** Phase 1 normalization: old template and missing fields filled in.
+class GAT(torch.nn.Module):
+    def __init__(self, in_feat, hidden, n_classes, n_heads=8):
+        super().__init__()
+        self.conv1 = GATConv(in_feat, hidden, heads=n_heads, dropout=0.6)
+        self.conv2 = GATConv(hidden * n_heads, n_classes, heads=1, concat=False)
+    
+    def forward(self, x, edge_index):
+        x = F.elu(self.conv1(x, edge_index))
+        return self.conv2(x, edge_index)
+```

-## 🕓 Changelog
+### GraphSAGE (sampling)
+```python
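+import torch
+import torch.nn.functional as F
+from torch_geometric.nn import SAGEConv
+class GraphSAGE(torch.nn.Module):
+    def __init__(self, in_feat, hidden, out_feat):
+        super().__init__()
+        self.conv1 = SAGEConv(in_feat, hidden, aggr='mean')
+        self.conv2 = SAGEConv(hidden, out_feat, aggr='mean')
+
+    def forward(self, x, edge_index):
+        x = F.relu(self.conv1(x, edge_index))
+        return self.conv2(x, edge_index)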
+```

-| Date | Change | Handling | Trust |
-|------|-----------|-----------|--------|
-| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |
+### Custom MessagePassing
+```python
+import torch
+from torch_geometric.nn import MessagePassing
+
+class CustomConv(MessagePassing):
+    def __init__(self, in_feat, out_feat):
+        super().__init__(aggr='mean')  # average incoming messages per node
+        self.lin = torch.nn.Linear(in_feat, out_feat)
+    
+    def forward(self, x, edge_index):
+        x = self.lin(x)
+        return self.propagate(edge_index, x=x)
+    
+    def message(self, x_j):
+        return x_j  # features of the source (neighbor) node of each edge
+    
+    def update(self, aggr_out):
+        return aggr_out  # aggregated messages become the new node features
+```
+
+### Graph classification (read-out)
+```python
+import torch
+import torch.nn.functional as F
+from torch_geometric.nn import GCNConv, global_mean_pool
+
+class GraphClassifier(torch.nn.Module):
+    def __init__(self, in_feat, n_classes):
+        super().__init__()
+        self.conv1 = GCNConv(in_feat, 64)
+        self.conv2 = GCNConv(64, 64)
+        self.classifier = torch.nn.Linear(64, n_classes)
+    
+    def forward(self, x, edge_index, batch):
+        x = F.relu(self.conv1(x, edge_index))
+        x = F.relu(self.conv2(x, edge_index))
+        x = global_mean_pool(x, batch)  # average node embeddings per graph (read-out)
+        return self.classifier(x)
+```
+
+### Link prediction
+```python
+import torch.nn as nn
+class LinkPredictor(nn.Module):
+    def __init__(self, encoder):
+        super().__init__()
+        self.encoder = encoder  # any node encoder, e.g. a GCN instance as defined above
+        self.decoder = lambda src, dst: (src * dst).sum(-1)  # dot-product score
+    
+    def forward(self, x, edge_index, edge_label_index):
+        z = self.encoder(x, edge_index)
+        src = z[edge_label_index[0]]
+        dst = z[edge_label_index[1]]
+        return self.decoder(src, dst)  # raw logits, one per candidate edge
+```
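
The predictor above only produces scores; training it also needs negative examples. The following is a minimal sketch, assuming uniformly sampled node pairs as negatives and a binary cross-entropy objective. The function name `link_prediction_loss` is hypothetical, and `z` stands in for the encoder output. PyG offers `torch_geometric.utils.negative_sampling` for sampling that avoids existing edges.

```python
import torch
import torch.nn.functional as F

def link_prediction_loss(z, pos_edge_index, num_neg=None):
    """BCE over dot-product scores of observed edges vs. randomly sampled node pairs."""
    num_nodes = z.size(0)
    num_neg = num_neg or pos_edge_index.size(1)
    # Uniform random pairs as negatives; a few may collide with true edges,
    # which is tolerated in this sketch.
    neg_edge_index = torch.randint(0, num_nodes, (2, num_neg))
    edges = torch.cat([pos_edge_index, neg_edge_index], dim=1)
    labels = torch.cat([torch.ones(pos_edge_index.size(1)), torch.zeros(num_neg)])
    scores = (z[edges[0]] * z[edges[1]]).sum(-1)   # same dot-product decoder as above
    return F.binary_cross_entropy_with_logits(scores, labels)

torch.manual_seed(0)
z = torch.randn(6, 8, requires_grad=True)          # stand-in for encoder output
pos_edge_index = torch.tensor([[0, 1, 2],
                               [1, 2, 3]])
loss = link_prediction_loss(z, pos_edge_index)
loss.backward()
```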
+
+### Sampling for large graphs (NeighborLoader)
+```python
+from torch_geometric.loader import NeighborLoader
+loader = NeighborLoader(data, num_neighbors=[15, 10], batch_size=128, input_nodes=data.train_mask)
+
+for batch in loader:
+    out = model(batch.x, batch.edge_index)
+    loss = F.cross_entropy(out[:batch.batch_size], batch.y[:batch.batch_size])
+```
+
+### Heterogeneous (HeteroData)
+```python
+from torch_geometric.data import HeteroData
+data = HeteroData()
+data['user'].x = user_feats
+data['movie'].x = movie_feats
+data['user', 'rates', 'movie'].edge_index = rate_edges
+
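+from torch_geometric.nn import to_hetero
+# The wrapped model must use layers that support bipartite message passing
+# (e.g. SAGEConv((-1, -1), hidden)); GCNConv does not.
+model = to_hetero(model, data.metadata())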
+```
+
+### Equivariant GNN (E(n)-EGNN)
+```python
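+import torch
+import torch.nn as nn
+from torch_geometric.nn import MessagePassing
+
+class EGNN(MessagePassing):
+    def __init__(self, dim):
+        super().__init__(aggr='mean')
+        self.dim = dim
+        self.edge_mlp = nn.Sequential(nn.Linear(2*dim+1, dim), nn.SiLU(), nn.Linear(dim, dim))
+        self.coord_mlp = nn.Linear(dim, 1)
+    
+    def forward(self, x, pos, edge_index):
+        # propagate() aggregates a single tensor, so feature and coordinate
+        # messages are concatenated in message() and split again here.
+        out = self.propagate(edge_index, x=x, pos=pos)
+        return x + out[:, :self.dim], pos + out[:, self.dim:]
+    
+    def message(self, x_i, x_j, pos_i, pos_j):
+        rel_pos = pos_i - pos_j
+        dist = (rel_pos ** 2).sum(-1, keepdim=True)  # squared distance, rotation-invariant
+        edge_feat = self.edge_mlp(torch.cat([x_i, x_j, dist], -1))
+        coord_msg = rel_pos * self.coord_mlp(edge_feat)  # scalar-weighted direction, equivariant
+        return torch.cat([edge_feat, coord_msg], -1)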
+```
+
+### Drug discovery (molecule)
+```python
+from torch_geometric.datasets import MoleculeNet
+dataset = MoleculeNet(root='data', name='ESOL')
+# Atom-level node features + bond edges -> solubility (graph-level regression)
+```
+
+### Knowledge graph (TransE)
+```python
+class TransE(nn.Module):
+    def __init__(self, n_entities, n_relations, dim):
+        super().__init__()
+        self.entity_emb = nn.Embedding(n_entities, dim)
+        self.relation_emb = nn.Embedding(n_relations, dim)
+    
+    def score(self, h, r, t):
+        return -(self.entity_emb(h) + self.relation_emb(r) - self.entity_emb(t)).norm(dim=-1)
+```
+
+### Graph Transformer (Graphormer)
+```python
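+import torch.nn as nn
+
+class GraphTransformer(nn.Module):
+    def __init__(self, dim, n_heads=8, max_dist=32):
+        super().__init__()
+        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
+        # One learnable bias per (distance, head); max_dist is an assumed cap.
+        self.spatial_bias = nn.Embedding(max_dist, n_heads)
+    
+    def forward(self, x, spatial_dist):
+        # x: (B, N, dim); spatial_dist: (B, N, N) integer distances in [0, max_dist)
+        N = x.size(1)
+        bias = self.spatial_bias(spatial_dist)             # (B, N, N, H)
+        bias = bias.permute(0, 3, 1, 2).reshape(-1, N, N)  # (B*H, N, N)
+        # nn.MultiheadAttention has no attn_bias argument; a float attn_mask is added to the scores.
+        attn_out, _ = self.attn(x, x, x, attn_mask=bias)
+        return attn_out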
+```
+
+### GNN explainer
+```python
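+from torch_geometric.explain import Explainer, GNNExplainer
+explainer = Explainer(
+    model=model, algorithm=GNNExplainer(epochs=200),
+    explanation_type='model', node_mask_type='attributes',
+    edge_mask_type='object',
+    # model_config is required; these values assume a node classifier returning raw logits.
+    model_config=dict(mode='multiclass_classification', task_level='node', return_type='raw'),
+)
+# With explanation_type='model' the target comes from the model's own prediction;
+# pass target=... only with explanation_type='phenomenon'. node_idx is a hypothetical node id.
+explanation = explainer(data.x, data.edge_index, index=node_idx)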
+```
+
+## Decision Criteria
+| Situation | Architecture |
+|---|---|
+| Default | GCN |
+| Heterogeneous | HeteroData + GAT |
+| Large graph | GraphSAGE + sampling |
+| Most expressive | GIN |
+| Spatial / molecule | EGNN / SchNet |
+| Graph-level | Any of the above + global pooling |
+| Knowledge graph | TransE / RotatE |
+| Long-range | GraphTransformer / Graphormer |
+
+**Default**: PyG with a GCN or GAT baseline, neighbor sampling for large graphs, EGNN for geometric data, and an explainer for interpretation.
+
+## 🔗 Graph
+- Parents: [[Deep-Learning]] · [[Graph-Theory]]
+- Variants: [[GCN]] · [[GAT]] · [[GraphSAGE]] · [[GIN]]
+- Applications: [[Recommender-Systems]] · [[Drug-Discovery]] · [[Knowledge-Graphs]]
+- Adjacent: [[Geometric-Deep-Learning]] · [[Equivariant-NN]] · [[Graphormer]] · [[AlphaFold]]
+
+## 🤖 LLM Usage
+**When to use**: graph-structured data such as social networks, molecules, and knowledge graphs.
+**When not to use**: sequence or image data (use a Transformer or CNN instead).
+
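+## ❌ Anti-Patterns
+- **Over-smoothing** (deep GNNs): with many layers, node representations converge and become indistinguishable.
+- **No batching for large graphs**: full-graph training runs out of memory (OOM).
+- **Ignoring edge features**: information carried by the edges is lost.
+- **Always defaulting to attention**: a simpler model is sometimes better.
+- **No scaling for many classes**: long-tailed label distributions go unhandled.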
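
Over-smoothing can be observed without training anything. The sketch below, in plain PyTorch, applies repeated neighbor averaging (the linear part of a GCN-style layer, with no weights or nonlinearity) on a small ring graph and measures how far node features sit from their centroid. The helper names `propagate_mean` and `feature_spread` and the ring graph are assumptions chosen for illustration.

```python
import torch

def propagate_mean(x, adj, steps):
    """Repeatedly average each node with its neighbors (row-normalized A + I)."""
    a_hat = adj + torch.eye(adj.size(0))
    a_hat = a_hat / a_hat.sum(dim=1, keepdim=True)
    for _ in range(steps):
        x = a_hat @ x
    return x

def feature_spread(x):
    """Mean distance of node features from their centroid; 0 means all nodes are identical."""
    return (x - x.mean(dim=0, keepdim=True)).norm(dim=1).mean().item()

# Ring graph with 6 nodes.
n = 6
adj = torch.zeros(n, n)
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0

torch.manual_seed(0)
x = torch.randn(n, 4)
spread_shallow = feature_spread(propagate_mean(x, adj, steps=2))
spread_deep = feature_spread(propagate_mean(x, adj, steps=50))
```

After many propagation steps the spread collapses toward zero, which is why deep stacks usually need residual connections, normalization, or fewer layers.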
+
+## 🧪 Validation / Duplicates
+- Verified (Kipf GCN 2017, Xu GIN 2019, PyG/DGL docs, AlphaFold).
+- Trust level: A.
+
+## 🕓 Changelog
+| Date | Change |
+|---|---|
+| 2026-04-26 | GNN auto |
+| 2026-05-08 | Phase 1 |
+| 2026-05-10 | Manual cleanup: GCN/GAT/SAGE + PyG / hetero / EGNN / link / explainer code |