[G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -2,87 +2,235 @@
 id: wiki-2026-0508-boltzmann-machines
 title: Boltzmann Machines
 category: 10_Wiki/Topics
-status: needs_review
+status: verified
 canonical_id: self
-aliases: [BOLTZMANN-001]
+aliases: [볼츠만 머신, RBM, restricted Boltzmann machine, deep belief network, energy-based model, contrastive divergence]
 duplicate_of: none
 source_trust_level: A
-confidence_score: 1.0
-tags: [ai, Deep-Learning, neural-networks, energy-based-model, statistical-mechanics]
+confidence_score: 0.88
+verification_status: applied
+tags: [boltzmann-machine, rbm, energy-based-model, deep-learning-history, hinton, contrastive-divergence]
 raw_sources: []
-last_reinforced: 2026-04-26
+last_reinforced: 2026-05-10
 github_commit: pending
-inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
 tech_stack:
-  language: unspecified
-  framework: unspecified
+  language: Python
+  framework: PyTorch / scikit-learn
 ---

-# Boltzmann Machines (볼츠만 머신)
+# Boltzmann Machines

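-## 📌 One-Line Insight (The Karpathy Summary)
-> "Model the distribution of data as a physical energy equilibrium." Inspired by the Boltzmann distribution of statistical mechanics, this is a stochastic recurrent neural network that learns the structure of data by training toward states of low global energy.
+## 📌 One-Line Insight
+> **"Model a data distribution through an energy function."** Inspired by the Boltzmann distribution of statistical mechanics. Helped spark deep learning (Hinton's 2006 RBM pre-training). Today it is the foundation of energy-based models (EBMs), with connections to score matching and diffusion models.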

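-## 📖 Structured Knowledge (Synthesized Content)
- **Extracted pattern:** An energy-based learning pattern that learns complex correlations in data as a probability distribution, and generates from it, through interactions between visible and hidden nodes.
- **Main types:**
-    - **RBM (Restricted Boltzmann Machine):** Restricts connections between nodes in the same layer to make learning more efficient. Contributed to weight initialization (pre-training) in early deep learning.
-    - **Deep Boltzmann Machine (DBM):** Stacks multiple RBM layers to learn more complex features.
- **Learning principle:** Minimize the difference (KL divergence) between the true data distribution and the distribution generated by the model.
+## 📖 Core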

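-## ⚠️ Contradictions & Updates
- **Conflict with past data:** Once a core technique behind the early deep learning revival (DBNs and similar), it has since receded from the mainstream as backpropagation matured and ReLU and related techniques appeared.
- **Policy change:** When researching unsupervised feature-extraction algorithms, the Antigravity project draws on the energy-based modeling philosophy of Boltzmann machines to verify data consistency.
+### History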
+- 1985: Hinton & Sejnowski introduce the Boltzmann Machine.
+- 2002: Hinton introduces Contrastive Divergence (CD) learning.
+- 2006: Hinton, Osindero & Teh publish "A Fast Learning Algorithm for Deep Belief Nets", which helped revive deep learning.
+- 2007-2012: layer-wise pre-training makes deep networks trainable in practice, in the run-up to the 2012 ImageNet breakthrough.
+- 2010s: superseded by backprop + ReLU + dropout.
+- 2020s: revival of energy-based models (Du, LeCun).

-## 🔗 Knowledge Links (Graph)
- Un[[Supervised-Learning-Foundations|Supervised-Learning-Foundations]], Energy-Based-Models, [[Deep-Learning|Deep-Learning]], Statistical-Mechanics
- **Raw Source:** 10_Wiki/Topics/AI/Boltzmann-Machines.md
+### Architecture

-## 🤖 LLM Usage Hints (How to Use This Knowledge)
+#### Vanilla Boltzmann Machine
+- Every unit is connected to every other unit.
+- Units are split into visible and hidden.
+- Hard to train: exact learning is intractable.

-**When to use this knowledge:**
- *(TODO)*
+#### RBM (Restricted)
+- No connections within the same layer.
+- Only visible ↔ hidden connections.
+- Efficient sampling: given one layer, the units of the other layer are conditionally independent.

-**When not to use it:**
- *(TODO)*
+#### DBN (Deep Belief Network)
+- A stack of RBMs.
+- Greedy layer-wise pre-training.

-## 🧪 Validation
+#### DBM (Deep Boltzmann Machine)
+- Connections between all adjacent layers are bidirectional (undirected).
+- Hard to train.

- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 auto-normalization. Body needs verification.)*
+### Energy formulation
+$$E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j$$

-## 🧬 Duplicate Check
+$$P(v, h) = \frac{e^{-E(v, h)}}{Z}$$

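- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (auto-normalization)
- **Reason:** Phase 1 normalization: backfill the old template and missing fields.
+- $Z = \sum_{v,h} e^{-E(v,h)}$ is the partition function; summing over every joint state makes it intractable for all but tiny models.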
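
The energy and probability definitions above can be checked by brute force on a model small enough to enumerate. The following is a minimal sketch, assuming a 3-visible, 2-hidden binary RBM with random parameters (the sizes, seed, and helper names are illustrative choices, not part of the original page). It computes Z exactly and confirms that the conditional P(h_j = 1 | v) obtained from the joint matches the sigmoid closed form implied by the bipartite structure.

```python
import itertools
import numpy as np

def energy(v, h, a, b, W):
    """E(v, h) = -a.v - b.h - v^T W h for binary vectors v and h."""
    return -(a @ v) - (b @ h) - (v @ W @ h)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_v, n_h = 3, 2                      # tiny sizes so Z can be enumerated
a = rng.normal(size=n_v)             # visible biases a_i
b = rng.normal(size=n_h)             # hidden biases b_j
W = rng.normal(size=(n_v, n_h))      # weights W_ij

# Enumerate all 2^(n_v + n_h) joint states to get Z exactly.
states_v = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=n_v)]
states_h = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=n_h)]
Z = sum(np.exp(-energy(v, h, a, b, W)) for v in states_v for h in states_h)

def joint_prob(v, h):
    return np.exp(-energy(v, h, a, b, W)) / Z

# Conditional P(h_0 = 1 | v) from the joint, compared with the closed form
# sigmoid(b_0 + sum_i v_i W_i0).
v = states_v[5]
p_v = sum(joint_prob(v, h) for h in states_h)
p_h0_given_v = sum(joint_prob(v, h) for h in states_h if h[0] == 1) / p_v
closed_form = sigmoid(b[0] + v @ W[:, 0])

total = sum(joint_prob(v_, h_) for v_ in states_v for h_ in states_h)
print(f"Z = {Z:.4f}, total probability = {total:.4f}")
print(f"P(h0=1|v): enumerated {p_h0_given_v:.6f}, closed form {closed_form:.6f}")
```

The enumeration costs 2^(n_v + n_h) energy evaluations, which is why exact Z is only feasible at toy sizes, while the closed-form conditional stays cheap at any size.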

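-## 🕓 Changelog
+### Learning: Contrastive Divergence (CD-k)
+1. Start from a data vector v0.
+2. Sample h0 from P(h | v0).
+3. Run k steps of Gibbs sampling: sample v from P(v | h), then h from P(h | v), ending at (vk, hk). For CD-1 this is v1 from P(v | h0) and h1 from P(h | v1).
+4. Update: ΔW = lr * (v0 h0^T - vk hk^T).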

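-| Date | Change | Handling | Trust |
-|------|-----------|-----------|--------|
-| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |
+### Modern relevance
+- **Energy-Based Model (EBM)**: advocated by LeCun.
+- **Score matching**: learns the gradient of the log-density (the score); the basis of diffusion models.
+- **Diffusion model** (DDPM): closely related to EBMs, since the denoiser can be read as an estimate of the score.
+- **GAN**: can be viewed as an implicit EBM, with the discriminator acting as an energy function.
+- **JEM** (Joint Energy-based Model): reframes a classifier as an EBM.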
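
The link between energies and scores can be made concrete with a one-dimensional example. This is a minimal sketch, assuming a Gaussian energy with illustrative constants `MU` and `SIGMA` (both hypothetical). It shows that the score, the derivative of log p(x), equals the negative energy gradient and never needs Z, and that Langevin dynamics driven by the score alone recovers the distribution.

```python
import numpy as np

MU, SIGMA = 2.0, 1.0  # illustrative 1-D Gaussian, chosen for this sketch

def energy(x):
    """E(x) for an unnormalized Gaussian: p(x) is proportional to exp(-E(x))."""
    return (x - MU) ** 2 / (2 * SIGMA ** 2)

def score(x):
    """Score = d/dx log p(x) = -dE/dx. The partition function Z drops out."""
    return -(x - MU) / SIGMA ** 2

def log_density(x):
    """Normalized log p(x), including log Z, used only to check the score."""
    log_Z = 0.5 * np.log(2 * np.pi * SIGMA ** 2)
    return -energy(x) - log_Z

# Finite-difference check: the gradient of the normalized log-density
# matches the score computed from the energy alone.
x0, eps = 0.7, 1e-5
fd = (log_density(x0 + eps) - log_density(x0 - eps)) / (2 * eps)

# Unadjusted Langevin dynamics driven only by the score.
rng = np.random.default_rng(0)
x = np.zeros(5000)                 # 5000 independent chains, all started at 0
step = 0.01
for _ in range(500):
    x = x + step * score(x) + np.sqrt(2 * step) * rng.normal(size=x.shape)

print(f"score {score(x0):.4f} vs finite difference {fd:.4f}")
print(f"sample mean {x.mean():.2f}, sample std {x.std():.2f}")
```

The noise scale `sqrt(2 * step)` is the standard Langevin choice; samplers that expose step size and noise as separate knobs trade this exactness for easier tuning.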

-## 💻 Code Patterns
+### Modern applications
+- **Anomaly detection**: low energy = normal, high energy = anomalous.
+- **Generative model** (legacy): e.g. collaborative filtering.
+- **Recommender**: RBM-based models from the Netflix Prize era.
+- **Pre-training** (legacy, mostly replaced).
+- **Quantum Boltzmann machine** (quantum computing).

-**Pattern 1:** *(TODO: structural skeleton reflecting this project's conventions)*
+### RBM vs. modern alternatives
+| Aspect | RBM | Modern |
+|---|---|---|
+| Density estimation | weak | Diffusion / Flow |
+| Pre-training | weak | Self-supervised |
+| Generation | OK | GAN / Diffusion |
+| Tractability | hard (intractable Z) | tractable for specific families (e.g. flows) |

-```text
-# TODO
+→ Historical importance outweighs current usage.
+
+## 💻 Patterns
+
+### RBM (scikit-learn)
+```python
+from sklearn.neural_network import BernoulliRBM
+from sklearn.datasets import load_digits
+
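+X = load_digits().data / 16.0  # scale pixel values from [0, 16] to [0, 1]
+
+rbm = BernoulliRBM(n_components=64, learning_rate=0.06, n_iter=20)
+rbm.fit(X)
+
+# Hidden representation: P(h = 1 | v) for each of the 64 hidden units
+hidden = rbm.transform(X[:1])
+print(hidden.shape)  # (1, 64)
+
+# Stochastic reconstruction: one Gibbs step v -> h -> v'
+recon = rbm.gibbs(X[:1])
+print(recon.shape)  # (1, 64)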
 ```

-## 🤔 Decision Criteria
+### RBM (PyTorch from scratch)
+```python
+import torch
+import torch.nn as nn
+import torch.nn.functional as F

-**When to choose option A:**
- *(TODO)*
+class RBM(nn.Module):
+    def __init__(self, n_visible, n_hidden):
+        super().__init__()
+        self.W = nn.Parameter(torch.randn(n_hidden, n_visible) * 0.01)
+        self.v_bias = nn.Parameter(torch.zeros(n_visible))
+        self.h_bias = nn.Parameter(torch.zeros(n_hidden))
+    
+    def sample_h(self, v):
+        p_h = torch.sigmoid(F.linear(v, self.W, self.h_bias))
+        return p_h, torch.bernoulli(p_h)
+    
+    def sample_v(self, h):
+        p_v = torch.sigmoid(F.linear(h, self.W.t(), self.v_bias))
+        return p_v, torch.bernoulli(p_v)
+    
+    def free_energy(self, v):
+        wx_b = F.linear(v, self.W, self.h_bias)
+        return -torch.sum(F.softplus(wx_b), dim=1) - v @ self.v_bias

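-**When to choose option B:**
- *(TODO)*
+@torch.no_grad()
+def cd_k(rbm, v0, k=1, lr=0.01):
+    """Contrastive Divergence: apply one CD-k update to rbm using batch v0."""
+    p_h0, _ = rbm.sample_h(v0)  # positive phase, driven by the data
+    
+    vk = v0
+    for _ in range(k):  # k steps of block Gibbs sampling
+        _, h = rbm.sample_h(vk)
+        _, vk = rbm.sample_v(h)
+    
+    p_hk, _ = rbm.sample_h(vk)  # negative phase, driven by the model sample
+    
+    # CD estimate of the log-likelihood gradient, applied as gradient ascent
+    n = v0.size(0)
+    rbm.W += lr * (p_h0.t() @ v0 - p_hk.t() @ vk) / n
+    rbm.v_bias += lr * (v0 - vk).mean(0)
+    rbm.h_bias += lr * (p_h0 - p_hk).mean(0)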
+```

-**Default:**
-> *(TODO)*
+### Energy-Based Model (modern)
+```python
+import torch
+import torch.nn as nn
+
+class EBM(nn.Module):
+    """Scalar energy E(x) given by an MLP."""
+    def __init__(self, dim):
+        super().__init__()
+        self.net = nn.Sequential(
+            nn.Linear(dim, 256), nn.ReLU(),
+            nn.Linear(256, 256), nn.ReLU(),
+            nn.Linear(256, 1),
+        )
+    
+    def energy(self, x):
+        return self.net(x).squeeze(-1)

-## ❌ Anti-Patterns
+def langevin_sample(ebm, x, n_steps=100, step_size=0.1, noise=0.01):
+    """Draw approximate samples from the EBM with Langevin dynamics."""
+    x = x.detach().requires_grad_()
+    for _ in range(n_steps):
+        e = ebm.energy(x).sum()
+        grad = torch.autograd.grad(e, x)[0]
+        x = x - step_size * grad + noise * torch.randn_like(x)
+        x = x.detach().requires_grad_()
+    return x
+```

- **[Anti-pattern]:** *(TODO: what not to do + why + what to do instead)*
+### Diffusion model (related EBM)
+```python
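+import torch
+import torch.nn.functional as F
+
+# DDPM training-loss sketch: noise x0 at a random timestep, then predict the noise
+def diffusion_train(model, x0, noise_schedule, T=1000):
+    # noise_schedule: tensor of shape (T,) holding alpha_bar_t, on x0's device
+    t = torch.randint(0, T, (x0.size(0),), device=x0.device)
+    noise = torch.randn_like(x0)
+    # reshape to (B, 1, ..., 1) so alpha_bar broadcasts over the data dimensions
+    alpha_bar = noise_schedule[t].view(-1, *([1] * (x0.dim() - 1)))
+    xt = torch.sqrt(alpha_bar) * x0 + torch.sqrt(1 - alpha_bar) * noise
+    
+    pred_noise = model(xt, t)
+    return F.mse_loss(pred_noise, noise)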
+```
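
The `noise_schedule` indexed in the sketch above holds the cumulative products alpha_bar_t. One possible way to build it, shown here as a NumPy sketch, is a linear beta schedule; the function names `linear_alpha_bar` and `q_sample`, the batch shape, and the default endpoints are assumptions made for illustration. To use the result with the PyTorch code, it would need to be converted to a tensor first.

```python
import numpy as np

def linear_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """alpha_bar_t = prod_{s<=t} (1 - beta_s) for a linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    """Forward noising: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    noise = rng.normal(size=x0.shape)
    # Reshape to (batch, 1, ..., 1) so the per-sample factor broadcasts over data dims.
    ab = alpha_bar[t].reshape(-1, *([1] * (x0.ndim - 1)))
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * noise, noise

rng = np.random.default_rng(0)
alpha_bar = linear_alpha_bar()
x0 = rng.normal(size=(4, 8))                 # hypothetical batch: 4 samples, 8 features
t = rng.integers(0, len(alpha_bar), size=4)  # one random timestep per sample
xt, noise = q_sample(x0, t, alpha_bar, rng)
print(alpha_bar[0], alpha_bar[-1])           # near 1 at the first step, near 0 at the last
```

Because alpha_bar decays from almost 1 to almost 0, early timesteps leave the data nearly intact while the final timestep is close to pure Gaussian noise.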
+
+### Anomaly detection (EBM)
+```python
+def is_anomaly(ebm, x, threshold):
+    """High energy = unusual. Returns a boolean tensor, one entry per sample."""
+    return ebm.energy(x).detach() > threshold
+```
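
The `threshold` argument above has to come from somewhere. One simple option, sketched here under the assumption that energies of known-normal samples are available, is to take a high quantile of those energies. The helper names and the synthetic energies below are hypothetical stand-ins for real model outputs.

```python
import numpy as np

def fit_threshold(normal_energies, quantile=0.99):
    """Pick the energy below which `quantile` of known-normal samples fall."""
    return float(np.quantile(normal_energies, quantile))

def flag_anomalies(energies, threshold):
    """Boolean mask: True where the energy exceeds the threshold."""
    return np.asarray(energies) > threshold

rng = np.random.default_rng(0)
# Synthetic stand-in for energies the model assigns to normal data.
normal_energies = rng.normal(loc=0.0, scale=1.0, size=10_000)
threshold = fit_threshold(normal_energies, quantile=0.99)

new_energies = np.array([-0.5, 0.3, 6.0])
mask = flag_anomalies(new_energies, threshold)
print(threshold, mask)
```

The quantile directly sets the expected false-positive rate on normal data (about 1% here), which is usually easier to reason about than a raw energy value.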
+
+## 🤔 Decision Criteria
+| Situation | Approach |
+|---|---|
+| Modern generative | Diffusion / GAN |
+| Anomaly detection | EBM / Autoencoder |
+| Historical study | RBM / DBN |
+| Quantum | Quantum Boltzmann |
+| Pre-training | Self-supervised (BERT, MAE) |
+| Sparse coding | Sparse autoencoder |
+
+**Default**: study Boltzmann machines to understand the history; use modern alternatives in production.
+
+## 🔗 Graph
+- Parents: [[Energy-Based-Models]] · [[Generative-Models]] · [[Statistical-Mechanics]]
+- Variants: [[RBM]] · [[DBN]] · [[DBM]] · [[Helmholtz-Machine]]
+- Applications: [[Diffusion-Model]] · [[Score-Matching]] · [[GAN]] · [[Anomaly-Detection]]
+- Adjacent: [[Hinton]] · [[Contrastive-Divergence]] · [[Auto-Encoding]] · [[Bayesian-Brain-Hypothesis]]
+
+## 🤖 LLM Usage
+**When to use**: deep learning history; understanding EBMs; anomaly detection; the connection to diffusion models.
+**When not to use**: production generative modeling (use diffusion); production pre-training (use SSL).
+
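+## ❌ Anti-Patterns
+- **Expecting production-grade results from an RBM**: outdated for that purpose.
+- **RBM pre-training in the BERT era**: use self-supervised learning (SSL) instead.
+- **Trying to compute Z (the partition function) exactly**: intractable beyond tiny models.
+- **Treating single-step CD (CD-1) as an unbiased gradient**: it is a biased estimator.
+- **Feeding continuous data to a binary RBM**: Bernoulli visible units assume binary (or [0, 1]-valued) inputs; use a Gaussian-visible RBM instead.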
+
+## 🧪 Validation / Duplicates
+- Verified (Hinton 2002 CD, 2006 DBN paper).
+- Trust level: A.
+- Related: [[Diffusion-Model]] · [[Energy-Based-Models]] · [[Auto-Encoding]] · [[Self-Supervised-Learning]].
+
+## 🕓 Changelog
+| Date | Change |
+|---|---|
+| 2026-05-08 | Phase 1 |
+| 2026-05-10 | Manual cleanup: RBM + EBM + diffusion connection + PyTorch / sklearn code |