---
id: wiki-2026-0508-auto-encoding
title: Auto-Encoding
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [autoencoder, AE, VAE, denoising AE, masked autoencoder, MAE, latent space, bottleneck]
duplicate_of: none
source_trust_level: A
confidence_score: 0.93
verification_status: applied
tags: [autoencoder, vae, mae, dimensionality-reduction, anomaly-detection, generative, self-supervised, representation-learning]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: PyTorch / Diffusers / TensorFlow
---
# Auto-Encoding
## 📌 One-line insight
> **"Information diet + restore"**. Input → bottleneck (latent) → reconstruction of the input. Unsupervised representation learning. A deep version of PCA. The backbone of modern generative (Stable Diffusion VAE) and self-supervised (MAE) pipelines.
## 📖 Core
### Architecture
- **Encoder**: high-dim input → low-dim latent.
- **Bottleneck**: the compressed representation.
- **Decoder**: latent → reconstruction of the input.
- Loss: reconstruction error.
### Variants
#### Vanilla AE
- Deterministic encoder.
- Simple MSE loss.
- Fine for representation learning, weak for generation.
#### Denoising AE (Vincent 2008)
- Noisy input → clean output.
- Improves robustness.
#### Sparse AE
- Sparsity penalty on the latent activations.
- Yields more interpretable features.
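
The sparsity penalty above can be sketched as follows. This is a minimal illustration that assumes an L1 penalty on the latent activations (one common choice; a KL-based sparsity target is another). The name `sparse_ae_loss` and the weight `l1_weight` are hypothetical, not taken from the source.

```python
import torch

def sparse_ae_loss(x, x_recon, z, l1_weight=1e-3):
    """Reconstruction MSE plus an L1 penalty on the latent activations z."""
    recon = ((x_recon - x) ** 2).mean()
    sparsity = z.abs().mean()  # pushes most activations toward zero
    return recon + l1_weight * sparsity

# Toy check: perfect reconstruction, so only the penalty term remains
x = torch.zeros(4, 8)
z = torch.ones(4, 3)
loss = sparse_ae_loss(x, x, z, l1_weight=0.5)
```

A larger `l1_weight` trades reconstruction quality for sparser codes, so it usually needs tuning per dataset.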
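#### Variational AE (VAE, Kingma 2013)
- Encoder outputs a distribution (μ, σ) instead of a point.
- Reparameterization trick: z = μ + σ·ε with ε ~ N(0, I), so sampling stays differentiable.
- ELBO = E_q[log p(x|z)] - KL(q(z|x) || p(z)); the training loss is the negative ELBO, i.e. reconstruction loss + KL.
- Enables generation: sample z from the prior and decode.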
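
For a diagonal Gaussian posterior and a standard normal prior, the KL term has a closed form. The sketch below writes it in plain Python to make the arithmetic visible; the function name `kl_to_standard_normal` is illustrative, and the expression is the same one used for `kl` in the VAE pattern further down.

```python
import math

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over dimensions."""
    return -0.5 * sum(1 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))

kl_zero = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])  # posterior equals the prior
kl_shift = kl_to_standard_normal([1.0], [0.0])           # mean shifted by 1, unit variance
```

The KL is zero only when the posterior matches the prior, which is why an overly strong KL weight pulls every input toward the same latent distribution.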
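#### β-VAE (Higgins 2017)
- Weight β on the KL term.
- β > 1 encourages disentanglement.
#### Vector Quantized VAE (VQ-VAE)
- Discrete latent (codebook).
- Related discrete latents appear in DALL-E 1 (dVAE). Stable Diffusion itself uses a continuous KL-regularized VAE; VQ-regularized autoencoders are the discrete alternative for latent diffusion.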
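
The codebook lookup can be sketched as a nearest-neighbor search plus a straight-through gradient. This is a minimal sketch under stated assumptions: the function name `vector_quantize` is hypothetical, and the codebook and commitment losses that a full VQ-VAE adds to training are left out.

```python
import torch

def vector_quantize(z, codebook):
    """Map each row of z (N, D) to its nearest codebook row (K, D)."""
    dists = torch.cdist(z, codebook)   # (N, K) pairwise Euclidean distances
    idx = dists.argmin(dim=1)          # one discrete code per latent vector
    z_q = codebook[idx]                # quantized latents
    # Straight-through estimator: forward pass uses z_q, backward pass sends gradients to z
    z_st = z + (z_q - z).detach()
    return z_st, idx

codebook = torch.tensor([[0.0, 0.0], [1.0, 1.0]])
z = torch.tensor([[0.1, -0.2], [0.9, 1.2]], requires_grad=True)
z_q, idx = vector_quantize(z, codebook)
```

The `detach()` is what lets the encoder train despite the non-differentiable `argmin`.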
#### Masked Autoencoder (MAE, He 2021)
- Masks 75% of the image patches.
- Self-supervised through reconstruction alone.
- A strong pretraining recipe for ViT.
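
To make the 75% mask ratio concrete, the sketch below counts patches for one image. The 224-pixel input and 16-pixel patch size are assumptions (common ViT settings), and `mae_patch_counts` is an illustrative helper rather than part of any library.

```python
def mae_patch_counts(image_size=224, patch_size=16, mask_ratio=0.75):
    """Return (total, visible, masked) patch counts for a square image."""
    per_side = image_size // patch_size
    total = per_side * per_side
    visible = int(total * (1 - mask_ratio))
    return total, visible, total - visible

counts = mae_patch_counts()  # (196, 49, 147)
```

Under these assumptions the encoder processes only 49 of 196 patches, which is where much of the pretraining speed-up comes from.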
#### Adversarial AE (AAE)
- A GAN-style discriminator enforces the latent prior.
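
A rough sketch of the adversarial latent regularization, assuming a standard normal prior and a small MLP discriminator. The names `disc` and `aae_latent_losses` and the layer sizes are hypothetical; the reconstruction loss and the alternating optimizer steps are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

latent_dim = 8
# Discriminator separates prior samples ("real") from encoder outputs ("fake")
disc = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 1))

def aae_latent_losses(z_encoded):
    z_prior = torch.randn_like(z_encoded)       # samples from N(0, I)
    real_logits = disc(z_prior)
    fake_logits = disc(z_encoded.detach())      # no gradient into the encoder here
    d_loss = (
        F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
        + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    )
    # Encoder step: make encoded latents look like prior samples
    gen_logits = disc(z_encoded)
    g_loss = F.binary_cross_entropy_with_logits(gen_logits, torch.ones_like(gen_logits))
    return d_loss, g_loss

z_enc = torch.randn(5, latent_dim, requires_grad=True)  # stand-in for encoder output
d_loss, g_loss = aae_latent_losses(z_enc)
```

In training, `d_loss` would update only the discriminator and `g_loss` only the encoder, alongside the usual reconstruction loss.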
### Applications
1. **Dimensionality reduction**: a nonlinear generalization of PCA.
2. **Denoising**: image / audio cleanup.
3. **Anomaly detection**: high reconstruction error flags anomalies.
4. **Generative model**: VAE → images / molecules.
5. **Pretraining**: MAE → ViT downstream tasks.
6. **Compression**: neural codecs.
7. **Recommender system**: user / item embeddings.
8. **Style transfer**: latent manipulation.
### Bottleneck design
- **Linear**: PCA-equivalent.
- **Nonlinear (deep)**: captures the data manifold.
- **Discrete (VQ)**: codebook.
- **Hierarchical** (NVAE, VQ-VAE-2): multi-scale latents.
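### Key modern uses
- **Stable Diffusion**: the VAE compresses 8× per spatial side (H×W×3 → H/8 × W/8 × 4).
- **DALL-E 1**: dVAE.
- **Whisper**: log-mel spectrogram encoder (part of a supervised encoder-decoder model, not a reconstruction autoencoder).
- **MAE**: pretrains ViT-Huge.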
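
The Stable Diffusion shape arithmetic can be checked directly. This is a small sketch using the 8× factor and 4 latent channels stated above, applied to a 512×512 RGB image; `sd_latent_shape` is an illustrative helper, not a Diffusers function.

```python
def sd_latent_shape(height, width, factor=8, latent_channels=4):
    """Latent (H, W, C) for an image of the given size."""
    return height // factor, width // factor, latent_channels

h, w, c = sd_latent_shape(512, 512)   # (64, 64, 4)
pixels = 512 * 512 * 3                # values in the RGB image
latents = h * w * c                   # values in the latent
ratio = pixels / latents              # 48.0
```

So "8×" refers to each spatial side; counted in raw values, the latent is 48 times smaller than the image.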
## 💻 Patterns
### Vanilla AE (PyTorch)
```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: high-dim input -> low-dim latent
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        # Decoder: latent -> reconstruction in [0, 1]
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

# Training loss: mean squared reconstruction error
model = AutoEncoder()
x = torch.rand(16, 784)  # placeholder batch of flattened inputs in [0, 1]
x_recon, _ = model(x)
loss = ((x_recon - x) ** 2).mean()
```
### VAE
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps keeps sampling differentiable w.r.t. mu and logvar
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.dec(z), mu, logvar

def vae_loss(x, x_recon, mu, logvar, beta=1.0):
    # Negative ELBO: reconstruction term + beta * KL(q(z|x) || N(0, I))
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu**2 - logvar.exp())
    return recon + beta * kl
```
### Denoising AE
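```python
import torch

def train_denoising(model, x):
    # Corrupt the input with Gaussian noise, then reconstruct the clean input
    noise = torch.randn_like(x) * 0.3
    x_noisy = x + noise
    x_recon, _ = model(x_noisy)  # AutoEncoder above returns (reconstruction, latent)
    return ((x_recon - x) ** 2).mean()
```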
### MAE (vision)
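```python
import torch

# Simplified from He et al. 2021 (single image, no positional embeddings).
def image_to_patches(image, patch_size=16):
    # Illustrative helper: (C, H, W) -> (num_patches, C * patch_size * patch_size)
    c = image.shape[0]
    p = image.unfold(1, patch_size, patch_size).unfold(2, patch_size, patch_size)
    return p.permute(1, 2, 0, 3, 4).reshape(-1, c * patch_size * patch_size)

def mae_forward(image, encoder, decoder, mask_token, mask_ratio=0.75):
    # Split the image into patches
    patches = image_to_patches(image, patch_size=16)
    n = len(patches)
    # Keep 25% visible, mask 75%
    n_visible = int(n * (1 - mask_ratio))
    perm = torch.randperm(n)
    visible_idx, masked_idx = perm[:n_visible], perm[n_visible:]
    # Encode visible patches only
    encoded = encoder(patches[visible_idx])
    # Fill every position with the mask token (assumed shape (1, D)), then place encoded tokens
    full = mask_token.expand(n, -1).clone()
    full[visible_idx] = encoded
    # Reconstruct all patches; the loss covers masked patches only
    reconstructed = decoder(full)
    loss = ((reconstructed[masked_idx] - patches[masked_idx]) ** 2).mean()
    return reconstructed, loss
```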
### Anomaly detection
```python
def detect_anomaly(model, x, threshold):
    # Per-sample reconstruction error, averaged over all non-batch dimensions
    x_recon, _ = model(x)
    error = ((x_recon - x) ** 2).mean(dim=tuple(range(1, x.dim())))
    return error > threshold

# Train on normal data only: anomalies then show high reconstruction error
```
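
The `threshold` argument has to come from somewhere. One simple option, sketched here as an assumption rather than the source's method, is to take a high quantile of reconstruction errors measured on held-out normal data. The helper name and the sample errors are hypothetical.

```python
def threshold_from_normal_errors(errors, quantile=0.99):
    """Pick the reconstruction error at the given quantile of held-out normal data."""
    ordered = sorted(errors)
    idx = min(int(quantile * len(ordered)), len(ordered) - 1)
    return ordered[idx]

# Hypothetical per-sample errors from normal validation data
normal_errors = [0.01 * i for i in range(1, 101)]
threshold = threshold_from_normal_errors(normal_errors, quantile=0.95)
```

The quantile sets the expected false-positive rate on normal data, so it should be chosen against the cost of a false alarm.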
### Stable Diffusion VAE (latent)
```python
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained('runwayml/stable-diffusion-v1-5', subfolder='vae')
# image: float tensor of shape (B, 3, 512, 512), scaled to [-1, 1]
# Image (512x512x3) -> latent (64x64x4): 8x smaller per spatial side
latent = vae.encode(image).latent_dist.sample() * 0.18215
# Latent -> image
image_recon = vae.decode(latent / 0.18215).sample
```
### β-VAE (disentangle)
```python
# β > 1: disentanglement ↑, reconstruction ↓
loss = recon + beta * kl  # β = 4 to 10
```
## 🤔 Decision criteria
| Application | Variant |
|---|---|
| Dimensionality reduction | Vanilla AE |
| Denoising | Denoising AE |
| Generation | VAE / VQ-VAE |
| Disentanglement | β-VAE |
| Self-supervised vision | MAE |
| Latent diffusion | VAE (continuous) / VQ-VAE (discrete) |
| Anomaly detection | Vanilla AE + reconstruction error |
| Compression | Neural codec (rate-distortion) |
**Default**: Task-specific. Representation = AE. Generative = VAE. Vision pretraining = MAE.
## 🔗 Graph
- Parent: [[Generative-AI|Generative-Models]]
- Variants: [[VAE]] · [[β-VAE]] · [[MAE]] · [[Denoising-AE]]
- Applications: [[Anomaly-Detection]] · [[Stable-Diffusion]] · [[DALL-E]]
- Adjacent: [[PCA]] · [[Generative-Adversarial-Networks|GAN]] · [[Diffusion-Model]] · [[Latent-Space]]
## 🤖 LLM usage
**When**: Representation learning. Anomaly detection. Generative latents. Vision pretraining.
**When not**: Supervised learning is sufficient. Highly structured data (e.g. graphs, where a GNN fits better).
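## ❌ Anti-patterns
- **Identity map** (no bottleneck): useless, the network just copies the input.
- **VAE posterior collapse**: the KL term is too strong, so the decoder ignores the latent.
- **β-VAE with too high β**: destroys reconstruction quality.
- **MAE with a low mask ratio**: the task becomes trivial.
- **Anomaly detection trained on mixed data**: anomalies are included in training, so the model learns to reconstruct them.
- **Latent dim too large**: overfits.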
## 🧪 Verification / duplicates
- Verified (Hinton AE, Kingma VAE, He MAE, Stable Diffusion).
- Trust level A.
- Related: [[VAE]] · [[MAE]] · [[Stable-Diffusion]] · [[Anomaly-Detection]] · [[Self-Supervised-Learning]].
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: variants + PyTorch code (AE, VAE, MAE, anomaly, SD VAE) |