---
id: wiki-2026-0508-auto-encoding
title: Auto-Encoding
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [autoencoder, AE, VAE, denoising AE, masked autoencoder, MAE, latent space, bottleneck]
duplicate_of: none
source_trust_level: A
confidence_score: 0.93
verification_status: applied
tags: [autoencoder, vae, mae, dimensionality-reduction, anomaly-detection, generative, self-supervised, representation-learning]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: PyTorch / Diffusers / TensorFlow
---
# Auto-Encoding
## 📌 One-Line Insight
> **"Information diet + restore"**. Input → bottleneck (latent) → reconstruction of the input. An unsupervised way to learn representations, and a deep version of PCA. Backbone of modern generative models (the Stable Diffusion VAE) and self-supervised learning (MAE).
## 📖 Core
### Architecture
- **Encoder**: maps the high-dimensional input to a low-dimensional latent.
- **Bottleneck**: the compressed representation.
- **Decoder**: reconstructs the input from the latent.
- Loss: reconstruction error.
### Variants
#### Vanilla AE
- Deterministic encoder.
- Plain MSE loss.
- Fine for representation learning, weak for generation.
#### Denoising AE (Vincent 2008)
- Noisy input → clean output.
- Improves robustness.
#### Sparse AE
- Sparsity penalty on the latent activations.
- Tends to give more interpretable features.
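
The sparsity penalty can be sketched as an extra loss term. This is a minimal sketch assuming an L1 penalty on the latent activations (a KL-based sparsity target is another common choice); `sparse_ae_loss` and `l1_weight` are illustrative names, not from the source.

```python
import torch


def sparse_ae_loss(x, x_recon, z, l1_weight=1e-3):
    """Reconstruction MSE plus an L1 penalty on the latent activations z."""
    recon = ((x_recon - x) ** 2).mean()
    sparsity = z.abs().mean()  # pushes most latent units toward zero
    return recon + l1_weight * sparsity


# Usage with placeholder tensors (perfect reconstruction, so only the penalty remains)
x = torch.ones(4, 8)
z = torch.tensor([[0.0, 2.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
loss = sparse_ae_loss(x, x.clone(), z, l1_weight=0.5)
```

The weight trades reconstruction quality against sparsity, so it usually needs tuning per dataset.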
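#### Variational AE (VAE, Kingma 2013)
- Encoder outputs a distribution (μ, σ) rather than a single point.
- Reparameterization trick makes the sampling step differentiable.
- ELBO = reconstruction term − KL(q || prior); the training loss is the negative ELBO, i.e. reconstruction error + KL.
- Enables generation: sample the latent from the prior and decode.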
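
Written out, the objective looks as follows. The notation is assumed here, since the source only names μ, σ and the prior: a diagonal-Gaussian encoder $q_\phi(z \mid x) = \mathcal{N}(\mu, \operatorname{diag}(\sigma^2))$, a decoder $p_\theta(x \mid z)$, and a standard-normal prior $p(z) = \mathcal{N}(0, I)$ over a $d$-dimensional latent.

$$
\mathrm{ELBO}(x) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
$$

$$
\mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big) = -\tfrac{1}{2} \sum_{j=1}^{d} \big(1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2\big),
\qquad z = \mu + \sigma \odot \epsilon,\quad \epsilon \sim \mathcal{N}(0, I)
$$

Training minimizes the negative ELBO, so the KL term enters the loss with a plus sign.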
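#### β-VAE (Higgins 2017)
- Weight β on the KL term.
- Encourages disentanglement.
#### Vector Quantized VAE (VQ-VAE)
- Discrete latent (codebook).
- Related discrete latent: DALL-E 1 (dVAE). Stable Diffusion's own autoencoder is a continuous, KL-regularized VAE rather than a VQ-VAE.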
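
The codebook lookup can be sketched as a nearest-neighbor search plus a straight-through gradient. This is a minimal sketch only: the codebook and commitment losses used in training are omitted, and `vector_quantize` is an illustrative name.

```python
import torch


def vector_quantize(z, codebook):
    """Map each latent vector to its nearest codebook entry.

    z: (N, D) encoder outputs, codebook: (K, D) embedding vectors.
    Returns the quantized latents (N, D) and the chosen indices (N,).
    """
    dists = torch.cdist(z, codebook)   # (N, K) Euclidean distances
    idx = dists.argmin(dim=1)          # nearest code per latent
    z_q = codebook[idx]
    # Straight-through estimator: forward uses z_q, backward copies gradients to z
    z_q_st = z + (z_q - z).detach()
    return z_q_st, idx


codebook = torch.tensor([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
z = torch.tensor([[0.9, 1.2], [-0.8, 0.7], [0.1, -0.2]], requires_grad=True)
z_q, idx = vector_quantize(z, codebook)
```

The `detach` is what lets the encoder train even though `argmin` itself has no gradient.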
#### Masked Autoencoder (MAE, He 2021)
- Masks 75% of the patches.
- Self-supervised: the only objective is reconstructing the masked patches.
- Strong pretraining method for ViT.
#### Adversarial AE (AAE)
- Uses a GAN-style discriminator to enforce the prior on the latent.
### Applications
1. **Dimensionality reduction**: nonlinear counterpart of PCA.
2. **Denoising**: image / audio cleanup.
3. **Anomaly detection**: anomalies show high reconstruction error.
4. **Generative model**: VAE → image / molecule.
5. **Pretraining**: MAE → ViT downstream tasks.
6. **Compression**: neural codec.
7. **Recommender system**: user / item embeddings.
8. **Style transfer**: latent manipulation.
### Bottleneck design
- **Linear**: PCA-equivalent.
- **Nonlinear (deep)**: captures a manifold.
- **Discrete (VQ)**: codebook.
- **Hierarchical** (NVAE, VQ-VAE-2): multi-scale.
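
The "linear = PCA-equivalent" point can be checked numerically. This is a minimal sketch assuming centered data, MSE loss, and tied encoder/decoder weights: using the top-k principal directions as the linear encoder gives a reconstruction error equal to the energy in the discarded singular values, which is the best any rank-k linear map can reach.

```python
import torch

torch.manual_seed(0)
X = torch.randn(200, 10, dtype=torch.float64)
X = X - X.mean(dim=0)                      # PCA assumes centered data

k = 3
U, S, Vh = torch.linalg.svd(X, full_matrices=False)
W = Vh[:k].T                               # (10, k): top-k principal directions

Z = X @ W                                  # linear "encoder": project to k dims
X_recon = Z @ W.T                          # linear "decoder": map back to 10 dims

recon_error = ((X - X_recon) ** 2).sum()
discarded_energy = (S[k:] ** 2).sum()      # squared singular values not kept
```

A trained linear autoencoder generally recovers the same subspace but not necessarily the same orthogonal basis.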
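### Key modern uses
- **Stable Diffusion**: VAE with 8× compression per spatial side (H × W × 3 → H/8 × W/8 × 4).
- **DALL-E 1**: dVAE.
- **Whisper**: encoder over log-mel spectrograms (an encoder-decoder model rather than a strict autoencoder).
- **MAE**: pretraining for ViT-Huge.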
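
The 8× figure for Stable Diffusion is per spatial side, so the reduction in raw element count is larger. A small arithmetic sketch makes this concrete; the helper names `sd_latent_shape` and `element_ratio` are illustrative, and the factor 8 and 4 latent channels are taken from the bullet above.

```python
def sd_latent_shape(height, width, downsample=8, latent_channels=4):
    """Latent (H, W, C) for an image, following the H/8 x W/8 x 4 layout."""
    if height % downsample or width % downsample:
        raise ValueError("height and width must be multiples of the downsample factor")
    return height // downsample, width // downsample, latent_channels


def element_ratio(height, width, image_channels=3):
    """How many image values map to one latent value."""
    h, w, c = sd_latent_shape(height, width)
    return (height * width * image_channels) / (h * w * c)


shape = sd_latent_shape(512, 512)   # (64, 64, 4)
ratio = element_ratio(512, 512)     # 48.0
```

For a 512 × 512 RGB image this is 786,432 values compressed to 16,384, which is why diffusion in latent space is so much cheaper than in pixel space.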
## 💻 Patterns
### Vanilla AE (PyTorch)
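```python
import torch
import torch.nn as nn


class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: input_dim → 256 → 64 → latent_dim
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        # Decoder mirrors the encoder; Sigmoid keeps outputs in [0, 1]
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z


# Train: MSE between the reconstruction and the input
model = AutoEncoder()
x = torch.rand(16, 784)  # placeholder batch with values in [0, 1]
x_recon, _ = model(x)
loss = ((x_recon - x) ** 2).mean()
```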
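
The reconstruction loss can be wrapped in an ordinary optimization loop. This is a minimal sketch with assumed choices that are not from the source: Adam with `lr=1e-2`, 200 steps, random data as a placeholder batch, and a small `nn.Sequential` stand-in that returns only the reconstruction (the `AutoEncoder` class returns a `(reconstruction, latent)` tuple instead).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small stand-in model; swap in a real autoencoder for actual use
model = nn.Sequential(
    nn.Linear(20, 8), nn.ReLU(),   # encoder
    nn.Linear(8, 20),              # decoder
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
x = torch.randn(64, 20)            # placeholder batch

losses = []
for step in range(200):
    x_recon = model(x)
    loss = ((x_recon - x) ** 2).mean()   # reconstruction MSE
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

In practice the batch would come from a `DataLoader`, and the loss would be tracked on held-out data as well.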
### VAE
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class VAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.fc_mu = nn.Linear(256, latent_dim)      # mean of q(z|x)
        self.fc_logvar = nn.Linear(256, latent_dim)  # log-variance of q(z|x)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps keeps sampling differentiable w.r.t. mu and logvar
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.dec(z), mu, logvar


def vae_loss(x, x_recon, mu, logvar, beta=1.0):
    # Reconstruction term: BCE summed over pixels and batch (inputs must lie in [0, 1])
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')
    # Closed-form KL(N(mu, sigma^2) || N(0, I)), summed over latent dims and batch
    kl = -0.5 * torch.sum(1 + logvar - mu**2 - logvar.exp())
    return recon + beta * kl
```
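
The closed-form KL expression in `vae_loss` can be sanity-checked against `torch.distributions`. This is a small verification sketch with random placeholder values for `mu` and `logvar`, assuming a standard-normal prior.

```python
import torch
from torch.distributions import Normal, kl_divergence

torch.manual_seed(0)
mu = torch.randn(5, 3)
logvar = torch.randn(5, 3)

# Closed form, as used in vae_loss
kl_closed = -0.5 * torch.sum(1 + logvar - mu**2 - logvar.exp())

# Same quantity from torch.distributions, summed over batch and latent dims
q = Normal(mu, torch.exp(0.5 * logvar))
p = Normal(torch.zeros_like(mu), torch.ones_like(mu))
kl_reference = kl_divergence(q, p).sum()
```

The two values should agree up to floating-point error, which is a cheap guard against sign or factor-of-two mistakes in the KL term.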
### Denoising AE
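```python
import torch


def train_denoising(model, x):
    # Corrupt the input, but score the reconstruction against the clean x
    noise = torch.randn_like(x) * 0.3
    x_noisy = x + noise
    x_recon, _ = model(x_noisy)  # model returns (reconstruction, latent), like AutoEncoder
    return ((x_recon - x) ** 2).mean()
```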
### MAE (vision)
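```python
import torch

# Simplified from He et al. 2021: single image (C, H, W), no batch dim or positional embeddings.
# The two helpers are minimal stand-ins so the sketch runs.
def image_to_patches(image, patch_size=16):
    c, h, w = image.shape
    p = patch_size
    x = image.reshape(c, h // p, p, w // p, p)
    return x.permute(1, 3, 0, 2, 4).reshape(-1, c * p * p)  # (num_patches, C*p*p)

def insert_mask_tokens(encoded, visible_idx, total):
    # Zeros stand in for the learned mask token of the real model
    full = encoded.new_zeros(total, encoded.shape[-1])
    full[visible_idx] = encoded
    return full

def mae_forward(image, encoder, decoder, mask_ratio=0.75):
    # Split the image into patches
    patches = image_to_patches(image, patch_size=16)
    # Mask 75%: keep a random 25% visible
    n_visible = int(len(patches) * (1 - mask_ratio))
    perm = torch.randperm(len(patches))
    visible_idx, masked_idx = perm[:n_visible], perm[n_visible:]
    # Encode the visible patches only
    encoded = encoder(patches[visible_idx])
    # Add mask tokens at the masked positions
    full = insert_mask_tokens(encoded, visible_idx, total=len(patches))
    # Reconstruct every patch
    return decoder(full), patches, masked_idx

def mae_loss(reconstructed, patches, masked_idx):
    # Loss on the masked patches only
    return ((reconstructed[masked_idx] - patches[masked_idx]) ** 2).mean()
```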
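
For batched inputs, a commonly used way to pick the visible patches is to sort per-sample random noise. This is a sketch of that pattern; the function name `random_masking`, the shapes, and the return values are assumptions for illustration rather than a reference implementation.

```python
import torch


def random_masking(patches, mask_ratio=0.75):
    """Per-sample random masking via argsort of uniform noise.

    patches: (B, N, D). Returns visible patches (B, N_keep, D),
    a binary mask (B, N) with 1 = masked, and ids_restore (B, N)
    that undoes the shuffle.
    """
    B, N, D = patches.shape
    n_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N)
    ids_shuffle = noise.argsort(dim=1)         # random permutation per sample
    ids_restore = ids_shuffle.argsort(dim=1)   # inverse permutation
    ids_keep = ids_shuffle[:, :n_keep]
    visible = torch.gather(patches, 1, ids_keep.unsqueeze(-1).expand(-1, -1, D))
    mask = torch.ones(B, N)
    mask[:, :n_keep] = 0
    mask = torch.gather(mask, 1, ids_restore)  # back to original patch order
    return visible, mask, ids_restore


patches = torch.randn(2, 196, 768)   # 224x224 image, 16x16 patches -> 14 * 14 = 196
visible, mask, ids_restore = random_masking(patches)
```

With a 0.75 ratio only 49 of the 196 patches reach the encoder, which is where most of the pretraining speedup comes from.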
### Anomaly detection
```python
def detect_anomaly(model, x, threshold):
    x_recon, _ = model(x)
    # Per-sample MSE: average over every dimension except the batch dimension
    error = ((x_recon - x) ** 2).mean(dim=tuple(range(1, x.dim())))
    return error > threshold

# Train on normal data only → anomalies show up as high reconstruction error
```
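
The `threshold` argument has to come from somewhere. One simple option, assumed here rather than taken from the source, is a high quantile of the reconstruction errors on held-out normal data; `fit_threshold` and the 0.99 quantile are illustrative choices.

```python
import torch


def fit_threshold(normal_errors, quantile=0.99):
    """Pick the threshold as a high quantile of errors on held-out normal data."""
    return torch.quantile(normal_errors, quantile).item()


# Placeholder per-sample reconstruction errors from normal validation data
normal_errors = torch.linspace(0.0, 1.0, steps=101)
threshold = fit_threshold(normal_errors, quantile=0.99)

new_errors = torch.tensor([0.20, 0.995, 1.50])
flags = new_errors > threshold
```

The quantile directly sets the expected false-positive rate on normal data, so it should be chosen with the cost of false alarms in mind.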
### Stable Diffusion VAE (latent)
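```python
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained('runwayml/stable-diffusion-v1-5', subfolder='vae')
# Placeholder input: batch of shape (1, 3, 512, 512), values scaled to [-1, 1]
image = torch.rand(1, 3, 512, 512) * 2 - 1
with torch.no_grad():
    # Image (512x512x3) → latent (64x64x4): 8× smaller per side; 0.18215 is the SD v1 latent scaling factor
    latent = vae.encode(image).latent_dist.sample() * 0.18215
    # Latent → image
    image_recon = vae.decode(latent / 0.18215).sample
```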
### β-VAE (disentangle)
```python
# β > 1 → more disentanglement, worse reconstruction
# recon and kl are the two terms computed in vae_loss
loss = recon + beta * kl  # typical β: 4 to 10
```
## 🤔 Decision Criteria
| Application | Variant |
|---|---|
| Dimensionality reduce | Vanilla AE |
| Denoising | Denoising AE |
| Generation | VAE / VQ-VAE |
| Disentanglement | β-VAE |
| Self-supervised vision | MAE |
| Latent diffusion | VAE (continuous) / VQ-VAE (discrete) |
| Anomaly | Vanilla AE + reconstruction error |
| Compression | Neural codec (rate-distortion) |
**Default**: task-specific. Representation → AE. Generation → VAE. Vision pretraining → MAE.
## 🔗 Graph
- Parents: [[Unsupervised-Learning]] · [[Representation-Learning]] · [[Generative-Models]]
- Variants: [[VAE]] · [[VQ-VAE]] · [[β-VAE]] · [[MAE]] · [[Denoising-AE]] · [[Sparse-AE]]
- Applications: [[Anomaly-Detection]] · [[Stable-Diffusion]] · [[DALL-E]] · [[Self-Supervised-Learning]]
- Adjacent: [[PCA]] · [[GAN]] · [[Diffusion-Model]] · [[Latent-Space]]
## 🤖 LLM Usage
**When**: representation learning, anomaly detection, generative latents, vision pretraining.
**When not**: supervised learning is sufficient; highly structured data (e.g. graphs, where a GNN fits better).
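## ❌ Anti-patterns
- **Identity map** (no bottleneck): learns nothing useful.
- **VAE posterior collapse**: KL term too strong, so the decoder ignores the latent.
- **β too high in β-VAE**: destroys reconstruction quality.
- **Low mask ratio in MAE**: the task becomes trivial.
- **Training an anomaly detector on mixed data**: anomalies end up in the training set, so the model learns to reconstruct them too.
- **Latent dim too large**: overfits.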
## 🧪 Verification / Duplicates
- Verified (Hinton AE, Kingma VAE, He MAE, Stable Diffusion).
- Trust level: A.
- Related: [[VAE]] · [[MAE]] · [[Stable-Diffusion]] · [[Anomaly-Detection]] · [[Self-Supervised-Learning]].
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: variants + PyTorch code (AE, VAE, MAE, anomaly, SD VAE) |