---
id: wiki-2026-0508-auto-encoding
title: Auto-Encoding
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [autoencoder, AE, VAE, denoising AE, masked autoencoder, MAE, latent space, bottleneck]
duplicate_of: none
source_trust_level: A
confidence_score: 0.93
verification_status: applied
tags: [autoencoder, vae, mae, dimensionality-reduction, anomaly-detection, generative, self-supervised, representation-learning]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: PyTorch / Diffusers / TensorFlow
---

# Auto-Encoding

## 📌 One-line insight

> **"Information diet + restore."** An input is squeezed through a bottleneck (latent) and then reconstructed. The result is an unsupervised representation, a deep, nonlinear counterpart of PCA. It is a backbone of modern generative models (the Stable Diffusion VAE) and self-supervised learning (MAE).

## 📖 Core

### Architecture

- **Encoder**: maps the high-dimensional input to a low-dimensional latent.
- **Bottleneck**: the compressed representation.
- **Decoder**: reconstructs the input from the latent.
- **Loss**: reconstruction error.

### Variants

#### Vanilla AE

- Deterministic encoder.
- Simple MSE loss.
- Representations are usable, but generation is weak because nothing shapes the latent space into something you can sample from.

#### Denoising AE (Vincent 2008)

- Noisy input → clean output.
- Improves robustness.

#### Sparse AE

- Sparsity penalty on the latent activations.
- Tends to give more interpretable features.

#### Variational AE (VAE, Kingma 2013)

- The encoder outputs a distribution (μ, σ) instead of a point.
- Reparameterization trick.
- ELBO = expected reconstruction log-likelihood − KL(q(z|x) ‖ p(z)). Training minimizes the negative ELBO, i.e. reconstruction loss + KL.
- Enables generation by sampling z from the prior.

#### β-VAE (Higgins 2017)

- Weights the KL term by β.
- Encourages disentanglement.

#### Vector Quantized VAE (VQ-VAE)

- Discrete latent (codebook).
- DALL-E 1 uses a closely related discrete VAE (dVAE). Stable Diffusion's released autoencoder is a continuous, KL-regularized VAE rather than a VQ model.

#### Masked Autoencoder (MAE, He 2021)

- Masks 75% of the patches.
- Self-supervised: the only objective is reconstructing the masked patches.
- A strong pretraining method for ViT.

#### Adversarial AE (AAE)

- A GAN-style discriminator pushes the latent distribution toward the prior.

### Applications

1. **Dimensionality reduction**: nonlinear PCA.
2. **Denoising**: image / audio cleanup.
3. **Anomaly detection**: anomalies show high reconstruction error.
4. **Generative model**: VAE → image / molecule.
5. **Pretraining**: MAE → ViT downstream tasks.
6. **Compression**: neural codec.
7. **Recommender system**: user / item embeddings.
8. **Style transfer**: latent manipulation.

### Bottleneck design

- **Linear**: with MSE loss, spans the same subspace as PCA.
- **Nonlinear (deep)**: captures a manifold.
- **Discrete (VQ)**: codebook.
- **Hierarchical** (NVAE, VQ-VAE-2): multi-scale.

### Key modern uses

- **Stable Diffusion**: VAE with 8× downsampling per side (H×W×3 → H/8 × W/8 × 4).
- **DALL-E 1**: dVAE.
- **Whisper**: mel-spectrogram encoder (a supervised encoder-decoder, not trained with a reconstruction loss).
- **MAE**: pretrains ViT-Huge.

## 💻 Patterns

### Vanilla AE (PyTorch)

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),  # outputs in [0, 1]
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

# Training loss (x: batch of flattened inputs scaled to [0, 1])
model = AutoEncoder()
x = torch.rand(16, 784)  # placeholder batch
x_recon, _ = model(x)
loss = ((x_recon - x) ** 2).mean()
```

### VAE

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps keeps sampling differentiable w.r.t. mu and logvar
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.dec(z), mu, logvar

def vae_loss(x, x_recon, mu, logvar, beta=1.0):
    # Negative ELBO, summed over the batch; x must lie in [0, 1] for BCE
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')
    # Closed-form KL(N(mu, sigma^2) || N(0, I))
    kl = -0.5 * torch.sum(1 + logvar - mu**2 - logvar.exp())
    return recon + beta * kl
```

### Denoising AE

```python
import torch

def train_denoising(model, x, noise_std=0.3):
    # Corrupt the input, but compute the loss against the clean target
    x_noisy = x + torch.randn_like(x) * noise_std
    x_recon, _ = model(x_noisy)  # AutoEncoder returns (reconstruction, latent)
    return ((x_recon - x) ** 2).mean()
```

### MAE (vision)

```python
import torch

# Simplified from He et al. 2021: single image, positional embeddings omitted.
# Assumed shapes: patches (N, D), e.g. flattened 16x16 patches;
# encoder: (n, D) -> (n, E); mask_token: (1, E); decoder: (N, E) -> (N, D).
def mae_forward(patches, encoder, decoder, mask_token, mask_ratio=0.75):
    n = patches.shape[0]
    n_visible = int(n * (1 - mask_ratio))
    # Randomly keep 25% of the patches, mask the rest
    perm = torch.randperm(n)
    visible_idx, masked_idx = perm[:n_visible], perm[n_visible:]
    # Encode the visible patches only
    encoded = encoder(patches[visible_idx])
    # Put mask tokens at every position, then restore the encoded visible ones
    full = mask_token.expand(n, -1).clone()
    full[visible_idx] = encoded
    # Reconstruct all patches
    return decoder(full), masked_idx

def mae_loss(reconstructed, patches, masked_idx):
    # Loss is computed on the masked patches only
    return ((reconstructed[masked_idx] - patches[masked_idx]) ** 2).mean()
```

### Anomaly detection

```python
import torch

@torch.no_grad()
def detect_anomaly(model, x, threshold):
    x_recon, _ = model(x)
    # Per-sample MSE: average over every non-batch dimension
    error = ((x_recon - x) ** 2).mean(dim=tuple(range(1, x.dim())))
    return error > threshold

# Train on normal data only, so anomalies show up as high reconstruction error
```

### Stable Diffusion VAE (latent)

```python
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained('runwayml/stable-diffusion-v1-5', subfolder='vae')
scale = 0.18215  # latent scaling factor used by SD v1.x

# image: float tensor of shape (B, 3, 512, 512) with values in [-1, 1]
# Image (512x512x3) → latent (64x64x4): 8× smaller per side
latent = vae.encode(image).latent_dist.sample() * scale

# Latent → image
image_recon = vae.decode(latent / scale).sample
```

### β-VAE (disentangle)

```python
# β > 1: disentanglement ↑, reconstruction quality ↓ (typical β is about 4 to 10)
# recon and kl are the two terms computed in vae_loss above
loss = recon + beta * kl
```

## 🤔 Decision criteria

| Application | Variant |
|---|---|
| Dimensionality reduction | Vanilla AE |
| Denoising | Denoising AE |
| Generation | VAE / VQ-VAE |
| Disentanglement | β-VAE |
| Self-supervised vision | MAE |
| Latent diffusion | VAE (continuous) / VQ-VAE (discrete) |
| Anomaly detection | Vanilla AE + reconstruction error |
| Compression | Neural codec (rate-distortion) |

**Default**: task-specific. Representation: AE. Generative: VAE. Vision pretraining: MAE.
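The anomaly row in the table relies on a `threshold` that `detect_anomaly` takes as given. One common way to pick it, sketched here as an assumption rather than something this note prescribes, is a high quantile of the reconstruction errors measured on held-out normal data. The helper names `calibrate_threshold` and `flag_anomalies` are hypothetical, and plain Python lists stand in for the per-sample error tensor.

```python
import math

def calibrate_threshold(normal_errors, quantile=0.99):
    """Return the nearest-rank quantile of errors from normal-only data.

    Assumption: normal_errors holds per-sample reconstruction errors
    computed on held-out data that contains no anomalies.
    """
    if not normal_errors:
        raise ValueError("need at least one reconstruction error")
    if not 0.0 < quantile <= 1.0:
        raise ValueError("quantile must be in (0, 1]")
    ordered = sorted(normal_errors)
    rank = math.ceil(quantile * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

def flag_anomalies(errors, threshold):
    """Mark a sample as anomalous when its error exceeds the threshold."""
    return [e > threshold for e in errors]

# Usage with made-up error values
held_out = [0.8, 0.1, 0.4, 0.2, 0.3, 0.7, 0.5, 0.6]
thr = calibrate_threshold(held_out, quantile=0.75)
print(thr)                                    # 0.6
print(flag_anomalies([0.05, 0.6, 0.9], thr))  # [False, False, True]
```

The quantile is a trade-off: a higher value flags fewer normal samples by mistake but lets milder anomalies through.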
## 🔗 Graph

- Parent: [[Generative-AI|Generative-Models]]
- Variants: [[VAE]] · [[β-VAE]] · [[MAE]] · [[Denoising-AE]]
- Applications: [[Anomaly-Detection]] · [[Stable-Diffusion]] · [[DALL-E]]
- Adjacent: [[PCA]] · [[Generative-Adversarial-Networks|GAN]] · [[Diffusion-Model]] · [[Latent-Space]]

## 🤖 LLM usage

**When**: representation learning, anomaly detection, generative latents, vision pretraining.

**When not**: supervised learning is already sufficient; highly structured data (e.g. graphs, where a GNN fits better).

## ❌ Anti-patterns

- **Identity map** (no bottleneck): useless, the network just copies its input.
- **Posterior collapse in a VAE**: the KL term is too strong, so the decoder ignores the latent.
- **β too high in a β-VAE**: destroys reconstruction quality.
- **Low mask ratio in MAE**: the task becomes trivial.
- **Training an anomaly detector on mixed data**: anomalies in the training set are learned as normal and reconstruct well.
- **Latent dimension too large**: the bottleneck stops constraining the model, which overfits.

## 🧪 Verification / Duplicates

- Verified (Hinton AE, Kingma VAE, He MAE, Stable Diffusion).
- Trust level A.
- Related: [[VAE]] · [[MAE]] · [[Stable-Diffusion]] · [[Anomaly-Detection]] · [[Self-Supervised-Learning]].

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: variants + PyTorch code (AE, VAE, MAE, anomaly, SD VAE) |