---
id: wiki-2026-0508-generative-adversarial-networks
title: Generative Adversarial Networks (GAN)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [GAN, generative adversarial networks, StyleGAN, CycleGAN, Goodfellow, Wasserstein GAN]
duplicate_of: none
source_trust_level: A
confidence_score: 0.97
verification_status: applied
tags: [deep-learning, gan, generative, goodfellow, stylegan, image-generation]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: PyTorch / TensorFlow
---

# Generative Adversarial Networks (GAN)

## In One Line

> **"The generator produces fakes and the discriminator tries to detect them: a minimax game."** Goodfellow 2014. GANs dominated image generation from 2014 to 2022 and were then displaced by diffusion models. Modern use: StyleGAN3, CycleGAN, GAN inversion.

## Core

### Model

- **Generator G**: maps noise to a fake sample.
- **Discriminator D**: classifies a sample as real or fake.
- **Loss**: G is trained to fool D; D is trained to catch G's fakes.

### Famous variants

- **DCGAN** (Radford 2015): convolution-based.
- **WGAN** (Arjovsky 2017): Wasserstein distance.
- **WGAN-GP**: gradient penalty.
- **StyleGAN** v1/v2/v3 (Karras): face quality.
- **CycleGAN**: unpaired image translation.
- **Pix2Pix**: paired translation.
- **BigGAN**: large-scale class-conditional generation.
- **GAN inversion**: image → latent.

### Modern context (2024+)

- **Diffusion** dominates text-to-image.
- **GAN niche**: fast inference, specific styles.
- **GAN inversion** for editing.
- **StyleGAN** still SOTA for faces.

### Applications

1. **Image generation** (faces).
2. **Style transfer**.
3. **Super-resolution** (ESRGAN).
4. **Image-to-image** (CycleGAN).
5. **Data augmentation**.
6. **Anomaly detection** (AnoGAN).
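The G-versus-D game from the Model list can be written as a value function V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))], which D tries to maximize and G tries to minimize. The sketch below evaluates it in plain Python on hand-picked discriminator outputs. The function names `value_fn` and `nonsaturating_g_loss` and the sample probabilities are illustrative assumptions, not part of any library.

```python
import math

def value_fn(d_real, d_fake):
    """Sample estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: D's probabilities on real samples; d_fake: on generated samples.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def nonsaturating_g_loss(d_fake):
    """Generator loss -E[log D(G(z))], a common alternative to minimizing V directly."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# A confident, correct discriminator: V is close to 0, its upper bound.
strong_d = value_fn(d_real=[0.99, 0.98], d_fake=[0.01, 0.02])
# A fully fooled discriminator that outputs 0.5 everywhere: V = -log 4.
fooled_d = value_fn(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(f"strong D: V = {strong_d:.3f}")
print(f"fooled D: V = {fooled_d:.3f}")
print(f"G loss when D is fooled: {nonsaturating_g_loss([0.5, 0.5]):.3f}")
```

The non-saturating form is what a BCE loss with "real" labels on fake samples computes; it tends to give G stronger gradients early in training, when D rejects fakes easily.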
## 💻 Patterns

### DCGAN (PyTorch)

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a latent vector (N, latent) to a 3x64x64 image in [-1, 1]."""
    def __init__(self, latent=100, img_dim=64):  # img_dim is unused; the stack is fixed at 64x64
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(latent, 512, 4, 1, 0), nn.BatchNorm2d(512), nn.ReLU(),  # 1 -> 4
            nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.ReLU(),     # 4 -> 8
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),     # 8 -> 16
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),       # 16 -> 32
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),                             # 32 -> 64
        )

    def forward(self, z):
        # Reshape (N, latent) to (N, latent, 1, 1) for the transposed convolutions
        return self.net(z.unsqueeze(-1).unsqueeze(-1))

class Discriminator(nn.Module):
    """Maps a 3x64x64 image to one real/fake probability per sample."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),                             # 64 -> 32
            nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2),      # 32 -> 16
            nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2),     # 16 -> 8
            # Extra stride-2 stage so the final 4x4 conv yields a 1x1 map for 64x64 inputs
            nn.Conv2d(256, 512, 4, 2, 1), nn.BatchNorm2d(512), nn.LeakyReLU(0.2),     # 8 -> 4
            nn.Conv2d(512, 1, 4, 1, 0), nn.Sigmoid(),                                 # 4 -> 1
        )

    def forward(self, x):
        return self.net(x).view(-1)
```

### Training loop

```python
device = "cuda" if torch.cuda.is_available() else "cpu"
G, D = Generator().to(device), Discriminator().to(device)
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCELoss()

# `dataloader` is assumed to yield image batches of shape (N, 3, 64, 64) scaled to [-1, 1]
for batch in dataloader:
    real = batch.to(device)
    bs = real.size(0)
    z = torch.randn(bs, 100, device=device)
    fake = G(z)
    ones, zeros = torch.ones(bs, device=device), torch.zeros(bs, device=device)

    # Discriminator step: detach fake so no gradient reaches G
    opt_D.zero_grad()
    d_loss = bce(D(real), ones) + bce(D(fake.detach()), zeros)
    d_loss.backward()
    opt_D.step()

    # Generator step: non-saturating loss, i.e. label fakes as real
    opt_G.zero_grad()
    g_loss = bce(D(fake), ones)
    g_loss.backward()
    opt_G.step()
```

### WGAN-GP loss

```python
def gradient_penalty(D, real, fake, device):
    """Penalize critic gradient norms that deviate from 1 on real/fake interpolates."""
    bs = real.size(0)
    alpha = torch.rand(bs, 1, 1, 1, device=device)
    interp = (alpha * real + (1 - alpha) * fake).detach().requires_grad_(True)
    d_interp = D(interp)
    grads = torch.autograd.grad(d_interp.sum(), interp, create_graph=True)[0]
    return ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()

# Critic loss. D must be a critic with unbounded output (no final Sigmoid).
fake = fake.detach()  # the critic step should not backpropagate into G
d_loss = D(fake).mean() - D(real).mean() + 10 * gradient_penalty(D, real, fake, device)
```

### CycleGAN (unpaired translation)

```python
class CycleGAN:
    def __init__(self, make_generator, make_discriminator):
        # The generators must be image-to-image networks (e.g. ResNet- or U-Net-style),
        # not the noise-to-image DCGAN Generator above; the caller supplies the factories.
        self.G_AB, self.G_BA = make_generator(), make_generator()
        self.D_A, self.D_B = make_discriminator(), make_discriminator()

    def cycle_loss(self, real_A, real_B):
        # A -> B -> A should reconstruct A (L1 distance)
        fake_B = self.G_AB(real_A)
        rec_A = self.G_BA(fake_B)
        cycle_A = (rec_A - real_A).abs().mean()
        # B -> A -> B should reconstruct B
        fake_A = self.G_BA(real_B)
        rec_B = self.G_AB(fake_A)
        cycle_B = (rec_B - real_B).abs().mean()
        return cycle_A + cycle_B
```

### StyleGAN (style modulation)

```python
class StyleBlock(nn.Module):
    """Simplified style modulation: per-channel scaling of the input, without demodulation."""
    def __init__(self, in_ch, out_ch, style_dim=512):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.style_proj = nn.Linear(style_dim, in_ch)

    def forward(self, x, style):
        # (N, style_dim) -> (N, in_ch, 1, 1) so the scale broadcasts over H and W
        scale = self.style_proj(style).unsqueeze(-1).unsqueeze(-1)
        x = x * scale
        return self.conv(x)
```

### Spectral normalization

```python
from torch.nn.utils import spectral_norm

class SNDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        # Wrapping a layer constrains the spectral norm of its weight
        self.conv1 = spectral_norm(nn.Conv2d(3, 64, 4, 2, 1))
        # ...
```

### Mode collapse detection

```python
import numpy as np
import lpips  # assumes the third-party `lpips` package is installed

lpips_fn = lpips.LPIPS(net="alex").cuda()

@torch.no_grad()
def diversity_check(generator, n=1000):
    z = torch.randn(n, 100).cuda()
    fake = generator(z)
    k = min(100, n)  # pairwise LPIPS over the first k samples only
    distances = []
    for i in range(k):
        for j in range(i + 1, k):
            distances.append(lpips_fn(fake[i:i+1], fake[j:j+1]).item())
    return np.mean(distances)  # a low value suggests mode collapse
```

### FID (eval)

```python
from torchmetrics.image.fid import FrechetInceptionDistance

# Expects uint8 images of shape (N, 3, H, W) by default; pass normalize=True for floats in [0, 1]
fid = FrechetInceptionDistance().cuda()
fid.update(real_imgs, real=True)
fid.update(fake_imgs, real=False)
print(fid.compute())  # lower is better
```

### GAN inversion (project image → latent)

```python
def gan_invert(target_image, G, n_iter=1000, perceptual_fn=None):
    """Optimize a latent z so that G(z) matches target_image.

    perceptual_fn is an optional perceptual loss such as an LPIPS module.
    """
    z = torch.randn(1, 100, requires_grad=True, device="cuda")
    optim = torch.optim.Adam([z], lr=0.01)
    for _ in range(n_iter):
        gen = G(z)
        loss = ((gen - target_image) ** 2).mean()
        if perceptual_fn is not None:
            loss = loss + perceptual_fn(gen, target_image).mean()
        optim.zero_grad()
        loss.backward()
        optim.step()
    return z.detach()
```

### Conditional (class-conditional)

```python
class CondG(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.embed = nn.Embedding(n_classes, 100)  # one latent-sized vector per class
        self.gen = Generator()

    def forward(self, z, label):
        # Condition by adding the class embedding to the noise vector
        return self.gen(z + self.embed(label))
```

### ESRGAN (super-resolution)

```python
# Conceptual sketch: RRDB internals and the forward pass are omitted
class RRDB(nn.Module):
    # Residual-in-residual dense block
    pass

class ESRGAN(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.ModuleList([RRDB() for _ in range(23)])
        self.upsample = nn.Sequential(
            nn.Conv2d(64, 64 * 4, 3, padding=1),
            nn.PixelShuffle(2),  # (N, 256, H, W) -> (N, 64, 2H, 2W)
            nn.Conv2d(64, 3, 3, padding=1),
        )
```

## Decision Criteria

| Situation | Approach |
|---|---|
| Modern image generation | Diffusion (not GAN) |
| Face generation | StyleGAN3 |
| Unpaired translation | CycleGAN |
| Paired translation | Pix2Pix |
| Super-resolution | ESRGAN |
| Fast inference | GAN > Diffusion |
| Editing | GAN inversion |

**Defaults**: modern image generation = diffusion. Faces = StyleGAN3. Niche translation = CycleGAN. Super-resolution = ESRGAN. Always run FID plus a diversity check.
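To make the "always run FID" default concrete: FID is the Fréchet distance between two Gaussians fitted to Inception features of real and generated images, ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2 (S_r S_g)^(1/2)). As a sketch only, the block below computes the one-dimensional special case, where the formula reduces to (mu_r - mu_g)^2 + (sigma_r - sigma_g)^2. The scalar "features" and the name `frechet_1d` are made up for illustration; a real FID uses high-dimensional Inception features and full covariance matrices.

```python
import statistics

def frechet_1d(real_feats, fake_feats):
    """Fréchet distance between 1-D Gaussians fitted to two lists of scalar features.

    Uses the population standard deviation; this is an illustrative simplification.
    """
    mu_r, mu_g = statistics.fmean(real_feats), statistics.fmean(fake_feats)
    sd_r, sd_g = statistics.pstdev(real_feats), statistics.pstdev(fake_feats)
    return (mu_r - mu_g) ** 2 + (sd_r - sd_g) ** 2

real = [0.0, 1.0, 2.0, 3.0, 4.0]
shifted = [x + 1.0 for x in real]   # same spread, mean off by 1
collapsed = [2.0] * 5               # correct mean, zero diversity

print(frechet_1d(real, real))       # identical statistics
print(frechet_1d(real, shifted))    # penalized for the mean shift
print(frechet_1d(real, collapsed))  # penalized for the missing variance
```

The collapsed case shows why the metric is sensitive to mode collapse: matching the mean alone is not enough, because the variance term still contributes. A separate diversity check remains useful, since a single score cannot tell apart a quality problem from a diversity problem.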
## 🔗 Graph

- Parent: [[Deep Learning]] · [[Generative-AI|Generative-Models]]
- Variants: [[StyleGAN]] · [[CycleGAN]] · [[Pix2Pix]]
- Applications: [[Data-Augmentation]]
- Adjacent: [[Diffusion-Models]] · [[VAE]] · [[Generative-AI]]

## 🤖 LLM Usage

**When**: fast image generation, unpaired translation, faces.

**When not**: text-to-image (use diffusion).

## ❌ Anti-patterns

- **Ignoring mode collapse**: output diversity stays limited.
- **No spectral norm**: unstable D.
- **Imbalanced D-G**: training collapses.
- **No FID**: quality goes unmeasured.
- **GAN for text-to-image**: diffusion is better.

## 🧪 Verification / Duplicates

- Verified (Goodfellow 2014, Karras StyleGAN, Zhu CycleGAN).
- Trust level A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-04-26 | Auto |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: DCGAN/WGAN/Style/Cycle + train / inversion / FID code |