"매 GAN 의 first 의 stable architecture" (Radford 2015). 매 stride conv + 매 batch norm + 매 specific activation. 매 generative AI 의 grandparent. 매 modern: 매 Diffusion 의 superseded 가, 매 fast inference / GAN-based super-res / image-to-image 의 still relevant.
Key points
GAN basics (Goodfellow 2014)
Generator: maps noise to an image.
Discriminator: classifies images as real or fake.
Min-max game.
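The min-max game can be made concrete with a small numeric sketch. The discriminator tries to maximize V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))] while the generator tries to minimize it. The function name `gan_value` and the probability values below are made up for illustration.

```python
import math


def gan_value(d_real, d_fake):
    """Batch estimate of V(D, G): mean log D(x) + mean log(1 - D(G(z)))."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# A confident, correct discriminator pushes V toward its maximum of 0.
strong_d = gan_value(d_real=[0.99, 0.98], d_fake=[0.01, 0.02])
# A fully fooled discriminator outputs 0.5 everywhere, giving V = -log 4.
fooled_d = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
print(strong_d, fooled_d)
```

The value -log 4 is what the game settles at when the generator matches the data distribution and the discriminator can do no better than guessing.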
DCGAN (2015) contributions
Strided conv (no pooling).
BatchNorm in both G and D.
No fully-connected hidden layer.
ReLU in G (Tanh output).
LeakyReLU in D.
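As a rough illustration of the "strided conv, no pooling" rule, the sketch below checks the shape arithmetic for the kernel 4, stride 2, padding 1 setting used throughout DCGAN. The channel counts and tensor sizes are arbitrary choices for the example.

```python
import torch
import torch.nn as nn

# In D, a strided conv replaces pooling: kernel 4, stride 2, padding 1 halves H and W.
down = nn.Conv2d(3, 8, kernel_size=4, stride=2, padding=1, bias=False)
# In G, a transposed conv with the same settings doubles H and W.
up = nn.ConvTranspose2d(8, 3, kernel_size=4, stride=2, padding=1, bias=False)

x = torch.randn(2, 3, 64, 64)
h = down(x)  # (64 + 2*1 - 4) // 2 + 1 = 32
y = up(h)    # (32 - 1) * 2 - 2*1 + 4 = 64
print(h.shape, y.shape)
```

Because the resampling is done by learned convolutions, the network learns its own downsampling and upsampling instead of relying on fixed pooling.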
Notable GAN evolution
DCGAN (2015): stable training.
WGAN (2017): Wasserstein loss; mitigates mode collapse.
Pix2Pix (2017): paired image-to-image translation.
CycleGAN (2017): unpaired image-to-image translation.
StyleGAN (2018-2021): state-of-the-art face quality.
BigGAN (2018): large-scale training.
GigaGAN (2023): text-to-image GAN.
Mode collapse
G generates only a limited variety of samples.
Mitigation: minibatch discrimination, spectral normalization, WGAN.
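Of the mitigations listed, spectral normalization is the easiest to show in a few lines. The sketch below wraps each discriminator convolution with PyTorch's `torch.nn.utils.spectral_norm`. The helper `sn_conv`, the layer widths, and the 32×32 input size are assumptions for illustration, and BatchNorm is left out only to keep the example small.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm


def sn_conv(in_ch, out_ch):
    # Strided conv whose weight is rescaled by an estimate of its largest singular value
    return spectral_norm(nn.Conv2d(in_ch, out_ch, 4, 2, 1, bias=False))


# Hypothetical small discriminator for 32x32 RGB inputs
sn_disc = nn.Sequential(
    sn_conv(3, 32), nn.LeakyReLU(0.2),    # 32 -> 16
    sn_conv(32, 64), nn.LeakyReLU(0.2),   # 16 -> 8
    sn_conv(64, 128), nn.LeakyReLU(0.2),  # 8 -> 4
    spectral_norm(nn.Conv2d(128, 1, 4, 1, 0, bias=False)),  # 4 -> 1
)

scores = sn_disc(torch.randn(5, 3, 32, 32)).view(-1)
print(scores.shape)
```

Constraining each layer this way limits how sharply the discriminator can change its output, which tends to give the generator more useful gradients.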
Evaluation
FID (Fréchet Inception Distance): distance between the feature distributions of generated and real images; lower is better.
IS (Inception Score).
Precision / Recall (Kynkäänniemi 2019).
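To show what FID measures, here is a minimal sketch of the Fréchet distance between two Gaussians fitted to feature sets. Real FID uses 2048-dimensional Inception-v3 features; the 4-dimensional random features and the function name `frechet_distance` are stand-ins for illustration.

```python
import torch


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two (n, d) feature sets."""
    mu_a, mu_b = feats_a.mean(0), feats_b.mean(0)
    cov_a, cov_b = torch.cov(feats_a.T), torch.cov(feats_b.T)
    # Tr(sqrt(cov_a @ cov_b)) equals the sum of square roots of the product's eigenvalues
    eigvals = torch.linalg.eigvals(cov_a @ cov_b).real.clamp(min=0)
    trace_sqrt = eigvals.sqrt().sum()
    mean_term = ((mu_a - mu_b) ** 2).sum()
    return mean_term + torch.trace(cov_a) + torch.trace(cov_b) - 2 * trace_sqrt


torch.manual_seed(0)
real_feats = torch.randn(500, 4, dtype=torch.float64)
same = frechet_distance(real_feats, real_feats)
# Shifting every feature by 3 moves the mean by 3 in each of 4 dimensions
shifted = frechet_distance(real_feats, real_feats + 3.0)
print(float(same), float(shifted))
```

Identical feature sets give a distance of about zero, and a pure mean shift contributes the squared length of that shift (here 4 × 3² = 36) because the covariances are unchanged.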
Modern relevance
Super-resolution (ESRGAN, Real-ESRGAN).
Image-to-image (CycleGAN is still useful).
Domain adaptation (sim2real).
Implicit EBM.
Fast generation (vs slow diffusion).
GAN vs Diffusion
| Aspect | GAN | Diffusion |
| --- | --- | --- |
| Quality | High (StyleGAN) | Highest |
| Diversity | Mode collapse risk | High |
| Training stability | Tricky | Stable |
| Inference speed | Fast (single step) | Slow (multi-step) |
| Conditioning | Hard | Easy (CLIP) |
💻 Patterns
DCGAN (PyTorch)
```python
import torch.nn as nn


class Generator(nn.Module):
    def __init__(self, nz=100, ngf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            # input: nz × 1 × 1
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # ngf*8 × 4 × 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # ... (intermediate upsampling blocks elided) → nc × 64 × 64
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.main(z)


class Discriminator(nn.Module):
    def __init__(self, ndf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            # no BatchNorm on the first layer
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, True),
            # ... (intermediate downsampling blocks elided)
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # one probability per image
        return self.main(x).view(-1)
```
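The listing above elides its middle layers, so it cannot be run end to end as written. The following is a runnable sketch that assumes the elided blocks follow the usual pattern of halving (in G) or doubling (in D) the channel count at each step. The names `FullGenerator`, `FullDiscriminator`, `up_block`, and `down_block` are hypothetical, and the layers marked "assumed" are not from the original.

```python
import torch
import torch.nn as nn


def up_block(in_ch, out_ch):
    # Transposed conv (x2 upsampling) + BatchNorm + ReLU
    return [nn.ConvTranspose2d(in_ch, out_ch, 4, 2, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(True)]


def down_block(in_ch, out_ch):
    # Strided conv (x2 downsampling) + BatchNorm + LeakyReLU
    return [nn.Conv2d(in_ch, out_ch, 4, 2, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.LeakyReLU(0.2, True)]


class FullGenerator(nn.Module):
    def __init__(self, nz=100, ngf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),  # 1 -> 4
            nn.BatchNorm2d(ngf * 8), nn.ReLU(True),
            *up_block(ngf * 8, ngf * 4),  # 4 -> 8
            *up_block(ngf * 4, ngf * 2),  # 8 -> 16 (assumed)
            *up_block(ngf * 2, ngf),      # 16 -> 32 (assumed)
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),  # 32 -> 64
            nn.Tanh(),
        )

    def forward(self, z):
        return self.main(z)


class FullDiscriminator(nn.Module):
    def __init__(self, ndf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),  # 64 -> 32
            nn.LeakyReLU(0.2, True),
            *down_block(ndf, ndf * 2),      # 32 -> 16
            *down_block(ndf * 2, ndf * 4),  # 16 -> 8 (assumed)
            *down_block(ndf * 4, ndf * 8),  # 8 -> 4 (assumed)
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),  # 4 -> 1
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.main(x).view(-1)


# Small widths keep this shape check quick on CPU
g, d = FullGenerator(ngf=8), FullDiscriminator(ndf=8)
images = g(torch.randn(2, 100, 1, 1))
probs = d(images)
print(images.shape, probs.shape)
```

A shape check like this is a cheap way to confirm that the generator output and discriminator input agree before starting a long training run.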
Training loop
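```python
import torch
import torch.nn as nn

device = 'cuda'
G = Generator().to(device)
D = Discriminator().to(device)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
criterion = nn.BCELoss()

# loader: assumed to be a DataLoader yielding (image, label) batches,
# with images scaled to [-1, 1] to match G's Tanh output
for real, _ in loader:
    real = real.to(device)
    bs = real.size(0)
    ones = torch.ones(bs, device=device)
    zeros = torch.zeros(bs, device=device)

    # D step: push D(real) toward 1 and D(fake) toward 0
    opt_d.zero_grad()
    loss_real = criterion(D(real), ones)
    z = torch.randn(bs, 100, 1, 1, device=device)
    fake = G(z)
    loss_fake = criterion(D(fake.detach()), zeros)  # detach: no gradient into G
    (loss_real + loss_fake).backward()
    opt_d.step()

    # G step: push D(fake) toward 1
    opt_g.zero_grad()
    loss_g = criterion(D(fake), ones)
    loss_g.backward()
    opt_g.step()
```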
WGAN-GP (modern stable)
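```python
import torch


def gradient_penalty(D, real, fake):
    bs = real.size(0)
    # random per-sample interpolation between real and fake images
    alpha = torch.rand(bs, 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake.detach()).requires_grad_(True)
    d_interp = D(interp)
    grads = torch.autograd.grad(d_interp.sum(), interp, create_graph=True)[0]
    grad_norm = grads.reshape(bs, -1).norm(2, dim=1)
    # penalize deviation of the critic's gradient norm from 1
    return ((grad_norm - 1) ** 2).mean()


# D (critic) step, inside the training loop.
# The critic must output an unbounded score here, so drop the final Sigmoid.
loss_d = (D(fake.detach()).mean() - D(real).mean()
          + 10 * gradient_penalty(D, real, fake))
```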
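A quick way to sanity-check a gradient penalty is to feed it critics whose gradient norm is known in advance. The sketch below restates the penalty and tests it on two hand-built linear critics; `unit_critic` and `steep_critic` are toy functions invented for this check, not part of any WGAN-GP implementation.

```python
import torch


def gradient_penalty(critic, real, fake):
    bs = real.size(0)
    alpha = torch.rand(bs, 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake.detach()).requires_grad_(True)
    grads = torch.autograd.grad(critic(interp).sum(), interp, create_graph=True)[0]
    return ((grads.reshape(bs, -1).norm(2, dim=1) - 1) ** 2).mean()


torch.manual_seed(0)
real = torch.randn(4, 3, 8, 8)
fake = torch.randn(4, 3, 8, 8)

w = torch.randn(3 * 8 * 8)
unit_w = w / w.norm()


def unit_critic(x):
    # Linear critic: its gradient w.r.t. x is unit_w everywhere (norm 1)
    return x.reshape(x.size(0), -1) @ unit_w


def steep_critic(x):
    # Same direction scaled by 3: gradient norm 3, so the penalty is (3 - 1)^2 = 4
    return 3.0 * unit_critic(x)


gp_unit = gradient_penalty(unit_critic, real, fake)
gp_steep = gradient_penalty(steep_critic, real, fake)
print(float(gp_unit), float(gp_steep))
```

The unit-gradient critic incurs essentially no penalty while the steeper one is penalized, which is the soft 1-Lipschitz constraint that WGAN-GP relies on.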
CycleGAN (unpaired image-to-image)
```python
# horse → zebra translation without paired examples
# G_ab: A → B, G_ba: B → A
# cycle loss: G_ba(G_ab(a)) ≈ a
```
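The cycle loss noted above can be sketched as an L1 reconstruction penalty applied in both directions. In this example the two generators are replaced by trivial shift functions so the result is easy to verify by hand; `cycle_loss`, `shift_up`, `shift_down`, and `identity` are illustrative names, and the weight of 10 is an assumed default.

```python
import torch
import torch.nn as nn


def cycle_loss(g_ab, g_ba, a, b, weight=10.0):
    """L1 cycle-consistency: G_ba(G_ab(a)) ≈ a and G_ab(G_ba(b)) ≈ b."""
    l1 = nn.L1Loss()
    forward_cycle = l1(g_ba(g_ab(a)), a)   # A -> B -> A
    backward_cycle = l1(g_ab(g_ba(b)), b)  # B -> A -> B
    return weight * (forward_cycle + backward_cycle)


# Toy stand-ins for the generators
def shift_up(x):
    return x + 1.0


def shift_down(x):
    return x - 1.0


def identity(x):
    return x


a = torch.randn(2, 3, 16, 16)
b = torch.randn(2, 3, 16, 16)
perfect = cycle_loss(shift_up, shift_down, a, b)  # exact inverses reconstruct the input
broken = cycle_loss(shift_up, identity, a, b)     # G_ba fails to undo the shift
print(float(perfect), float(broken))
```

In a full CycleGAN this term is added to the adversarial losses of both generators, which is what lets training proceed without paired examples.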
StyleGAN inversion (modern)
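```python
import torch


# Find a latent z whose generated image matches a target image.
# generator: assumed to map a (1, 512) latent to a tensor shaped like `image`.
def invert(image, generator, n_iters=1000, lr=0.01):
    z = torch.randn(1, 512, requires_grad=True)
    opt = torch.optim.SGD([z], lr=lr)
    for _ in range(n_iters):
        opt.zero_grad()
        gen = generator(z)
        loss = (gen - image).pow(2).mean()  # pixel-wise MSE
        loss.backward()
        opt.step()
    return z.detach()
```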
FID evaluation
```python
from pytorch_fid import fid_score

fid = fid_score.calculate_fid_given_paths(
    ['./real_images/', './fake_images/'],
    batch_size=50,
    device='cuda',
    dims=2048,
)
print(f'FID: {fid:.2f}')  # lower is better
```
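Decision criteria
| Application | Method |
| --- | --- |
| Photoreal generation | Diffusion (SDXL, Flux) |
| Face | StyleGAN3 |
| Super-resolution | Real-ESRGAN |
| Domain adaptation (unpaired) | CycleGAN |
| Sim2Real | CycleGAN, or a paired method if paired data exists |
| Fast inference | GAN (faster than diffusion) |
| Quality + control | Diffusion + ControlNet |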
Default: for modern image generation, use diffusion. GANs are still a good choice for super-resolution and image-to-image translation.