---
id: wiki-2026-0508-dcgan
title: DCGAN (Deep Convolutional GAN)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [DCGAN, GAN, generative adversarial network, StyleGAN, CycleGAN, Pix2Pix]
duplicate_of: none
source_trust_level: A
confidence_score: 0.88
verification_status: applied
tags: [gan, dcgan, generative-models, deep-learning, image-generation, stylegan, cyclegan, history]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: PyTorch
---
# DCGAN
## One-line summary
> **"The first stable GAN architecture"** (Radford 2015). Strided convolutions + batch norm + specific activation choices. A grandparent of modern generative AI. Today GANs are largely superseded by diffusion models, but remain relevant for fast inference, GAN-based super-resolution, and image-to-image translation.
## Core concepts
### GAN basics (Goodfellow 2014)
- **Generator**: maps noise to an image.
- **Discriminator**: classifies real vs fake.
- **Min-max game**: the two networks are trained against each other.
### DCGAN (2015) contributions
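
The min-max game above can be written as a single value function. The notation here ($p_{\text{data}}$ for the data distribution, $p_z$ for the noise prior) is the conventional one and is an assumption of this note, not something defined elsewhere in this page:

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
$$

D is trained to increase $V$ (score real images high, generated ones low), while G is trained to decrease it.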
1. **Strided conv** (no pooling).
2. **BatchNorm** in both G and D (except the G output layer and the D input layer).
3. **No fully-connected hidden** layer.
4. **ReLU in G** (Tanh output).
5. **LeakyReLU in D**.
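
To see why strided convolutions can replace pooling and fully-connected layers, it helps to trace spatial sizes through the stack. The sketch below uses the standard output-size formulas and assumes the kernel/stride/padding values from the PyTorch pattern in this page (kernel 4, then stride 2 with padding 1), with four stride-2 layers for a 64 × 64 image; the helper names are hypothetical.

```python
def conv_transpose_out(size, kernel, stride, padding):
    """Output size of a transposed convolution (no output_padding, dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel


def conv_out(size, kernel, stride, padding):
    """Output size of a regular convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# Generator: 1x1 latent -> 4x4, then four stride-2 upsampling layers.
gen_sizes = [1]
for kernel, stride, padding in [(4, 1, 0)] + [(4, 2, 1)] * 4:
    gen_sizes.append(conv_transpose_out(gen_sizes[-1], kernel, stride, padding))

# Discriminator: four stride-2 downsampling layers, then a 4x4 conv to 1x1.
disc_sizes = [64]
for kernel, stride, padding in [(4, 2, 1)] * 4 + [(4, 1, 0)]:
    disc_sizes.append(conv_out(disc_sizes[-1], kernel, stride, padding))

print(gen_sizes)   # [1, 4, 8, 16, 32, 64]
print(disc_sizes)  # [64, 32, 16, 8, 4, 1]
```

Each stride-2 layer doubles (G) or halves (D) the resolution, so the network learns its own up/downsampling instead of relying on fixed pooling.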
### Notable GAN evolution
- **DCGAN** (2015): stable convolutional architecture.
- **WGAN** (2017): Wasserstein loss; mitigates mode collapse.
- **Pix2Pix** (2017): paired image-to-image translation.
- **CycleGAN** (2017): unpaired image-to-image translation.
- **StyleGAN** (2018-2021): state-of-the-art face quality.
- **BigGAN** (2018): large-scale training.
- **GigaGAN** (2023): text-to-image GAN.
### Mode collapse
- G generates only a limited variety of outputs.
- Mitigation: minibatch discrimination, spectral norm, WGAN.
### Evaluation
- **FID** (Fréchet Inception Distance): distance between the feature distributions of generated and real images; lower is better.
- **IS** (Inception Score).
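
A crude way to spot mode collapse is to measure how different the samples in one generated batch are from each other. The sketch below is an illustrative heuristic, not a standard metric: `mean_pairwise_distance` is a hypothetical helper, and random tensors stand in for generator outputs.

```python
import torch


def mean_pairwise_distance(samples: torch.Tensor) -> float:
    """Mean L2 distance over all pairs of samples in a batch."""
    flat = samples.reshape(samples.size(0), -1)
    return torch.pdist(flat, p=2).mean().item()


torch.manual_seed(0)
# Stand-ins for generator outputs (assumption: batch of 16 small RGB images).
diverse = torch.randn(16, 3, 8, 8)
collapsed = diverse[:1].repeat(16, 1, 1, 1)  # every sample identical

print(mean_pairwise_distance(diverse))    # large: samples differ
print(mean_pairwise_distance(collapsed))  # near zero: collapse
```

A value that drifts toward zero during training suggests G is producing near-identical images regardless of the input noise.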
- **Precision / Recall** (Kynkäänniemi 2019).
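
For reference, FID compares Gaussian fits to Inception features of the two image sets. The symbols are this note's assumption: $\mu_r, \Sigma_r$ are the feature mean and covariance for real images, and $\mu_g, \Sigma_g$ those for generated images.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)
$$

The first term penalizes a shift in the mean feature, the second a mismatch in spread, which is why FID reacts to both poor quality and poor diversity.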
### Modern relevance
- **Super-resolution** (ESRGAN, Real-ESRGAN).
- **Image-to-image** (CycleGAN is still useful).
- **Domain adaptation** (sim2real).
- **Implicit EBM**.
- **Fast generation** (vs slow diffusion).
### GAN vs Diffusion
| Aspect | GAN | Diffusion |
|---|---|---|
| Quality | High (StyleGAN) | Highest |
| Diversity | Mode collapse risk | High |
| Training stability | Tricky | Stable |
| Inference speed | Fast (single step) | Slow (multi-step) |
| Conditioning | Hard | Easy (CLIP) |
## 💻 Patterns
### DCGAN (PyTorch)
```python
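import torch
import torch.nn as nn


class Generator(nn.Module):
    def __init__(self, nz=100, ngf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            # input: latent vector, nz x 1 x 1
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state: (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # Assumed intermediate blocks (standard 64x64 layout); they are
            # needed so the channel counts line up with the final layer.
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # output: nc x 64 x 64, values in [-1, 1]
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.main(z)


class Discriminator(nn.Module):
    def __init__(self, ndf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            # input: nc x 64 x 64 (no BatchNorm on the first layer)
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, True),
            # Assumed intermediate blocks, mirroring the generator.
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, True),
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, True),
            # state: (ndf*8) x 4 x 4 -> one probability per image
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.main(x).view(-1)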
```
### Training loop
```python
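import torch
import torch.nn as nn

# Assumes Generator / Discriminator from the block above and `loader`,
# a DataLoader yielding (image, label) batches with images scaled to [-1, 1].
device = 'cuda' if torch.cuda.is_available() else 'cpu'
G = Generator().to(device)
D = Discriminator().to(device)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
criterion = nn.BCELoss()

for real, _ in loader:
    real = real.to(device)
    bs = real.size(0)

    # D step: push D(real) toward 1 and D(G(z)) toward 0
    opt_d.zero_grad()
    d_real = D(real)
    loss_real = criterion(d_real, torch.ones(bs, device=device))
    z = torch.randn(bs, 100, 1, 1, device=device)
    fake = G(z)
    d_fake = D(fake.detach())  # detach: no gradient flows into G here
    loss_fake = criterion(d_fake, torch.zeros(bs, device=device))
    (loss_real + loss_fake).backward()
    opt_d.step()

    # G step: push D(G(z)) toward 1 (non-saturating generator loss)
    opt_g.zero_grad()
    d_fake = D(fake)
    loss_g = criterion(d_fake, torch.ones(bs, device=device))
    loss_g.backward()
    opt_g.step()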
```
### WGAN-GP (modern stable)
```python
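import torch


def gradient_penalty(D, real, fake):
    bs = real.size(0)
    # one random interpolation coefficient per sample
    alpha = torch.rand(bs, 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake.detach()).requires_grad_(True)
    d_interp = D(interp)
    grads = torch.autograd.grad(d_interp.sum(), interp, create_graph=True)[0]
    grad_norm = grads.reshape(bs, -1).norm(2, dim=1)
    # penalize deviation of the critic's gradient norm from 1
    return ((grad_norm - 1) ** 2).mean()


# Critic (D) step. D must output raw scores here (no Sigmoid), and fake is
# detached so this loss does not backpropagate into G.
fake = fake.detach()
loss_d = D(fake).mean() - D(real).mean() + 10 * gradient_penalty(D, real, fake)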
```
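
A quick sanity check for a gradient penalty is a linear critic, where the answer is known in closed form: for D(x) = w·x the gradient with respect to x is w everywhere, so the penalty should equal (‖w‖ − 1)². The sketch below repeats the penalty function to stay self-contained; `linear_critic` and the tensor shapes are hypothetical choices for illustration.

```python
import torch


def gradient_penalty(D, real, fake):
    bs = real.size(0)
    alpha = torch.rand(bs, 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake.detach()).requires_grad_(True)
    grads = torch.autograd.grad(D(interp).sum(), interp, create_graph=True)[0]
    grad_norm = grads.reshape(bs, -1).norm(2, dim=1)
    return ((grad_norm - 1) ** 2).mean()


def linear_critic(w):
    """Critic D(x) = <w, flatten(x)>; its input gradient is w for every x."""
    return lambda x: x.reshape(x.size(0), -1) @ w


torch.manual_seed(0)
real = torch.randn(4, 3, 8, 8)
fake = torch.randn(4, 3, 8, 8)

w_unit = torch.ones(3 * 8 * 8) / (3 * 8 * 8) ** 0.5  # ||w|| = 1
gp_unit = float(gradient_penalty(linear_critic(w_unit), real, fake))
gp_three = float(gradient_penalty(linear_critic(3 * w_unit), real, fake))

print(gp_unit)   # ~0: gradient norm is already 1
print(gp_three)  # ~4: (3 - 1) ** 2
```

The penalty is zero exactly when the critic's gradient norm is 1 at the interpolated points, which is the soft 1-Lipschitz constraint WGAN-GP enforces.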
### Real-ESRGAN (super-resolution, modern application)
```python
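import torch
from PIL import Image
# Assumption: this API matches the ai-forever Real-ESRGAN package, whose
# module is named `RealESRGAN` (the `realesrgan` package exposes a different API).
from RealESRGAN import RealESRGAN

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = RealESRGAN(device, scale=4)
model.load_weights('weights/RealESRGAN_x4.pth', download=True)

img = Image.open('low_res.jpg').convert('RGB')
sr_img = model.predict(img)  # 4x upscaled PIL image
sr_img.save('high_res.jpg')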
```
### CycleGAN (unpaired image-to-image)
```python
# Example: horse -> zebra translation with no paired training data
# G_ab maps domain A -> B; G_ba maps domain B -> A
# Cycle-consistency loss: G_ba(G_ab(a)) ≈ a
```
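
The cycle-consistency idea sketched in the comments above can be written as a small loss function. This is a minimal sketch, not the reference CycleGAN implementation: it uses an L1 reconstruction loss in both directions, the weight `lam=10.0` is an assumed default, and simple lambdas stand in for the two generators.

```python
import torch
import torch.nn.functional as F


def cycle_consistency_loss(G_ab, G_ba, a, b, lam=10.0):
    """L1 cycle loss: G_ba(G_ab(a)) should recover a, G_ab(G_ba(b)) should recover b."""
    loss_a = F.l1_loss(G_ba(G_ab(a)), a)
    loss_b = F.l1_loss(G_ab(G_ba(b)), b)
    return lam * (loss_a + loss_b)


a = torch.ones(2, 3, 8, 8)  # stand-in batch from domain A
b = torch.ones(2, 3, 8, 8)  # stand-in batch from domain B

# Toy generators that are exact inverses: the cycle is perfect, loss is 0.
perfect = cycle_consistency_loss(lambda x: 2 * x, lambda x: x / 2, a, b)

# G_ba is the identity, so the cycle returns 2*a instead of a: loss is 10 * (1 + 1).
broken = cycle_consistency_loss(lambda x: 2 * x, lambda x: x, a, b)

print(float(perfect), float(broken))  # 0.0 20.0
```

In a full CycleGAN this term is added to the two adversarial losses; it is what ties the unpaired domains together.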
### StyleGAN inversion (modern)
```python
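import torch


# Find a latent z whose generated image matches a target image.
# `generator` is any differentiable module mapping a (1, 512) latent to an
# image, e.g. one trained with the stylegan2_pytorch package.
def invert(image, generator, n_iters=1000, lr=0.01):
    z = torch.randn(1, 512, device=image.device, requires_grad=True)
    opt = torch.optim.SGD([z], lr=lr)
    for _ in range(n_iters):
        opt.zero_grad()
        loss = (generator(z) - image).pow(2).mean()  # pixel-wise MSE
        loss.backward()
        opt.step()
    return z.detach()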
```
### FID evaluation
```python
from pytorch_fid import fid_score

fid = fid_score.calculate_fid_given_paths(
    ['./real_images/', './fake_images/'],
    batch_size=50,
    device='cuda',
    dims=2048,
)
print(f'FID: {fid:.2f}')  # lower is better
```
## Decision criteria
| Application | Method |
|---|---|
| Photoreal generation | Diffusion (SDXL, Flux) |
| Face | StyleGAN3 |
| Super-resolution | Real-ESRGAN |
| Domain adapt (unpaired) | CycleGAN |
| Sim2Real | CycleGAN / paired |
| Fast inference | GAN > Diffusion |
| Quality + control | Diffusion + ControlNet |
**Default**: for modern generation, use diffusion. GANs are still the choice for super-resolution and image-to-image.
## 🔗 Graph
- Parent: [[Generative-AI|Generative-Models]] · [[Deep Learning]]
- Variants: [[Generative-Adversarial-Networks|GAN]] · [[StyleGAN]] · [[Pix2Pix]] · [[CycleGAN]]
- Applications: [[Domain-Adaptation]] · [[CV_Synthesis]]
- Adjacent: [[Diffusion-Models]] · [[Auto-Encoding]] · [[Stable-Diffusion]] · [[Deepfake-Technology]]
## 🤖 LLM usage
**When to use**: GAN history, fast generation, super-resolution, image-to-image.
**When not to use**: when the highest quality is required (use diffusion).
## ❌ Anti-patterns
- **Forcing DCGAN into production**: modern diffusion models are better.
- **No mode collapse check**: the generator may be producing a single output.
- **Unstable training without WGAN-GP**: training can fail.
- **Trusting FID alone**: check other metrics too.
## 🧪 Verification / duplicates
- Verified (Goodfellow GAN, Radford DCGAN, Karras StyleGAN, Real-ESRGAN).
- Trust level: A.
- Related: [[Diffusion-Models]] · [[Auto-Encoding]] · [[Stable-Diffusion]] · [[Deepfake-Technology]] · [[CV_Synthesis]] · [[Bioenergetics]] (model collapse).
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: DCGAN architecture + GAN evolution + PyTorch / WGAN-GP / Real-ESRGAN code |