---
id: wiki-2026-0508-dcgan
title: DCGAN (Deep Convolutional GAN)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [DCGAN, GAN, generative adversarial network, StyleGAN, CycleGAN, Pix2Pix]
duplicate_of: none
source_trust_level: A
confidence_score: 0.88
verification_status: applied
tags: [gan, dcgan, generative-models, deep-learning, image-generation, stylegan, cyclegan, history]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: PyTorch
---
# DCGAN
## One-line summary
> **"The first stable GAN architecture"** (Radford et al., 2015). Strided convolutions + batch normalization + specific activation choices. A grandparent of today's generative AI. Now largely superseded by diffusion models, but still relevant for fast inference, GAN-based super-resolution, and image-to-image translation.
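## Core concepts
### GAN basics (Goodfellow 2014)
- **Generator** (G): maps noise to an image.
- **Discriminator** (D): classifies images as real or fake.
- **Min-max game**: D is trained to tell real from fake while G is trained to fool D.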
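The min-max game is usually written as the value function from Goodfellow et al. (2014). The notation here is an assumption for illustration: $p_{\text{data}}$ is the data distribution and $p_z$ the noise prior.

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
$$

In practice G is often trained to maximize $\log D(G(z))$ instead (the non-saturating loss), which is what a BCE generator step with "real" labels on fake images computes.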
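### DCGAN (2015) contributions
1. **Strided convolutions** instead of pooling (transposed convolutions for upsampling in G).
2. **BatchNorm** in both G and D (except the G output layer and the D input layer).
3. **No fully connected hidden** layers.
4. **ReLU in G** (Tanh at the output).
5. **LeakyReLU in D**.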
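To see how strided convolutions replace pooling and upsampling, here is a minimal sketch of the spatial-size arithmetic. It uses the output-size formulas documented for `nn.Conv2d` and `nn.ConvTranspose2d`, restricted to dilation 1 and no output padding. The layer settings (kernel 4, stride 2, padding 1, 64×64 images) are an assumed common DCGAN configuration, and the helper names are hypothetical.

```python
def conv_transpose_out(size, kernel, stride, padding):
    """Output height/width of a transposed conv (dilation=1, output_padding=0)."""
    return (size - 1) * stride - 2 * padding + kernel


def conv_out(size, kernel, stride, padding):
    """Output height/width of a regular conv (dilation=1)."""
    return (size + 2 * padding - kernel) // stride + 1


# Generator: 1x1 latent -> 4x4, then each (kernel=4, stride=2, padding=1) layer doubles it.
size = conv_transpose_out(1, kernel=4, stride=1, padding=0)
g_sizes = [size]
for _ in range(4):
    size = conv_transpose_out(size, kernel=4, stride=2, padding=1)
    g_sizes.append(size)

# Discriminator: each (kernel=4, stride=2, padding=1) conv halves the size,
# then a final 4x4 conv without padding maps 4x4 -> 1x1.
size = 64
d_sizes = [size]
for _ in range(4):
    size = conv_out(size, kernel=4, stride=2, padding=1)
    d_sizes.append(size)
d_sizes.append(conv_out(size, kernel=4, stride=1, padding=0))

print(g_sizes)  # spatial sizes through G
print(d_sizes)  # spatial sizes through D
```

Each stride-2 layer changes the resolution by a factor of two, so no pooling or fixed upsampling layer is needed.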
### Notable GAN milestones
- **DCGAN** (2015): stable convolutional training recipe.
- **WGAN** (2017): Wasserstein loss, mitigates mode collapse.
- **Pix2Pix** (2017): paired image-to-image translation.
- **CycleGAN** (2017): unpaired image-to-image translation.
- **StyleGAN** (2018-2021): state-of-the-art face quality.
- **BigGAN** (2018): large-scale training.
- **GigaGAN** (2023): text-to-image GAN.
### Mode collapse
- G generates only a limited variety of outputs.
- Mitigation: minibatch discrimination, spectral normalization, WGAN.
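One crude way to spot mode collapse is to measure how different the generated samples are from one another. The sketch below is an illustrative heuristic, not a standard metric: `mean_pairwise_distance` is a hypothetical helper and the tiny lists stand in for flattened generated images.

```python
import itertools
import math


def mean_pairwise_distance(samples):
    """Average Euclidean distance over all pairs of flattened samples."""
    pairs = list(itertools.combinations(samples, 2))
    if not pairs:
        raise ValueError("need at least two samples")
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


# Toy flattened "images": a collapsed batch repeats one output, a diverse batch does not.
collapsed = [[0.5, 0.5, 0.5]] * 4
diverse = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(mean_pairwise_distance(collapsed))  # zero spread: every sample is identical
print(mean_pairwise_distance(diverse))    # clearly larger spread
```

A value near zero across many noise vectors suggests that G maps most of the latent space to the same output. Comparing against the same statistic on real batches gives a rough reference point.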
### Evaluation
- **FID** (Fréchet Inception Distance): distance between the Inception feature statistics of generated and real images (lower is better).
- **IS** (Inception Score).
- **Precision / Recall** (Kynkäänniemi 2019).
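FID fits a Gaussian to the Inception features of each image set and computes the Fréchet distance between them: $d^2 = \lVert \mu_r - \mu_g \rVert^2 + \mathrm{Tr}\big(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2}\big)$. In one dimension this reduces to $(\mu_r - \mu_g)^2 + (\sigma_r - \sigma_g)^2$. The sketch below works through that 1-D case on toy numbers. It is an illustration of the formula only, not an FID implementation: `frechet_distance_1d` is a hypothetical helper and no Inception features are involved.

```python
import statistics


def frechet_distance_1d(real, fake):
    """Squared Fréchet distance between 1-D Gaussians fitted to two samples."""
    mu_r, mu_f = statistics.fmean(real), statistics.fmean(fake)
    sd_r, sd_f = statistics.pstdev(real), statistics.pstdev(fake)
    return (mu_r - mu_f) ** 2 + (sd_r - sd_f) ** 2


real = [0.0, 1.0, 2.0, 3.0, 4.0]
shifted = [x + 3.0 for x in real]  # same spread, mean moved by 3

print(frechet_distance_1d(real, real))     # identical statistics
print(frechet_distance_1d(real, shifted))  # only the mean term contributes
```

The distance is zero only when both the means and the spreads match, which is why FID reacts to quality problems and to loss of diversity alike.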
### Modern relevance
- **Super-resolution** (ESRGAN, Real-ESRGAN).
- **Image-to-image** (CycleGAN is still useful).
- **Domain adaptation** (sim2real).
- **Implicit EBM**.
- **Fast generation** (vs. slow diffusion sampling).
### GAN vs. Diffusion
| Aspect | GAN | Diffusion |
|---|---|---|
| Quality | High (StyleGAN) | Highest |
| Diversity | Mode collapse risk | High |
| Training stability | Tricky | Stable |
| Inference speed | Fast (single step) | Slow (multi-step) |
| Conditioning | Hard | Easy (CLIP) |
## 💻 Patterns
### DCGAN (PyTorch)
```python
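import torch.nn as nn


class Generator(nn.Module):
    def __init__(self, nz=100, ngf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            # input: nz × 1 × 1
            nn.ConvTranspose2d(nz, ngf*8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf*8),
            nn.ReLU(True),
            # ngf*8 × 4 × 4
            nn.ConvTranspose2d(ngf*8, ngf*4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf*4),
            nn.ReLU(True),
            # ngf*4 × 8 × 8 (layer elided in the original; standard DCGAN widths assumed)
            nn.ConvTranspose2d(ngf*4, ngf*2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf*2),
            nn.ReLU(True),
            # ngf*2 × 16 × 16 (layer elided in the original; standard DCGAN widths assumed)
            nn.ConvTranspose2d(ngf*2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # ngf × 32 × 32 → nc × 64 × 64
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.main(z)


class Discriminator(nn.Module):
    def __init__(self, ndf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            # input: nc × 64 × 64 (no BatchNorm on the first layer)
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, True),
            # ndf × 32 × 32
            nn.Conv2d(ndf, ndf*2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf*2),
            nn.LeakyReLU(0.2, True),
            # ndf*2 × 16 × 16 (layer elided in the original; standard DCGAN widths assumed)
            nn.Conv2d(ndf*2, ndf*4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf*4),
            nn.LeakyReLU(0.2, True),
            # ndf*4 × 8 × 8 (layer elided in the original; standard DCGAN widths assumed)
            nn.Conv2d(ndf*4, ndf*8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf*8),
            nn.LeakyReLU(0.2, True),
            # ndf*8 × 4 × 4 → one probability per image
            nn.Conv2d(ndf*8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.main(x).view(-1)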
```
### Training loop
```python
import torch
import torch.nn as nn

# Assumes the Generator / Discriminator classes defined above and a DataLoader
# `loader` yielding (image, label) batches of 64×64 images scaled to [-1, 1]
# (to match the Tanh output of G).
device = 'cuda'
G = Generator().to(device)
D = Discriminator().to(device)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
criterion = nn.BCELoss()

for real, _ in loader:  # one epoch
    real = real.to(device)
    bs = real.size(0)
    ones = torch.ones(bs, device=device)
    zeros = torch.zeros(bs, device=device)

    # D step: push D(real) toward 1 and D(fake) toward 0
    opt_d.zero_grad()
    loss_real = criterion(D(real), ones)
    z = torch.randn(bs, 100, 1, 1, device=device)
    fake = G(z)
    loss_fake = criterion(D(fake.detach()), zeros)  # detach: no gradient into G here
    (loss_real + loss_fake).backward()
    opt_d.step()

    # G step: push D(fake) toward 1
    opt_g.zero_grad()
    loss_g = criterion(D(fake), ones)
    loss_g.backward()
    opt_g.step()
```
### WGAN-GP (modern stable)
```python
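import torch

# Note: for WGAN-GP the critic D should output an unbounded score
# (no final Sigmoid) and should avoid BatchNorm.
def gradient_penalty(D, real, fake):
    bs = real.size(0)
    # Random points on the straight lines between real and fake samples
    alpha = torch.rand(bs, 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake.detach()).requires_grad_(True)
    d_interp = D(interp)
    grads = torch.autograd.grad(d_interp.sum(), interp, create_graph=True)[0]
    grad_norm = grads.reshape(bs, -1).norm(2, dim=1)
    return ((grad_norm - 1) ** 2).mean()  # penalize gradient norms away from 1

# D (critic) step, given batches `real` and `fake = G(z)`; 10 is the penalty weight
fake = fake.detach()  # no gradient into G during the critic update
loss_d = D(fake).mean() - D(real).mean() + 10 * gradient_penalty(D, real, fake)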
```
### Real-ESRGAN (super-resolution, modern application)
```python
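# Assumption: this follows the API of the ai-forever/Real-ESRGAN package, whose
# module is named `RealESRGAN`. The official xinntao `realesrgan` package
# exposes a different class, `RealESRGANer`.
import torch
from PIL import Image
from RealESRGAN import RealESRGAN

model = RealESRGAN(torch.device('cuda'), scale=4)
model.load_weights('weights/RealESRGAN_x4.pth', download=True)
img = Image.open('low_res.jpg').convert('RGB')
sr_img = model.predict(img)  # 4× upscaled PIL image
sr_img.save('high_res.jpg')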
```
### CycleGAN (unpaired image-to-image)
```python
# horse → zebra (no paired examples)
# G_ab: A → B, G_ba: B → A
# cycle loss: G_ba(G_ab(a)) ≈ a
```
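The cycle constraint in the comments above is commonly written as an L1 reconstruction loss in both directions, as in the CycleGAN paper. The notation is an assumption for illustration: $p_A$ and $p_B$ denote the image distributions of the two domains.

$$
\mathcal{L}_{\text{cyc}} = \mathbb{E}_{a \sim p_A}\big[\lVert G_{ba}(G_{ab}(a)) - a \rVert_1\big] + \mathbb{E}_{b \sim p_B}\big[\lVert G_{ab}(G_{ba}(b)) - b \rVert_1\big]
$$

This term is added, with a weighting factor, to the adversarial losses of both generators. It is what lets training proceed without paired examples.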
### StyleGAN inversion (modern)
```python
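import torch

# Find a latent z whose generated image matches a target image.
# Assumes `generator` maps a (1, 512) latent to an image tensor shaped like `image`.
def invert(image, generator, n_iters=1000, lr=0.01):
    z = torch.randn(1, 512, device=image.device, requires_grad=True)
    opt = torch.optim.SGD([z], lr=lr)  # plain gradient descent on z
    for _ in range(n_iters):
        opt.zero_grad()
        loss = (generator(z) - image).pow(2).mean()  # pixel-wise MSE
        loss.backward()
        opt.step()
    return z.detach()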
```
### FID evaluation
```python
from pytorch_fid import fid_score
fid = fid_score.calculate_fid_given_paths(
    ['./real_images/', './fake_images/'],
    batch_size=50,
    device='cuda',
    dims=2048,
)
print(f'FID: {fid:.2f}')  # lower is better
```
## Decision criteria
| Use case | Method |
|---|---|
| Photoreal generation | Diffusion (SDXL, Flux) |
| Face | StyleGAN3 |
| Super-resolution | Real-ESRGAN |
| Domain adaptation (unpaired) | CycleGAN |
| Sim2Real | CycleGAN / paired methods |
| Fast inference | GAN (faster than diffusion) |
| Quality + control | Diffusion + ControlNet |
**Default**: for modern generation, use diffusion. GANs remain useful for super-resolution and image-to-image translation.
## 🔗 Graph
- Parent: [[Generative-AI|Generative-Models]] · [[Deep-Learning]]
- Variants: [[Generative-Adversarial-Networks|GAN]] · [[StyleGAN]] · [[Pix2Pix]] · [[CycleGAN]]
- Applications: [[Domain-Adaptation]] · [[CV_Synthesis]]
- Adjacent: [[Diffusion-Models]] · [[Auto-Encoding]] · [[Stable-Diffusion]] · [[Deepfake-Technology]]
## 🤖 LLM usage
**When to use**: GAN history, fast generation, super-resolution, image-to-image translation.
**When not to use**: when the highest quality is required (use diffusion).
## ❌ Anti-patterns
- **Forcing DCGAN into production**: modern diffusion models give better results.
- **No mode collapse check**: G may end up producing a single output.
- **Training without WGAN-GP**: training can be unstable and fail.
- **Trusting FID alone**: check other metrics as well.
## 🧪 Verification / duplicates
- Verified (Goodfellow GAN, Radford DCGAN, Karras StyleGAN, Real-ESRGAN).
- Trust level: A.
- Related: [[Diffusion-Models]] · [[Auto-Encoding]] · [[Stable-Diffusion]] · [[Deepfake-Technology]] · [[CV_Synthesis]] · [[Bioenergetics]] (model collapse).
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: DCGAN architecture + GAN evolution + PyTorch / WGAN-GP / Real-ESRGAN code |