---
id: wiki-2026-0508-style-transfer
title: Style Transfer
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Neural Style Transfer, NST, Style Transfer in AI]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [style-transfer, neural-network, image-generation, diffusion]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: pytorch
---

# Style Transfer

## In one line
> **"The art of separating content from style."** Gatys et al. (2015) extracted style as Gram matrices over the VGG feature space and transferred it onto a content image; this is the seminal work. As of 2026, the mainstream is diffusion-based methods (IP-Adapter, ControlNet-style conditioning) plus Midjourney `--sref`.

## Core

### Origin (Gatys 2015)
- Uses the conv-layer activations of VGG-19 as the feature representation.
- **Content loss**: MSE between features at a high-level layer (conv4_2).
- **Style loss**: MSE between Gram matrices (feature correlations) computed at multiple layers.
- Optimization-based: runs gradient descent directly on the image pixels (slow, on the order of minutes per image).

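The two losses above can be written out as follows, in the notation of Gatys et al. Here $F^l$ and $P^l$ are the layer-$l$ feature maps of the generated and content images, $G^l$ and $A^l$ are the Gram matrices of the generated and style images, $N_l$ is the number of channels, and $M_l$ is the number of spatial positions. The layer weights $w_l$ and the trade-off weights $\alpha$, $\beta$ are free hyperparameters. An MSE-based implementation differs from these sums only by constant factors, which can be absorbed into the weights.

```latex
\begin{aligned}
\mathcal{L}_{\text{content}} &= \tfrac{1}{2} \sum_{i,j} \left( F^{l}_{ij} - P^{l}_{ij} \right)^{2} \\
G^{l}_{ij} &= \sum_{k} F^{l}_{ik} F^{l}_{jk} \\
\mathcal{L}_{\text{style}} &= \sum_{l} \frac{w_l}{4 N_l^{2} M_l^{2}} \sum_{i,j} \left( G^{l}_{ij} - A^{l}_{ij} \right)^{2} \\
\mathcal{L}_{\text{total}} &= \alpha \, \mathcal{L}_{\text{content}} + \beta \, \mathcal{L}_{\text{style}}
\end{aligned}
```
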
### Evolution
- **Fast NST** (Johnson 2016): a feedforward network, trained per style, that stylizes in a single forward pass.
- **AdaIN** (Huang 2017): Adaptive Instance Normalization; arbitrary styles in real time.
- **Diffusion-based** (2023+): IP-Adapter, ControlNet; zero-shot transfer from a prompt plus a reference image.

### Applications
1. Artistic image generation (Prisma, DeepArt; now historical).
2. Midjourney `--sref` / `--cref`: the mainstream creative tool.
3. Video stylization (Runway, Kaiber).
4. Domain adaptation (synthetic → real).

## 💻 Patterns

### Gram matrix (style representation)
```python
import torch

def gram_matrix(features):
    # features: (batch, channels, height, width) activations from one layer
    b, c, h, w = features.shape
    feat = features.reshape(b, c, h * w)
    # Channel-by-channel correlations: one (c, c) matrix per batch item
    gram = torch.bmm(feat, feat.transpose(1, 2))
    # Normalize by element count so the scale is comparable across layers
    return gram / (c * h * w)
```

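A small check, offered as a sketch rather than part of the original note, of why the Gram matrix captures style and not layout: shuffling the spatial positions of a feature map leaves the Gram matrix unchanged. The random tensor stands in for real VGG activations, and the sizes are arbitrary assumptions.

```python
import torch

def gram_matrix(features):
    # Compact copy of the gram_matrix pattern so this snippet runs on its own
    b, c, h, w = features.shape
    feat = features.reshape(b, c, h * w)
    return torch.bmm(feat, feat.transpose(1, 2)) / (c * h * w)

torch.manual_seed(0)
feats = torch.randn(1, 8, 16, 16)  # stand-in for one layer's activations

# Shuffle the spatial positions identically for every channel
perm = torch.randperm(16 * 16)
shuffled = feats.reshape(1, 8, -1)[:, :, perm].reshape(1, 8, 16, 16)

g_original = gram_matrix(feats)
g_shuffled = gram_matrix(shuffled)

print(g_original.shape)  # one (channels, channels) matrix per image
print(torch.allclose(g_original, g_shuffled, atol=1e-5))  # spatial layout is discarded
```

Because the spatial arrangement is summed out, the style loss constrains texture statistics only, and the content loss is what keeps the structure of the image.
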
### AdaIN
```python
def adain(content_feat, style_feat, eps=1e-5):
    # Per-channel statistics over the spatial dimensions (H, W)
    c_mean = content_feat.mean(dim=[2, 3], keepdim=True)
    c_std = content_feat.std(dim=[2, 3], keepdim=True) + eps
    s_mean = style_feat.mean(dim=[2, 3], keepdim=True)
    s_std = style_feat.std(dim=[2, 3], keepdim=True) + eps
    # Whiten the content features, then re-color them with the style statistics
    normalized = (content_feat - c_mean) / c_std
    return normalized * s_std + s_mean
```

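To illustrate what AdaIN guarantees, the following sketch checks that the output's per-channel mean and standard deviation match those of the style features. The random tensors and their shapes are assumptions standing in for encoder activations.

```python
import torch

def adain(content_feat, style_feat, eps=1e-5):
    # Compact copy of the AdaIN pattern so this snippet runs on its own
    c_mean = content_feat.mean(dim=[2, 3], keepdim=True)
    c_std = content_feat.std(dim=[2, 3], keepdim=True) + eps
    s_mean = style_feat.mean(dim=[2, 3], keepdim=True)
    s_std = style_feat.std(dim=[2, 3], keepdim=True) + eps
    return (content_feat - c_mean) / c_std * s_std + s_mean

torch.manual_seed(0)
content = torch.randn(2, 4, 32, 32)
style = torch.randn(2, 4, 32, 32) * 3.0 + 1.5  # deliberately different statistics

out = adain(content, style)

# Largest per-channel deviation between output and style statistics
mean_gap = (out.mean(dim=[2, 3]) - style.mean(dim=[2, 3])).abs().max().item()
std_gap = (out.std(dim=[2, 3]) - style.std(dim=[2, 3])).abs().max().item()
print(f"max mean gap: {mean_gap:.2e}, max std gap: {std_gap:.2e}")
```

Both gaps should be close to zero (up to the `eps` term), while the spatial pattern of `out` still comes from `content`.
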
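### Gatys optimization (full)
```python
import torch.nn.functional as F
import torch.optim as optim
from torchvision.models import vgg19, VGG19_Weights

# Assumed to be defined elsewhere: content_img (a preprocessed image tensor on
# the GPU), extract_features (returns a dict of layer name -> activation),
# style_layers, and the precomputed, detached content_feats / style_feats.
vgg = vgg19(weights=VGG19_Weights.DEFAULT).features.eval().cuda()
for p in vgg.parameters():
    p.requires_grad_(False)  # only the image is optimized, not the network

# Start from the content image and optimize its pixels directly
target = content_img.clone().requires_grad_(True)
optimizer = optim.LBFGS([target])

def closure():
    # LBFGS re-evaluates the loss several times per step, hence the closure
    optimizer.zero_grad()
    feats = extract_features(target, vgg)
    c_loss = F.mse_loss(feats['content'], content_feats['content'])
    s_loss = sum(F.mse_loss(gram_matrix(feats[l]), gram_matrix(style_feats[l]))
                 for l in style_layers)
    loss = c_loss + 1e6 * s_loss
    loss.backward()
    return loss

for _ in range(300):
    optimizer.step(closure)
```
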
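The loop above relies on an `extract_features` helper that the note does not define. One possible shape for it is sketched below, assuming the model is an `nn.Sequential` like torchvision's `vgg19().features`. The index-to-name mapping and the `STYLE_LAYERS` list are illustrative assumptions; index 21 (conv4_2) is mapped to the `'content'` key to match the loop. A tiny stand-in network is used so the snippet runs without downloading VGG weights.

```python
import torch
import torch.nn as nn

# Assumed mapping from indices in torchvision's vgg19().features to layer names
VGG19_LAYERS = {0: 'conv1_1', 5: 'conv2_1', 10: 'conv3_1',
                19: 'conv4_1', 21: 'content', 28: 'conv5_1'}
STYLE_LAYERS = ['conv1_1', 'conv2_1', 'conv3_1', 'conv4_1', 'conv5_1']

def extract_features(x, model, layers=VGG19_LAYERS):
    """Run x through a sequential model and collect the requested activations."""
    feats = {}
    for idx, layer in enumerate(model):
        x = layer(x)
        if idx in layers:
            feats[layers[idx]] = x
        if len(feats) == len(layers):
            break  # no need to run the remaining layers
    return feats

# Tiny stand-in network (hypothetical) in place of VGG-19
toy = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
)
toy_layers = {0: 'style_a', 2: 'content'}
toy_feats = extract_features(torch.randn(1, 3, 32, 32), toy, toy_layers)
print({name: tuple(f.shape) for name, f in toy_feats.items()})
```

With the real network, `content_feats` and `style_feats` would be computed once under `torch.no_grad()` from the content and style images, so that only `target` carries gradients.
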
### IP-Adapter (diffusion-based, 2024+)
```python
from diffusers import StableDiffusionPipeline
from ip_adapter import IPAdapter
from PIL import Image

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to("cuda")
# Arguments: pipeline, image encoder path, adapter checkpoint path, device.
# Both paths depend on where the IP-Adapter weights were downloaded.
ip_model = IPAdapter(pipe, "h94/IP-Adapter", "models/ip-adapter_sd15.bin", "cuda")

style_ref = Image.open("vangogh.jpg")
# scale sets how strongly the reference image weighs against the text prompt
images = ip_model.generate(pil_image=style_ref, prompt="a cat", num_samples=4, scale=0.7)
```

### Midjourney --sref (2026 mainstream)
```
/imagine prompt: a serene lake at sunset --sref https://example.com/style.jpg --sw 100
```

## Decision criteria
| Situation | Approach |
|---|---|
| One-off art experiment | Gatys (simple to implement; slow is acceptable) |
| Real-time / video | AdaIN, fast NST |
| Production creative work | IP-Adapter + SDXL / FLUX |
| Non-coder creative work | Midjourney `--sref` |
| Controllable structure + style | ControlNet + IP-Adapter combo |

**Default**: as of 2026, IP-Adapter (open) or Midjourney `--sref` (closed).

## 🔗 Graph
- Parent: [[Generative-AI]] · [[Computer Vision|Computer-Vision]]
- Variants: [[Style_Reference_(--sref)]] · [[ControlNet]] · [[IP-Adapter]]
- Applications: [[AI 이미지 생성 (AI Image Generation)|Image-Generation]]
- Adjacent: [[Diffusion-Models]]

## 🤖 LLM usage
**When**: style consistency in a creative pipeline, generating brand asset variants, visual exploration for mood boards.
**When not**: photo retouching (use Lightroom), strict color grading (use LUTs), and anything that needs face identity preservation (unstable).

## ❌ Anti-patterns
- **Raising the style weight without limit**: the content disappears. Balance is essential (1e6 is typical).
- **Single VGG layer**: multi-scale style is lost. Aggregate over multiple layers.
- **Diffusion ignoring the prompt**: if the IP-Adapter scale is too high, the prompt is ignored. A scale of 0.5-0.8 is the sweet spot.

## 🧪 Verification / duplicates
- Verified (Gatys et al. 2015 "A Neural Algorithm of Artistic Style"; Huang & Belongie 2017 AdaIN; Ye et al. 2023 IP-Adapter).
- Trust level A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: Gatys → AdaIN → diffusion (IP-Adapter, --sref) coverage |