2nd/10_Wiki/Topics/AI_and_ML/Style-Transfer.md
Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization
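Large-scale cleanup of 10_Wiki/Topics:
- Removed 227 error-capture / unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirect)
- Link name normalization: ~2,400 broken-link fixes, direct redirect links, and concept mappings
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections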

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:52:15 +09:00


id: wiki-2026-0508-style-transfer
title: Style Transfer
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Neural Style Transfer, NST, Style Transfer in AI
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: style-transfer, neural-network, image-generation, diffusion
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: python, framework: pytorch

Style Transfer

One-line summary

"The art of separating content from style." Gatys et al. (2015) is the seminal work: it extracts style as Gram matrices in the VGG feature space and transfers that style onto a content image. As of 2026, the mainstream is diffusion-based (IP-Adapter, ControlNet-style conditioning) together with Midjourney --sref.

Key points

Origin (Gatys 2015)

  • Feature representation: activations of the VGG-19 conv layers.
  • Content loss: MSE between features at a high-level layer (conv4_2).
  • Style loss: MSE between Gram matrices (feature correlations) at multiple layers.
  • Optimization-based: gradient descent runs on the image pixels themselves (slow, roughly minutes per image).
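The losses above can be written out as follows. This is a sketch in the MSE form used in this note; the symbols are notation assumed here ($x$ the generated image, $c$ the content image, $s$ the style image, $F^l$ the layer-$l$ feature map reshaped to channels × positions, $w_l$ per-layer weights, $\alpha$ and $\beta$ the content and style weights), and the normalization constants in Gatys et al. differ slightly.

```latex
\begin{aligned}
G^l_{ij}(x) &= \sum_{k} F^l_{ik}(x)\, F^l_{jk}(x) \\
\mathcal{L}_{\text{content}} &= \operatorname{MSE}\!\left(F^{\text{conv4\_2}}(x),\; F^{\text{conv4\_2}}(c)\right) \\
\mathcal{L}_{\text{style}} &= \sum_{l} w_l \,\operatorname{MSE}\!\left(G^l(x),\; G^l(s)\right) \\
\mathcal{L}_{\text{total}} &= \alpha\,\mathcal{L}_{\text{content}} + \beta\,\mathcal{L}_{\text{style}}
\end{aligned}
```

The sum over $k$ runs over spatial positions, which is why the Gram matrix keeps channel co-occurrence statistics but discards layout.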

Evolution

  • Fast NST (Johnson 2016): a feedforward network stylizes an image in a single forward pass.
  • AdaIN (Huang 2017): Adaptive Instance Normalization, which enables arbitrary styles in real time.
  • Diffusion-based (2023+): IP-Adapter, ControlNet. Zero-shot stylization from a prompt plus a reference image.
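To make the "single forward pass" idea concrete, here is a minimal sketch of a feedforward transform network in the spirit of Fast NST. The class names, layer widths, block count, and the use of instance normalization are illustrative assumptions, not the exact architecture from Johnson et al.; training against a perceptual (content + style) loss is omitted.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Two 3x3 convs with instance norm and a skip connection."""

    def __init__(self, channels):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.InstanceNorm2d(channels, affine=True),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.InstanceNorm2d(channels, affine=True),
        )

    def forward(self, x):
        return x + self.block(x)


class TransformNet(nn.Module):
    """Small encoder / residual / decoder network (hypothetical sizes)."""

    def __init__(self, width=32, num_blocks=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, width, 9, padding=4),
            nn.InstanceNorm2d(width, affine=True),
            nn.ReLU(inplace=True),
            # Stride-2 conv halves the spatial resolution.
            nn.Conv2d(width, width * 2, 3, stride=2, padding=1),
            nn.InstanceNorm2d(width * 2, affine=True),
            nn.ReLU(inplace=True),
        )
        self.blocks = nn.Sequential(
            *[ResidualBlock(width * 2) for _ in range(num_blocks)]
        )
        self.decoder = nn.Sequential(
            # Upsample back to the input resolution.
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(width * 2, width, 3, padding=1),
            nn.InstanceNorm2d(width, affine=True),
            nn.ReLU(inplace=True),
            nn.Conv2d(width, 3, 9, padding=4),
        )

    def forward(self, x):
        return self.decoder(self.blocks(self.encoder(x)))


net = TransformNet().eval()
with torch.no_grad():
    # One forward pass per image; no per-image optimization loop.
    stylized = net(torch.rand(1, 3, 64, 64))
```

With untrained weights the output is noise. Once such a network has been trained for one style, inference costs a single forward pass, which is the trade-off against the Gatys approach.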

Applications

  1. Artistic image generation (Prisma, DeepArt; now historical).
  2. Midjourney --sref / --cref (style and character references): mainstream creative tools.
  3. Video stylization (Runway, Kaiber).
  4. Domain adaptation (synthetic → real).

💻 Patterns

Gram matrix (style representation)

import torch
import torch.nn as nn

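def gram_matrix(features):
    """Channel-by-channel feature correlations for a (B, C, H, W) tensor."""
    b, c, h, w = features.shape
    # Flatten the spatial dimensions; reshape also handles non-contiguous input.
    feat = features.reshape(b, c, h * w)
    gram = torch.bmm(feat, feat.transpose(1, 2))  # (B, C, C)
    # Normalize so the scale does not depend on the feature map size.
    return gram / (c * h * w)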
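A quick check of why the Gram matrix acts as a style descriptor rather than a layout descriptor: shuffling spatial positions leaves it unchanged. The sketch below repeats the function so it runs on its own; the tensor sizes and random features are arbitrary stand-ins for real VGG activations.

```python
import torch


def gram_matrix(features):
    b, c, h, w = features.shape
    feat = features.reshape(b, c, h * w)
    return torch.bmm(feat, feat.transpose(1, 2)) / (c * h * w)


torch.manual_seed(0)
features = torch.rand(1, 8, 16, 16)

# Shuffle the spatial positions identically for every channel.
perm = torch.randperm(16 * 16)
shuffled = features.reshape(1, 8, -1)[:, :, perm].reshape(1, 8, 16, 16)

# The Gram matrix sums over positions, so the ordering does not matter.
same_gram = torch.allclose(
    gram_matrix(features), gram_matrix(shuffled), atol=1e-5
)
```

Because only channel co-occurrence statistics survive, matching Gram matrices transfers texture and color relationships without copying the spatial arrangement of the style image.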

AdaIN

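def adain(content_feat, style_feat, eps=1e-5):
    """Re-normalize content features to the per-channel style statistics."""
    # Per-sample, per-channel statistics over the spatial dimensions.
    c_mean = content_feat.mean(dim=[2, 3], keepdim=True)
    c_std = content_feat.std(dim=[2, 3], keepdim=True) + eps
    s_mean = style_feat.mean(dim=[2, 3], keepdim=True)
    s_std = style_feat.std(dim=[2, 3], keepdim=True) + eps
    # Whiten with the content statistics, then re-color with the style statistics.
    normalized = (content_feat - c_mean) / c_std
    return normalized * s_std + s_mean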
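As a sanity check on the function above, the output should carry the style features' per-channel mean and standard deviation. The sketch below repeats `adain` so it runs standalone, and adds a hypothetical `adain_blend` helper for the content/style interpolation that Huang and Belongie describe; the random tensors stand in for encoder features.

```python
import torch


def adain(content_feat, style_feat, eps=1e-5):
    c_mean = content_feat.mean(dim=[2, 3], keepdim=True)
    c_std = content_feat.std(dim=[2, 3], keepdim=True) + eps
    s_mean = style_feat.mean(dim=[2, 3], keepdim=True)
    s_std = style_feat.std(dim=[2, 3], keepdim=True) + eps
    normalized = (content_feat - c_mean) / c_std
    return normalized * s_std + s_mean


def adain_blend(content_feat, style_feat, alpha=1.0):
    """Hypothetical helper: alpha=0 keeps the content, alpha=1 is full AdaIN."""
    return alpha * adain(content_feat, style_feat) + (1 - alpha) * content_feat


torch.manual_seed(0)
content = torch.randn(2, 4, 32, 32)
style = 3.0 * torch.randn(2, 4, 32, 32) + 1.5

out = adain(content, style)

# Per-channel statistics of the output follow the style features.
mean_matches = torch.allclose(
    out.mean(dim=[2, 3]), style.mean(dim=[2, 3]), atol=1e-3
)
std_matches = torch.allclose(
    out.std(dim=[2, 3]), style.std(dim=[2, 3]), atol=1e-3
)
```

In a full pipeline the blended features would be passed to a trained decoder; `alpha` then gives a test-time control over stylization strength.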

Gatys optimization (full)

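import torch.nn.functional as F
import torch.optim as optim
from torchvision.models import vgg19, VGG19_Weights

# Assumed to exist already: content_img (1x3xHxW tensor on the GPU),
# extract_features(), style_layers, and the precomputed, detached
# content_feats / style_feats dictionaries.
vgg = vgg19(weights=VGG19_Weights.DEFAULT).features.eval().cuda()
for p in vgg.parameters():
    p.requires_grad_(False)  # only the image is optimized, not the network

target = content_img.clone().requires_grad_(True)
optimizer = optim.LBFGS([target])

def closure():
    optimizer.zero_grad()
    feats = extract_features(target, vgg)
    c_loss = F.mse_loss(feats['content'], content_feats['content'])
    s_loss = sum(F.mse_loss(gram_matrix(feats[l]), gram_matrix(style_feats[l]))
                 for l in style_layers)
    loss = c_loss + 1e6 * s_loss  # style weight; tune per image
    loss.backward()
    return loss

# LBFGS may re-evaluate the closure several times per step.
for _ in range(300):
    optimizer.step(closure)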
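The loop above calls an `extract_features` helper that this note does not define. One possible sketch follows, assuming the layer indices of torchvision's `vgg19().features` (conv1_1 at 0, conv2_1 at 5, conv3_1 at 10, conv4_1 at 19, conv4_2 at 21, conv5_1 at 28); the `VGG19_LAYERS` mapping and the choice of style layers are assumptions to verify against your model.

```python
import torch
import torch.nn as nn

# Assumed index -> name mapping for torchvision's vgg19().features.
VGG19_LAYERS = {
    0: "conv1_1",
    5: "conv2_1",
    10: "conv3_1",
    19: "conv4_1",
    21: "content",  # conv4_2, used for the content loss
    28: "conv5_1",
}
style_layers = ["conv1_1", "conv2_1", "conv3_1", "conv4_1", "conv5_1"]


def extract_features(image, model, layers=VGG19_LAYERS):
    """Run image through a sequential model and collect selected activations."""
    feats = {}
    x = image
    for idx, layer in enumerate(model):
        # Use a non-in-place ReLU so stored activations are not overwritten.
        if isinstance(layer, nn.ReLU):
            layer = nn.ReLU(inplace=False)
        x = layer(x)
        if idx in layers:
            feats[layers[idx]] = x
    return feats


# Small stand-in network so the helper can be exercised without VGG weights.
torch.manual_seed(0)
toy_model = nn.Sequential(
    nn.Conv2d(3, 4, 3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(4, 4, 3, padding=1),
)
toy_feats = extract_features(
    torch.rand(1, 3, 8, 8), toy_model, layers={0: "first", 2: "content"}
)
```

For the optimization loop, `content_feats` and `style_feats` would typically be computed once under `torch.no_grad()` from the content and style images, so that only `target` carries gradients.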

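IP-Adapter (diffusion-based, 2023+)

from PIL import Image
from diffusers import StableDiffusionPipeline
from ip_adapter import IPAdapter  # from the tencent-ailab/IP-Adapter repository

# If this model id no longer resolves on the Hub, substitute a mirror of SD 1.5.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to("cuda")
# The second argument is the CLIP image encoder path. Both local paths are
# assumptions that follow the layout of the h94/IP-Adapter weights.
ip_model = IPAdapter(pipe, "models/image_encoder/", "models/ip-adapter_sd15.bin", "cuda")

style_ref = Image.open("vangogh.jpg")
# scale controls how strongly the reference image overrides the text prompt.
images = ip_model.generate(pil_image=style_ref, prompt="a cat", num_samples=4, scale=0.7)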

Midjourney --sref (2026 mainstream)

/imagine prompt: a serene lake at sunset --sref https://example.com/style.jpg --sw 100

Decision criteria

| Situation | Approach |
| --- | --- |
| One-off art experiment | Gatys (simple to implement, slow is acceptable) |
| Real-time / video | AdaIN, Fast NST |
| Production creative work | IP-Adapter + SDXL / FLUX |
| Non-coder creative work | Midjourney --sref |
| Controllable structure + style | ControlNet + IP-Adapter combination |

Default: as of 2026, IP-Adapter (open) or Midjourney --sref (closed).

🔗 Graph

🤖 LLM usage

When to use: style consistency in a creative pipeline, generating brand asset variants, visual exploration for mood boards.
When not to use: photo retouching (use Lightroom), strict color grading (use LUTs), face identity preservation (results are unstable).

Anti-patterns

  • Increasing the style weight without limit: the content disappears. Keep the two losses balanced (1e6 is a typical style weight).
  • Using a single VGG layer: multi-scale style is lost. Aggregate over multiple layers.
  • Ignoring the prompt in diffusion: if the IP-Adapter scale is too high, the prompt is ignored. A scale of 0.5-0.8 is the sweet spot.
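The first two anti-patterns can be sketched as a style loss that aggregates several layers and a total loss with an explicit style weight. The function names, layer names, uniform per-layer weights, and random feature tensors are all illustrative assumptions; real features would come from VGG.

```python
import torch
import torch.nn.functional as F


def gram_matrix(features):
    b, c, h, w = features.shape
    feat = features.reshape(b, c, h * w)
    return torch.bmm(feat, feat.transpose(1, 2)) / (c * h * w)


def style_loss(gen_feats, style_feats, layer_weights):
    """Weighted sum of per-layer Gram-matrix MSE terms."""
    return sum(
        weight * F.mse_loss(gram_matrix(gen_feats[name]),
                            gram_matrix(style_feats[name]))
        for name, weight in layer_weights.items()
    )


def total_loss(content_term, style_term, style_weight=1e6):
    """Combine the two terms; style_weight sets the content/style balance."""
    return content_term + style_weight * style_term


torch.manual_seed(0)
# Hypothetical feature shapes: shallow layers are large, deep layers are small.
shapes = {
    "conv1_1": (1, 8, 32, 32),
    "conv2_1": (1, 16, 16, 16),
    "conv3_1": (1, 32, 8, 8),
}
style_feats = {name: torch.rand(shape) for name, shape in shapes.items()}
gen_feats = {name: torch.rand(shape) for name, shape in shapes.items()}
weights = {name: 1.0 / len(shapes) for name in shapes}

loss_same = style_loss(style_feats, style_feats, weights)  # identical inputs
loss_diff = style_loss(gen_feats, style_feats, weights)    # different inputs
```

Because the Gram matrices here are normalized by feature map size, the raw style term tends to be numerically small, which is one reason large style weights are common. The right value still depends on the normalization and on the images.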

🧪 Verification / duplicates

  • Verified (Gatys et al. 2015 "A Neural Algorithm of Artistic Style"; Huang & Belongie 2017 AdaIN; Ye et al. 2023 IP-Adapter).
  • Trust level: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: Gatys → AdaIN → diffusion (IP-Adapter, --sref) coverage |