| id | title | category | status | canonical_id | aliases | duplicate_of | source_trust_level | confidence_score | verification_status | tags | raw_sources | last_reinforced | github_commit | tech_stack |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| wiki-2026-0508-초상화-및-애니메이션-스타일-제어 | Portrait and Animation Style Control | 10_Wiki/Topics | verified | self | | none | A | 0.9 | applied | | | 2026-05-10 | pending | |
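Portrait and Animation Style Control
In one line
"Orthogonal separation of identity preservation and style transformation." The core challenge of the portrait/animation domain: the same person must look the same in every frame (identity), while style and pose stay freely controllable (control). The 2026 answer is a stack of InstantID / PuLID (identity) + IP-Adapter (style) + ControlNet pose (motion).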
Core ideas
Three-axis separation
- Identity axis: locked with a face embedding (ArcFace): InstantID, PuLID
- Style axis: modulated with a reference image embedding: IP-Adapter
- Motion axis: frame structure driven by pose / depth: OpenPose / DWPose
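One way to keep the three axes independent in practice is to store each as its own field, so that varying one never touches the others. The sketch below is illustrative only: `ControlSpec`, its field names, and the file names are hypothetical and do not come from any of the libraries named here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ControlSpec:
    """Hypothetical container with one field per control axis."""
    identity_ref: str            # face image used to compute the ArcFace embedding
    style_ref: str               # reference image for IP-Adapter style
    pose_ref: str                # pose or depth map for ControlNet
    identity_scale: float = 0.8  # illustrative default
    style_scale: float = 0.4     # illustrative default


base = ControlSpec("ref.jpg", "watercolor.png", "front_pose.png")

# Change only the style axis; identity and motion stay untouched.
restyled = replace(base, style_ref="oil_paint.png")
```

Because the dataclass is frozen, every variation is a new object, which makes it easy to log exactly which axis changed between two renders.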
Animation consistency techniques
- Reference frame: apply IP-Adapter with the first frame as the anchor
- Temporal module: the AnimateDiff motion module (optionally with motion LoRAs) for inter-frame coherence
- Latent warp: warp the previous frame's latent with optical flow, then add noise
- Cross-frame attention: share attention keys/values across frames
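The latent-warp item can be illustrated with a toy example. This is a minimal sketch under simplifying assumptions: the latent is a plain 2D grid of floats, the flow is an integer backward-lookup offset per cell, and sampling is nearest-neighbor with edge clamping. Real pipelines warp multi-channel tensors with sub-pixel flow; `warp_latent` is a hypothetical helper, not a library function.

```python
import random


def warp_latent(prev, flow, noise_std=0.1, seed=0):
    """Warp a 2D latent grid by an integer flow field, then add Gaussian noise.

    prev: H x W list of floats (previous frame's latent).
    flow: H x W list of (dy, dx) offsets; each output cell samples
          prev[y + dy][x + dx], clamped to the grid.
    """
    rng = random.Random(seed)
    h, w = len(prev), len(prev[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dy, dx = flow[y][x]
            sy = min(max(y + dy, 0), h - 1)  # clamp to valid rows
            sx = min(max(x + dx, 0), w - 1)  # clamp to valid columns
            row.append(prev[sy][sx] + rng.gauss(0.0, noise_std))
        out.append(row)
    return out


prev = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
shift_right = [[(0, -1)] * 3 for _ in range(2)]  # content moves one cell to the right
warped = warp_latent(prev, shift_right, noise_std=0.0)
```

With `noise_std=0.0` the result is a pure warp; raising it re-injects noise so the next denoising pass can still refine the frame instead of copying it.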
Applications
- Avatar / VTuber pipeline: same face × multi-emotion × multi-outfit.
- Character sheet generation: turnaround (front/side/back).
- Short animation: an 8-frame walk cycle for a character.
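The "same face × multi-emotion × multi-outfit" grid is a Cartesian product of prompt fragments rendered with one shared identity embedding. A minimal sketch, assuming a hypothetical `build_prompt_grid` helper and illustrative prompt wording:

```python
from itertools import product


def build_prompt_grid(emotions, outfits, base="portrait"):
    """Return one job dict per (emotion, outfit) pair; identity is supplied separately."""
    jobs = []
    for emotion, outfit in product(emotions, outfits):
        jobs.append({
            "emotion": emotion,
            "outfit": outfit,
            "prompt": f"{base}, {emotion} expression, wearing {outfit}",
        })
    return jobs


jobs = build_prompt_grid(["smiling", "angry"], ["school uniform", "hoodie"])
```

Each job would then be passed to the generation pipeline with the same face embedding and seed, so that only the prompt varies.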
💻 Patterns
InstantID portrait generation
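import cv2
import numpy as np
import torch
from diffusers.models import ControlNetModel
from insightface.app import FaceAnalysis
from PIL import Image
# Not part of diffusers: this module ships with the InstantID repository.
from pipeline_stable_diffusion_xl_instantid import (
    StableDiffusionXLInstantIDPipeline,
    draw_kps,
)

face_app = FaceAnalysis(name="antelopev2", providers=["CUDAExecutionProvider"])
face_app.prepare(ctx_id=0, det_size=(640, 640))

face_img = Image.open("ref.jpg").convert("RGB")
face_info = face_app.get(cv2.cvtColor(np.array(face_img), cv2.COLOR_RGB2BGR))[0]
face_emb = face_info["embedding"]
face_kps = draw_kps(face_img, face_info["kps"])  # face keypoint image for the ControlNet

# Assumed local path to the InstantID ControlNet (IdentityNet) weights.
instantid_controlnet = ControlNetModel.from_pretrained(
    "./checkpoints/ControlNetModel", torch_dtype=torch.float16
)
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=instantid_controlnet,
    torch_dtype=torch.float16,
).to("cuda")
pipe.load_ip_adapter_instantid("instantid_ip-adapter.bin")

out = pipe(
    prompt="anime portrait, school uniform",
    image_embeds=face_emb,
    image=face_kps,
    ip_adapter_scale=0.8,
    controlnet_conditioning_scale=0.8,
    num_inference_steps=30,
).images[0]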
PuLID identity preservation
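# Module path and argument names follow this note; check them against the
# PuLID release you install, since signatures may differ between versions.
from pulid.pipeline_v1_1 import PuLIDPipeline

pulid = PuLIDPipeline()
id_emb = pulid.get_id_embedding(["ref1.jpg", "ref2.jpg"])
img = pulid.inference(
    prompt="cyberpunk character, neon city",
    id_embedding=id_emb,
    id_scale=0.9,
    cfg_scale=1.2,
    steps=4,  # few-step sampling (SDXL Lightning)
)[0]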
IP-Adapter style + face combined
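# `pipe` is assumed to be a plain SDXL pipeline loaded with the ViT-H image
# encoder that the "vit-h" IP-Adapter weights expect.
pipe.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="sdxl_models",
    weight_name=[
        "ip-adapter-plus-face_sdxl_vit-h.safetensors",
        "ip-adapter-plus_sdxl_vit-h.safetensors",
    ],
)
pipe.set_ip_adapter_scale([0.7, 0.4])  # face weighted more strongly than style
img = pipe(
    prompt="portrait, watercolor style",
    ip_adapter_image=[face_ref, style_ref],  # same order as weight_name
).images[0]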
AnimateDiff motion generation
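import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter

model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-3", torch_dtype=torch.float16
)
pipe = AnimateDiffPipeline.from_pretrained(
    model_id,
    motion_adapter=adapter,
    torch_dtype=torch.float16,
).to("cuda")
# DDIM settings as used in the diffusers AnimateDiff example.
pipe.scheduler = DDIMScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
    clip_sample=False,
    timestep_spacing="linspace",
    beta_schedule="linear",
    steps_offset=1,
)
frames = pipe(
    prompt="character walking, side view",
    num_frames=16,
    num_inference_steps=25,
    guidance_scale=7.5,
).frames[0]  # list of PIL images, one per frame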
Cross-frame attention sharing
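import torch
from torch.nn.functional import scaled_dot_product_attention


def cross_frame_attention(self, x, prev_kv=None):
    # Method of an attention module that defines to_q, to_k, to_v and to_out.
    q = self.to_q(x)
    k, v = self.to_k(x), self.to_v(x)
    # Cache only this frame's keys/values so the cache does not grow per frame.
    cur_kv = {"k": k, "v": v}
    if prev_kv is not None:
        # Prepend the previous frame's keys/values along the token axis.
        k = torch.cat([prev_kv["k"], k], dim=1)
        v = torch.cat([prev_kv["v"], v], dim=1)
    out = scaled_dot_product_attention(q, k, v)
    return self.to_out(out), cur_kv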
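To see why sharing keys/values pulls frames toward each other, the toy example below runs single-query attention in plain Python, first over the current frame's tokens only and then with an anchor frame's tokens prepended. It is a sketch with made-up two-dimensional vectors, not the tensor implementation; `attend` is a hypothetical helper.

```python
import math


def attend(q, keys, values):
    """Single-query scaled dot-product attention over lists of vectors."""
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    out = [sum(w * v[i] for w, v in zip(weights, values))
           for i in range(len(values[0]))]
    return out, weights


anchor_k, anchor_v = [[1.0, 0.0]], [[10.0, 0.0]]  # token from the anchor frame
cur_k, cur_v = [[0.0, 1.0]], [[0.0, 10.0]]        # token from the current frame
q = [1.0, 0.0]                                    # query that matches the anchor key

alone, _ = attend(q, cur_k, cur_v)
shared, weights = attend(q, anchor_k + cur_k, anchor_v + cur_v)
```

With only the current frame available, the query has no choice but to read the current value. Once the anchor's key is visible, most of the attention weight moves to the anchor token, which is the mechanism that carries appearance across frames.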
Turnaround sheet (multi-pose)
poses = ["front view", "3/4 view", "side view", "back view"]
turnaround = []
for pose in poses:
    img = pipe(
        prompt=f"character portrait, {pose}, neutral expression",
        image_embeds=face_emb,
        image=pose_skeleton[pose],  # per-view skeleton images, prepared beforehand
        controlnet_conditioning_scale=0.9,
        generator=torch.Generator("cuda").manual_seed(42),  # fixed seed for every view
    ).images[0]
    turnaround.append(img)
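The turnaround loop indexes one skeleton image per view, and those skeletons drift apart visually if they were drawn at different scales. A minimal sketch of normalizing raw keypoints before rendering them, assuming 2D `(x, y)` keypoints and a hypothetical `normalize_keypoints` helper (canvas size and margin are illustrative):

```python
def normalize_keypoints(points, canvas=(1024, 1024), margin=0.1):
    """Scale and center (x, y) keypoints so the skeleton fills the canvas minus a margin."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    width = max(max(xs) - min_x, 1e-6)   # avoid division by zero
    height = max(max(ys) - min_y, 1e-6)
    cw, ch = canvas
    # Uniform scale keeps the skeleton's aspect ratio.
    scale = min(cw * (1 - 2 * margin) / width, ch * (1 - 2 * margin) / height)
    off_x = (cw - width * scale) / 2
    off_y = (ch - height * scale) / 2
    return [((x - min_x) * scale + off_x, (y - min_y) * scale + off_y)
            for x, y in points]


small = normalize_keypoints([(10, 10), (20, 30)])
large = normalize_keypoints([(100, 100), (200, 300)])
```

Both inputs describe the same proportions at different scales, so they normalize to the same canvas coordinates; every view of the sheet then occupies the same area of the frame.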
Emotion variation with locked identity
emotions = ["smiling", "angry", "surprised", "sad", "neutral"]
for emo in emotions:
    img = pipe(
        prompt=f"portrait, {emo} expression",
        image_embeds=face_emb,
        image=face_kps,  # the InstantID pipeline also needs the keypoint image
        ip_adapter_scale=0.85,  # keep identity strong
        guidance_scale=4.5,
        generator=torch.Generator("cuda").manual_seed(7),
    ).images[0]
    img.save(f"emo_{emo}.png")
Decision criteria
| Goal | Combination |
|---|---|
| Highest face fidelity | PuLID + InstantID + IP-Adapter Face |
| Style transfer with face | IP-Adapter Face (0.8) + IP-Adapter Style (0.4) |
| Animation, single character | AnimateDiff + reference attention + IP-Adapter |
| Game character sheet | InstantID + ControlNet pose × 4 with shared seed |
| Real-time avatar | SDXL Lightning / FLUX Schnell + cached identity emb |
Default: InstantID + IP-Adapter (style 0.4, face 0.7) + a fixed seed per batch.
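The decision table can be encoded as a lookup so that batch scripts pick a stack by goal and fall back to the default. This is only a sketch: `STACKS`, `DEFAULT_STACK`, and `pick_stack` are hypothetical names, and the entries simply restate the table rows.

```python
STACKS = {
    "highest face fidelity": ["PuLID", "InstantID", "IP-Adapter Face"],
    "style transfer with face": ["IP-Adapter Face (0.8)", "IP-Adapter Style (0.4)"],
    "animation, single character": ["AnimateDiff", "reference attention", "IP-Adapter"],
    "game character sheet": ["InstantID", "ControlNet pose x 4", "shared seed"],
    "real-time avatar": ["SDXL Lightning / FLUX Schnell", "cached identity embedding"],
}

# Default row: InstantID + IP-Adapter (style 0.4, face 0.7) + fixed seed.
DEFAULT_STACK = ["InstantID", "IP-Adapter (style 0.4, face 0.7)", "fixed seed"]


def pick_stack(goal):
    """Return the component list for a goal, or the default stack if unknown."""
    return STACKS.get(goal.strip().lower(), DEFAULT_STACK)


sheet_stack = pick_stack("Game character sheet")
fallback = pick_stack("something else")
```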
🔗 Graph
- Parent: Image Generation and Control Pipeline · AI Image Generation
- Variants: ComfyUI · InstantID · PuLID
- Applications: Avatar_Pipeline · AI Model Post-editing Tools
- Adjacent: AnimateDiff · IP-Adapter · ControlNet
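🤖 LLM usage
When to use: generating emotion / pose prompt variations, drafting a character sheet plan, extracting style descriptions. When not to use: reasoning about the internal space of face embeddings, which is geometric and outside what an LLM can handle.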
❌ Anti-patterns
- No fixed seed in batch: the face drifts from one turnaround view to the next.
- IP-Adapter scale > 1.0: the prompt is ignored and the reference is over-copied.
- Identity + style conflict: giving both the same weight blurs the identity.
- Missing pose normalization: the pose skeleton's scale does not match the prompt.
- AnimateDiff w/o reference: flicker, with no frame-to-frame consistency.
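Several of these anti-patterns can be caught before any GPU time is spent. The checker below is a minimal sketch over a plain config dict; the key names (`seed`, `face_scale`, `style_scale`, `animated`, `reference_frame`) and the `lint_config` helper are hypothetical and not tied to any library.

```python
def lint_config(cfg):
    """Return a list of anti-pattern warnings for a generation config dict."""
    issues = []
    if cfg.get("seed") is None:
        issues.append("no fixed seed: faces may drift across the batch")
    face = cfg.get("face_scale")
    style = cfg.get("style_scale")
    for name, scale in (("face_scale", face), ("style_scale", style)):
        if scale is not None and scale > 1.0:
            issues.append(f"{name} > 1.0: reference may override the prompt")
    if face is not None and face == style:
        issues.append("identity and style share the same weight: identity may blur")
    if cfg.get("animated") and not cfg.get("reference_frame"):
        issues.append("animation without a reference frame: expect flicker")
    return issues


bad = lint_config({"seed": None, "face_scale": 1.2, "style_scale": 1.2, "animated": True})
good = lint_config({"seed": 42, "face_scale": 0.7, "style_scale": 0.4})
```

Pose-scale mismatches are not covered here because detecting them requires looking at the skeleton images rather than the config.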
🧪 Verification / duplicates
- Verified (InstantX InstantID paper 2024, PuLID v1.1 release notes 2025, AnimateDiff v3).
- Trust level: A.
🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — portrait/animation identity+style control |