---
id: wiki-2026-0508-초상화-및-애니메이션-스타일-제어
title: Portrait and Animation Style Control
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Portrait Style Control, Animation Style Control, Identity-Preserving Generation]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [diffusion, portrait, animation, identity, style-transfer]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: PyTorch/diffusers
---

# Portrait and Animation Style Control

## In one line

> **"Orthogonal separation of identity preservation and style transformation."** The core challenge in the portrait/animation domain: the same person must look the same in every frame (identity), while style and pose stay freely controllable (control). The 2026 answer is a stack of InstantID / PuLID (identity) + IP-Adapter (style) + ControlNet pose (motion).

## Core

### Three-axis separation

- **Identity axis**: locked with a face embedding (ArcFace): InstantID, PuLID
- **Style axis**: modulated with a reference image embedding: IP-Adapter
- **Motion axis**: frame structure driven by pose / depth: OpenPose / DWPose

### Animation consistency techniques

- **Reference frame**: apply IP-Adapter with the first frame as the anchor
- **Temporal LoRA**: inter-frame coherence from the AnimateDiff motion module
- **Latent warp**: warp the previous frame's latent with optical flow, then add noise
- **Cross-frame attention**: share attention keys/values across frames

### Applications

1. Avatar / VTuber pipeline: same face × multiple emotions × multiple outfits.
2. Character sheet generation: turnaround (front/side/back).
3. Short animation: an 8-frame walk cycle for one character.
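The identity axis can be checked after generation by comparing each frame's face embedding against the reference. The block below is a minimal sketch of that idea, not part of any library named in this note: `cosine_similarity` and `find_identity_drift` are hypothetical helper names, the embeddings are assumed to be plain float sequences (for example ArcFace vectors converted to lists), and the `0.5` threshold is an illustrative value that would need tuning per model.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def find_identity_drift(
    reference: Sequence[float],
    frame_embeddings: Sequence[Sequence[float]],
    threshold: float = 0.5,  # hypothetical cut-off, tune per face model
) -> List[int]:
    """Return indices of frames whose similarity to the reference is below threshold."""
    return [
        i
        for i, emb in enumerate(frame_embeddings)
        if cosine_similarity(reference, emb) < threshold
    ]


# Toy 3-D vectors standing in for real face embeddings.
ref = [1.0, 0.0, 0.0]
frames = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [2.0, 0.0, 0.0]]
print(find_identity_drift(ref, frames))  # [1]
```

Frames flagged this way are candidates for regeneration with a stronger identity scale or the same seed as the reference frame.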
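## 💻 Patterns

### InstantID portrait generation

```python
import cv2
import torch
from PIL import Image
from diffusers.models import ControlNetModel
from insightface.app import FaceAnalysis
# The pipeline class and draw_kps come from the InstantID repo file
# pipeline_stable_diffusion_xl_instantid.py, which must be on the import path.
from pipeline_stable_diffusion_xl_instantid import (
    StableDiffusionXLInstantIDPipeline,
    draw_kps,
)

face_app = FaceAnalysis(name="antelopev2", providers=["CUDAExecutionProvider"])
face_app.prepare(ctx_id=0, det_size=(640, 640))

face_img = cv2.imread("ref.jpg")        # BGR array, as insightface expects
face_info = face_app.get(face_img)[0]   # first detected face
face_emb = face_info["embedding"]       # identity embedding
# Render the facial keypoints as the ControlNet conditioning image.
face_pil = Image.fromarray(cv2.cvtColor(face_img, cv2.COLOR_BGR2RGB))
face_kps = draw_kps(face_pil, face_info["kps"])

# Assumed local path to the downloaded InstantID ControlNet weights.
instantid_controlnet = ControlNetModel.from_pretrained(
    "./checkpoints/ControlNetModel", torch_dtype=torch.float16
)
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=instantid_controlnet,
    torch_dtype=torch.float16,
).to("cuda")
pipe.load_ip_adapter_instantid("instantid_ip-adapter.bin")

out = pipe(
    prompt="anime portrait, school uniform",
    image_embeds=face_emb,
    image=face_kps,  # face keypoints
    ip_adapter_scale=0.8,
    controlnet_conditioning_scale=0.8,
    num_inference_steps=30,
).images[0]
```

### PuLID identity preservation

```python
from pulid.pipeline_v1_1 import PuLIDPipeline

# Argument names below follow this note; check them against the
# PuLID repo version you install before running.
pulid = PuLIDPipeline()
id_emb = pulid.get_id_embedding(["ref1.jpg", "ref2.jpg"])

img = pulid.inference(
    prompt="cyberpunk character, neon city",
    id_embedding=id_emb,
    id_scale=0.9,
    cfg_scale=1.2,
    steps=4,  # SDXL Lightning
)[0]
```

### IP-Adapter style + face combined

```python
# `pipe` is assumed to be an SDXL diffusers pipeline already on the GPU;
# `face_ref` and `style_ref` are PIL images loaded elsewhere.
pipe.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="sdxl_models",
    weight_name=[
        "ip-adapter-plus-face_sdxl_vit-h.safetensors",
        "ip-adapter-plus_sdxl_vit-h.safetensors",
    ],
)
pipe.set_ip_adapter_scale([0.7, 0.4])  # face stronger than style

img = pipe(
    prompt="portrait, watercolor style",
    ip_adapter_image=[face_ref, style_ref],  # same order as weight_name
).images[0]
```

### AnimateDiff motion generation

```python
import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter

model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-3", torch_dtype=torch.float16
)
pipe = AnimateDiffPipeline.from_pretrained(
    model_id,
    motion_adapter=adapter,
    torch_dtype=torch.float16,
)
# Scheduler settings follow the diffusers AnimateDiff example.
pipe.scheduler = DDIMScheduler.from_pretrained(
    model_id, subfolder="scheduler", clip_sample=False,
    timestep_spacing="linspace", beta_schedule="linear", steps_offset=1,
)
pipe.to("cuda")

frames = pipe(
    prompt="character walking, side view",
    num_frames=16,
    num_inference_steps=25,
    guidance_scale=7.5,
).frames[0]  # list of PIL images for the first (only) video
```

### Cross-frame attention sharing

```python
import torch
import torch.nn.functional as F


def cross_frame_attention(self, x, prev_kv=None):
    """Method for an attention module that exposes to_q / to_k / to_v / to_out."""
    q = self.to_q(x)
    k, v = self.to_k(x), self.to_v(x)
    if prev_kv is not None:
        # Concatenate the previous frame's keys/values along the token axis
        k = torch.cat([prev_kv["k"], k], dim=1)
        v = torch.cat([prev_kv["v"], v], dim=1)
    out = F.scaled_dot_product_attention(q, k, v)
    # Note: the returned cache includes earlier frames, so it grows
    # by one frame of tokens on every call.
    return self.to_out(out), {"k": k, "v": v}
```

### Turnaround sheet (multi-pose)

```python
import torch

# `pipe` and `face_emb` come from the InstantID pattern above.
# `pose_skeleton` is assumed to map each pose name to a pre-rendered skeleton image.
poses = ["front view", "3/4 view", "side view", "back view"]
turnaround = []
for pose in poses:
    img = pipe(
        prompt=f"character portrait, {pose}, neutral expression",
        image_embeds=face_emb,
        image=pose_skeleton[pose],
        controlnet_conditioning_scale=0.9,
        generator=torch.Generator("cuda").manual_seed(42),  # fixed seed
    ).images[0]
    turnaround.append(img)
```

### Emotion variation with locked identity

```python
import torch

# `pipe`, `face_emb`, and `face_kps` come from the InstantID pattern above.
emotions = ["smiling", "angry", "surprised", "sad", "neutral"]
for emo in emotions:
    img = pipe(
        prompt=f"portrait, {emo} expression",
        image_embeds=face_emb,
        image=face_kps,          # the ControlNet-based pipeline needs a conditioning image
        ip_adapter_scale=0.85,   # strong identity lock
        guidance_scale=4.5,
        generator=torch.Generator("cuda").manual_seed(7),
    ).images[0]
    img.save(f"emo_{emo}.png")
```

## Decision criteria

| Goal | Combination |
|---|---|
| Highest face fidelity | PuLID + InstantID + IP-Adapter Face |
| Style transfer with face | IP-Adapter Face (0.8) + IP-Adapter Style (0.4) |
| Animation, single character | AnimateDiff + reference attention + IP-Adapter |
| Game character sheet | InstantID + ControlNet pose × 4 with shared seed |
| Real-time avatar | SDXL Lightning / FLUX Schnell + cached identity emb |

**Default**: InstantID + IP-Adapter (style 0.4, face 0.7) + a fixed seed for each batch.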
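The default above can be kept in a single config object and checked before a batch run. This is a minimal sketch under stated assumptions: `PortraitBatchConfig` and `check_config` are hypothetical names and not part of diffusers or InstantID, the scale defaults are the ones stated in this note, the seed value 42 is borrowed from the turnaround pattern, and the three checks mirror items in this note's anti-pattern list.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PortraitBatchConfig:
    """Hypothetical container for the default scales named in this note."""
    face_scale: float = 0.7    # IP-Adapter face weight
    style_scale: float = 0.4   # IP-Adapter style weight
    seed: Optional[int] = 42   # shared by the whole batch; None means random


def check_config(cfg: PortraitBatchConfig) -> List[str]:
    """Return warnings for settings this note lists as anti-patterns."""
    warnings: List[str] = []
    if cfg.seed is None:
        warnings.append("no fixed seed: faces may drift across the batch")
    if cfg.face_scale > 1.0 or cfg.style_scale > 1.0:
        warnings.append("IP-Adapter scale > 1.0: the prompt may be ignored")
    if math.isclose(cfg.face_scale, cfg.style_scale):
        warnings.append("equal face/style weights: identity may blur")
    return warnings


# The defaults pass; a deliberately bad config triggers all three warnings.
print(check_config(PortraitBatchConfig()))
print(check_config(PortraitBatchConfig(face_scale=1.2, style_scale=1.2, seed=None)))
```

The two scales would then feed `pipe.set_ip_adapter_scale([cfg.face_scale, cfg.style_scale])`, and the seed would feed the `torch.Generator` used for every image in the batch.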
## 🔗 Graph

- Parent: [[이미지 생성 및 제어 파이프라인|Image Generation and Control Pipeline]] · [[AI 이미지 생성 (AI Image Generation)]]
- Variants: [[ComfyUI]] · [[InstantID]] · [[PuLID]]
- Applications: [[Avatar_Pipeline]] · [[AI 모델 사후 편집 도구 (Post-editing Tools)]]
- Adjacent: [[AnimateDiff]] · [[IP-Adapter]] · [[ControlNet]]

## 🤖 LLM usage

**When**: generating emotion / pose variations of a prompt, drafting a character sheet plan, extracting style descriptions.

**When not**: reasoning about the internal space of face embeddings. That space is geometric and is not something an LLM handles.

## ❌ Anti-patterns

- **No fixed seed in batch**: the face drifts from one turnaround view to the next.
- **IP-Adapter scale > 1.0**: the prompt is ignored and the reference is over-copied.
- **Identity + Style conflict**: equal weights on both blur the identity.
- **Missing pose normalization**: the pose skeleton's scale does not match the prompt.
- **AnimateDiff w/o reference**: flicker, with no frame-to-frame consistency.

## 🧪 Verification / duplicates

- Verified (InstantX InstantID paper 2024, PuLID v1.1 release notes 2025, AnimateDiff v3).
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: portrait/animation identity + style control |