
Antigravity Agent 504fd5fb42 [G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00



Portrait and Animation Style Control

One-line summary

"Orthogonal separation of identity preservation and style transformation." The core challenge in the portrait/animation domain: the same person must look the same in every frame (identity), while style and pose stay freely adjustable (control). The 2026 answer is a stack of InstantID / PuLID (identity) + IP-Adapter (style) + ControlNet pose (motion).

Core concepts

Three-axis separation

Identity axis: locked with a face embedding (ArcFace), as in InstantID and PuLID
Style axis: modulated by a reference image embedding, as in IP-Adapter
Motion axis: frame structure driven by pose / depth, as in OpenPose / DWPose
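
Whether the identity axis is actually locked is commonly checked by comparing the face embedding of the reference with that of the generated image. The sketch below is a minimal illustration, assuming ArcFace-style embeddings are already available as 1-D vectors; the function names and the 0.5 threshold are illustrative assumptions, not values taken from InstantID or PuLID.

```python
import numpy as np

def identity_similarity(ref_emb, gen_emb):
    """Cosine similarity between two face embeddings (1.0 means same direction)."""
    ref = np.asarray(ref_emb, dtype=np.float64)
    gen = np.asarray(gen_emb, dtype=np.float64)
    return float(ref @ gen / (np.linalg.norm(ref) * np.linalg.norm(gen)))

def identity_preserved(ref_emb, gen_emb, threshold=0.5):
    # threshold is a hypothetical cut-off; tune it on your own data
    return identity_similarity(ref_emb, gen_emb) >= threshold
```

Because cosine similarity ignores vector length, the check depends only on the direction of the embedding, which is the part that encodes identity.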

Animation consistency techniques

Reference frame: use the first frame as an anchor and apply IP-Adapter to it
Temporal LoRA: the AnimateDiff motion module provides inter-frame coherence
Latent warp: warp the previous frame's latent with optical flow, then add noise
Cross-frame attention: share attention keys/values across frames
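
The latent-warp idea can be sketched in a few lines of NumPy. This is an illustrative toy under stated assumptions, not code from any particular pipeline: the flow is a dense (dx, dy) field already resized to latent resolution, sampling is nearest-neighbour, and `warp_latent` / `renoise` are hypothetical helper names.

```python
import numpy as np

def warp_latent(prev_latent, flow):
    """Backward-warp a (C, H, W) latent with a (2, H, W) flow field holding (dx, dy)."""
    _, h, w = prev_latent.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Each output pixel (y, x) samples the previous latent at (y - dy, x - dx).
    src_x = np.clip(np.rint(xs - flow[0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys - flow[1]).astype(int), 0, h - 1)
    return prev_latent[:, src_y, src_x]

def renoise(latent, strength, rng):
    """Mix the warped latent with Gaussian noise (variance-preserving blend)."""
    noise = rng.standard_normal(latent.shape)
    return np.sqrt(1.0 - strength) * latent + np.sqrt(strength) * noise
```

A typical use would be `renoise(warp_latent(prev, flow), strength, rng)` as the starting latent for the next frame, with a larger `strength` giving the sampler more freedom to repair warping artifacts.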

Applications

Avatar / VTuber pipeline: same face × multiple emotions × multiple outfits.
Character sheet generation: turnaround views (front/side/back).
Short animation: an 8-frame walk cycle for a character.
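
For the avatar case, the "same face × emotions × outfits" grid can be planned up front as a list of prompts, with the identity embedding supplied separately at generation time. A minimal sketch; `build_prompt_grid` and the prompt template are hypothetical.

```python
from itertools import product

def build_prompt_grid(emotions, outfits, base="portrait"):
    """Return one prompt record per (emotion, outfit) pair."""
    return [
        {"emotion": e, "outfit": o, "prompt": f"{base}, {e} expression, wearing {o}"}
        for e, o in product(emotions, outfits)
    ]

grid = build_prompt_grid(["smiling", "angry"], ["school uniform", "hoodie"])
```

Each record can then be passed to the same pipeline call with a shared seed, so only the prompt changes between renders.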

💻 Patterns

InstantID portrait generation

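import cv2
import torch
from PIL import Image
from diffusers.models import ControlNetModel
from insightface.app import FaceAnalysis
# The pipeline class and draw_kps ship with the InstantID repository
# (pipeline_stable_diffusion_xl_instantid.py), not with diffusers itself.
from pipeline_stable_diffusion_xl_instantid import StableDiffusionXLInstantIDPipeline, draw_kps

face_app = FaceAnalysis(name="antelopev2", providers=["CUDAExecutionProvider"])
face_app.prepare(ctx_id=0, det_size=(640, 640))

face_img = cv2.imread("ref.jpg")                  # BGR, as insightface expects
face_info = face_app.get(face_img)[0]             # first detected face
face_emb = face_info["embedding"]                 # identity embedding
face_pil = Image.fromarray(cv2.cvtColor(face_img, cv2.COLOR_BGR2RGB))
face_kps = draw_kps(face_pil, face_info["kps"])   # keypoint image for the ControlNet

# Assumption: local path to the downloaded InstantID ControlNet (IdentityNet) weights.
instantid_controlnet = ControlNetModel.from_pretrained(
    "./checkpoints/ControlNetModel", torch_dtype=torch.float16
)
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=instantid_controlnet,
    torch_dtype=torch.float16,
).to("cuda")
pipe.load_ip_adapter_instantid("instantid_ip-adapter.bin")

out = pipe(
    prompt="anime portrait, school uniform",
    image_embeds=face_emb,
    image=face_kps,            # face keypoints
    ip_adapter_scale=0.8,
    controlnet_conditioning_scale=0.8,
    num_inference_steps=30,
).images[0]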

PuLID identity preservation

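# Sketch only: argument names and return values differ between PuLID releases,
# so check pulid/pipeline_v1_1.py in the version you have installed.
from pulid.pipeline_v1_1 import PuLIDPipeline

pulid = PuLIDPipeline()
# Several reference images of the same person; some releases expect
# loaded RGB arrays here rather than file paths.
id_emb = pulid.get_id_embedding(["ref1.jpg", "ref2.jpg"])

img = pulid.inference(
    prompt="cyberpunk character, neon city",
    id_embedding=id_emb,
    id_scale=0.9,
    cfg_scale=1.2,
    steps=4,                   # SDXL Lightning
)[0]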

IP-Adapter style + face combined

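# Assumes `pipe` is a plain SDXL pipeline (not the InstantID one above) built with the
# ViT-H image encoder from h94/IP-Adapter (models/image_encoder), which the *_vit-h weights need.
pipe.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="sdxl_models",
    weight_name=["ip-adapter-plus-face_sdxl_vit-h.safetensors",
                 "ip-adapter-plus_sdxl_vit-h.safetensors"],
)
pipe.set_ip_adapter_scale([0.7, 0.4])  # face stronger than style

# One reference image per loaded adapter, in the same order as weight_name.
img = pipe(
    prompt="portrait, watercolor style",
    ip_adapter_image=[face_ref, style_ref],
).images[0]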
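
Why the scale behaves like a dial: IP-Adapter adds image conditioning through a separate cross-attention branch whose output is scaled and added to the text branch. The NumPy sketch below illustrates that decoupled cross-attention for a single head; it is a simplified illustration, not the diffusers implementation, and the function names are hypothetical.

```python
import numpy as np

def attention(q, k, v):
    """Single-head scaled dot-product attention on 2-D arrays."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def decoupled_cross_attention(q, text_kv, image_kv, scale):
    """Text branch plus a scaled image branch, as in IP-Adapter's design."""
    text_out = attention(q, *text_kv)
    image_out = attention(q, *image_kv)
    return text_out + scale * image_out
```

With `scale=0` the result reduces to plain text cross-attention, and as the scale grows the image branch contributes more and more of the output relative to the prompt.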

AnimateDiff motion generation

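import torch
from diffusers import MotionAdapter, AnimateDiffPipeline, DDIMScheduler

model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-3", torch_dtype=torch.float16
)
pipe = AnimateDiffPipeline.from_pretrained(
    model_id,
    motion_adapter=adapter,
    torch_dtype=torch.float16,
).to("cuda")
# Scheduler settings follow the diffusers AnimateDiff example.
pipe.scheduler = DDIMScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
    clip_sample=False,
    timestep_spacing="linspace",
    beta_schedule="linear",
    steps_offset=1,
)

frames = pipe(
    prompt="character walking, side view",
    num_frames=16,
    num_inference_steps=25,
    guidance_scale=7.5,
).frames[0]                    # list of PIL images, one per frame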

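# Cross-frame attention sketch: a method for an attention module that
# already defines to_q / to_k / to_v / to_out.
import torch
from torch.nn.functional import scaled_dot_product_attention

def cross_frame_attention(self, x, prev_kv=None):
    q = self.to_q(x)
    k, v = self.to_k(x), self.to_v(x)
    cache = {"k": k, "v": v}   # keep only this frame's K/V so the cache does not grow
    if prev_kv is not None:
        # Concatenate the previous frame's keys/values along the token axis
        k = torch.cat([prev_kv["k"], k], dim=1)
        v = torch.cat([prev_kv["v"], v], dim=1)
    out = scaled_dot_product_attention(q, k, v)
    return self.to_out(out), cache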

Turnaround sheet (multi-pose)

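# Assumes `pipe` and `face_emb` from the InstantID example, and that `pose_skeleton`
# (hypothetical) maps each view name to a rendered pose image for the ControlNet.
poses = ["front view", "3/4 view", "side view", "back view"]
turnaround = []
for pose in poses:
    img = pipe(
        prompt=f"character portrait, {pose}, neutral expression",
        image_embeds=face_emb,
        image=pose_skeleton[pose],
        controlnet_conditioning_scale=0.9,
        generator=torch.Generator("cuda").manual_seed(42),  # same fixed seed for every view
    ).images[0]
    turnaround.append(img)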
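
The pose skeletons fed to a loop like this work best when they share one scale and framing. The sketch below is one possible normalization step, assuming keypoints are plain (x, y) pairs in pixels; `normalize_pose`, the canvas size, and the margin are illustrative assumptions.

```python
def normalize_pose(keypoints, canvas=(1024, 1024), margin=0.1):
    """Scale and center (x, y) keypoints so the skeleton fills the canvas minus a margin."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    min_x, min_y = min(xs), min(ys)
    box_w = max(max(xs) - min_x, 1e-6)   # guard against zero-size boxes
    box_h = max(max(ys) - min_y, 1e-6)
    width, height = canvas
    # A single uniform scale keeps body proportions intact.
    scale = min(width * (1 - 2 * margin) / box_w, height * (1 - 2 * margin) / box_h)
    off_x = (width - box_w * scale) / 2
    off_y = (height - box_h * scale) / 2
    return [((x - min_x) * scale + off_x, (y - min_y) * scale + off_y) for x, y in keypoints]
```

Running every view's keypoints through the same function before rendering the skeleton image keeps the character the same size across the turnaround.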

Emotion variation with locked identity

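# Assumes `pipe`, `face_emb`, and `face_kps` from the InstantID example.
emotions = ["smiling", "angry", "surprised", "sad", "neutral"]
for emo in emotions:
    img = pipe(
        prompt=f"portrait, {emo} expression",
        image_embeds=face_emb,
        image=face_kps,                  # keypoint image required by the InstantID ControlNet
        ip_adapter_scale=0.85,           # strong identity lock
        guidance_scale=4.5,
        generator=torch.Generator("cuda").manual_seed(7),  # same seed for every emotion
    ).images[0]
    img.save(f"emo_{emo}.png")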

Decision criteria

Goal	Combination
Highest face fidelity	PuLID + InstantID + IP-Adapter Face
Style transfer with face	IP-Adapter Face (0.8) + IP-Adapter Style (0.4)
Animation, single character	AnimateDiff + reference attention + IP-Adapter
Game character sheet	InstantID + ControlNet pose × 4 with shared seed
Real-time avatar	SDXL Lightning / FLUX Schnell + cached identity embedding

Default: InstantID + IP-Adapter (style 0.4, face 0.7) + a fixed seed per batch.

🔗 Graph

Parent: Image Generation and Control Pipeline · AI Image Generation
Variants: ComfyUI · InstantID · PuLID
Applications: Avatar_Pipeline · AI Model Post-editing Tools
Adjacent: AnimateDiff · IP-Adapter · ControlNet

🤖 LLM usage

When to use: generating emotion / pose prompt variations, drafting a character sheet plan, extracting style descriptions. When not to use: anything inside the face embedding space, which is geometric and not something an LLM can handle.

❌ Anti-patterns

No fixed seed in batch: the face drifts from one turnaround view to the next.
IP-Adapter scale > 1.0: the prompt is ignored and the reference is over-copied.
Identity + style conflict: giving both the same weight blurs identity.
Missing pose normalization: the pose skeleton's scale does not match the framing in the prompt.
AnimateDiff w/o reference: flicker, with no frame-to-frame consistency.
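
Several of these anti-patterns can be caught before any GPU time is spent. The function below is a hypothetical pre-flight check, not part of any library; the argument names and warning texts are assumptions made for illustration.

```python
def check_control_config(face_scale, style_scale, seed, has_reference=True):
    """Return a list of warnings for the anti-patterns listed above."""
    warnings = []
    if seed is None:
        warnings.append("no fixed seed: expect face drift across the batch")
    if face_scale > 1.0 or style_scale > 1.0:
        warnings.append("IP-Adapter scale > 1.0: the prompt may be ignored")
    if face_scale == style_scale:
        warnings.append("identity and style share one weight: identity may blur")
    if not has_reference:
        warnings.append("no reference frame: expect flicker in animation")
    return warnings
```

An empty list means none of the checks fired, for example `check_control_config(0.7, 0.4, seed=42)`.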

🧪 Verification / Duplicates

Verified (InstantX InstantID paper 2024, PuLID v1.1 release notes 2025, AnimateDiff v3).
Confidence: A.

🕓 Changelog

Date	Change
2026-05-08	Phase 1
2026-05-10	Manual cleanup — portrait/animation identity+style control
