Antigravity Agent 504fd5fb42 [G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00

Brand Consistency in AI Image Generation

📌 One-line insight

"Lock the visual identity at generation time." A prompt alone is not enough: combine reference images (sref/cref/oref) with a LoRA and IP-Adapter. This is essential for marketing campaigns, product lines, and character series. Modern tools can also adapt to an identity from a single image.

📖 Core concepts

Dimensions

Visual style (sref): color, lighting, texture.
Character (cref): a person's identity.
Object (oref / IP-Adapter): a specific product.
Composition: layout and camera angle.
Typography: fonts and logo.
Mood: emotion and atmosphere.

Midjourney reference parameters

--sref (Style Reference): the style of an image or moodboard.
--cref (Character Reference): character identity (V6 / Niji 6; in V7, Omni Reference takes this role).
--oref (Omni Reference, V7): the form of a specific object.
--sw (style weight): 0-1000.
--cw (character weight): 0-100.
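
The weight ranges above are easy to get wrong when prompts are assembled by script. A minimal sketch of a prompt builder that enforces them follows; `build_mj_prompt` and its argument names are hypothetical helpers for illustration, not part of any Midjourney API, and the ranges are the ones listed in this note.

```python
from typing import Optional, Tuple

# Ranges as listed in this note: --sw 0-1000, --cw 0-100.
SW_RANGE = (0, 1000)
CW_RANGE = (0, 100)


def _check(name: str, value: int, bounds: Tuple[int, int]) -> int:
    """Return value unchanged if it lies within bounds, else raise ValueError."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name} must be in [{low}, {high}], got {value}")
    return value


def build_mj_prompt(
    text: str,
    sref: Optional[str] = None,
    cref: Optional[str] = None,
    sw: Optional[int] = None,
    cw: Optional[int] = None,
) -> str:
    """Hypothetical helper: append reference flags to a prompt string."""
    parts = [text.strip()]
    if sref:
        parts.append(f"--sref {sref}")
        if sw is not None:
            parts.append(f"--sw {_check('sw', sw, SW_RANGE)}")
    if cref:
        parts.append(f"--cref {cref}")
        if cw is not None:
            parts.append(f"--cw {_check('cw', cw, CW_RANGE)}")
    return " ".join(parts)
```

A weight is only emitted when its reference is present, so a stray `--cw` without `--cref` cannot slip into the prompt.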

Stable Diffusion / Flux tools

IP-Adapter

Turns an image prompt into conditioning.
Supported for SDXL and Flux.
Works for faces, objects, and style.

ControlNet

Guides generation with pose, depth, or edge maps.
Controls character pose.

LoRA (custom)

Learns a specific identity.
Can be trained from as few as 5-10 images.
Portable: the resulting file is small (around 50 MB).

Textual Inversion / Dreambooth

Fine-tunes a token embedding (Textual Inversion) or the model itself (Dreambooth).
More expensive, but high quality.

InstantID / PhotoMaker

Instantly clones an identity from a single face image.
No fine-tuning required.

Best practices

Reference set first: pick 3-5 brand-safe images.
Single style reference: multiple style references cause confusion.
Low stylize (--stylize 0-50): keeps the product clear.
Don't mix everything: be careful when using sref + cref + oref together.
Iterate from draft: start with a weak first pass, then refine.
Document the recipe: makes results reproducible.

Modern workflow

Phase 1: collect brand assets (logo, color palette, style guide).
Phase 2: select references.
Phase 3: set up LoRA / IP-Adapter / sref.
Phase 4: batch generation.
Phase 5: human selection and manual refinement.
Phase 6: brand approval.
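
Phase 4 is easier to audit in Phases 5 and 6 if every image can be traced back to its inputs. One possible sketch is a job manifest that fixes subjects, seeds, and the recipe version before anything is generated; the `Job` dataclass and `plan_batch` are hypothetical names, and the generation call itself is left out.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence


@dataclass(frozen=True)
class Job:
    """One planned generation: what to draw, with which seed and recipe."""
    subject: str
    seed: int
    recipe_version: str


def plan_batch(
    subjects: Sequence[str],
    seeds: Sequence[int],
    recipe_version: str,
) -> List[Job]:
    """Return one Job per (subject, seed) pair, in a deterministic order."""
    return [Job(subject, seed, recipe_version) for subject, seed in product(subjects, seeds)]


jobs = plan_batch(["mascot waving", "mascot at desk"], [1, 2, 3], "1.2")
```

Storing the manifest next to the outputs lets a reviewer regenerate any single image from its subject, seed, and recipe version.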

Use cases

Marketing campaign: ad sets.
Product line: catalogs.
Character series: mascots, graphic novels.
E-commerce: the same model from multiple angles.
Storyboard: film pre-visualization.
Game assets: NPC variations.

💻 Patterns

Midjourney sref + cref

/imagine A futuristic city at night, neon reflections, rain --sref https://my-cdn/style1.jpg --cref https://my-cdn/character.jpg --sw 200 --cw 80 --ar 16:9 --stylize 100

Stable Diffusion + IP-Adapter (ComfyUI / Diffusers)

from diffusers import StableDiffusionXLPipeline
from PIL import Image
import torch

# Load the SDXL base pipeline in half precision on the GPU.
pipe = StableDiffusionXLPipeline.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0',
    torch_dtype=torch.float16,
).to('cuda')

# Load the IP-Adapter weights and set how strongly the reference image steers the output.
pipe.load_ip_adapter('h94/IP-Adapter', subfolder='sdxl_models', weight_name='ip-adapter_sdxl.bin')
pipe.set_ip_adapter_scale(0.6)

# The reference image conditions generation alongside the text prompt.
ref_image = Image.open('brand_style.jpg').convert('RGB')
result = pipe(
    prompt='a product photo, studio lighting',
    ip_adapter_image=ref_image,
    num_inference_steps=30,
    guidance_scale=7,
).images[0]

LoRA training (Kohya / Diffusers)

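from diffusers import StableDiffusionXLPipeline, UNet2DConditionModel
from diffusers.utils import convert_state_dict_to_diffusers
from peft import LoraConfig
from peft.utils import get_peft_model_state_dict

# 5-10 images of the brand character
training_data = ['brand_char_01.jpg', ..., 'brand_char_10.jpg']

# Load the UNet to adapt and freeze its base weights
# (SDXL base is assumed here, matching the other examples).
unet = UNet2DConditionModel.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0', subfolder='unet',
)
unet.requires_grad_(False)

# LoRA config: rank-16 adapters on the attention projections
lora_config = LoraConfig(
    r=16, lora_alpha=16,
    target_modules=['to_q', 'to_k', 'to_v', 'to_out.0'],
    init_lora_weights='gaussian',
)

# Train (simplified): only the injected LoRA weights are trainable
unet.add_adapter(lora_config)
# ... train loop ...

# Save only the LoRA weights rather than the full UNet
lora_state_dict = convert_state_dict_to_diffusers(get_peft_model_state_dict(unet))
StableDiffusionXLPipeline.save_lora_weights('./brand-character-lora', unet_lora_layers=lora_state_dict)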

Character consistency (multi-shot)

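# Generate the LoRA-trained character in several scenes.
# Assumes `pipe` is an SDXL pipeline without an IP-Adapter loaded and that the
# LoRA was saved to ./brand-character-lora.
# The "<lora:name:weight>" prompt tag is AUTOMATIC1111 WebUI syntax; Diffusers
# loads and weights LoRAs explicitly instead.
pipe.load_lora_weights('./brand-character-lora', adapter_name='brand_char')
pipe.set_adapters(['brand_char'], adapter_weights=[0.8])

prompts = [
    "portrait of mascot, smiling, office background",
    "mascot waving, beach background, sunset",
    "mascot at desk, laptop, focused",
]

results = [pipe(p, num_inference_steps=30).images[0] for p in prompts]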

InstantID (face cloning)

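# Sketch following the InstantID repository README. The pipeline class ships with that
# repository (pipeline_stable_diffusion_xl_instantid.py), not with the diffusers package.
# The checkpoint paths below are assumptions.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers.models import ControlNetModel
from insightface.app import FaceAnalysis
from pipeline_stable_diffusion_xl_instantid import StableDiffusionXLInstantIDPipeline, draw_kps

# Face detector and identity embedder
app = FaceAnalysis(name='antelopev2', root='./', providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

controlnet = ControlNetModel.from_pretrained('./checkpoints/ControlNetModel', torch_dtype=torch.float16)
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0',
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to('cuda')
pipe.load_ip_adapter_instantid('./checkpoints/ip-adapter.bin')

# Extract the identity embedding and facial keypoints from the first detected face.
face = Image.open('brand_ambassador.jpg').convert('RGB')
face_info = app.get(cv2.cvtColor(np.array(face), cv2.COLOR_RGB2BGR))[0]
faceid_embeds = face_info['embedding']
face_kps = draw_kps(face, face_info['kps'])

result = pipe(
    prompt='in a luxury hotel, evening',
    image_embeds=faceid_embeds,
    image=face_kps,
    num_inference_steps=30,
).images[0]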

Brand prompt template

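# Brand-wide prompt template; only {subject} varies between images.
BRAND_STYLE = """
{subject},
brand: ACME corp,
style: minimalist, white background, soft natural light,
color palette: navy blue, off-white, warm gold accent,
composition: rule of thirds, centered subject,
typography (if any): sans-serif, geometric,
quality: 4k, professional photography
"""

def generate_brand(subject):
    """Fill the template and generate with the shared `pipe` (assumed to be defined as above)."""
    # Collapse the template's line breaks into a single-line prompt.
    prompt = ' '.join(BRAND_STYLE.format(subject=subject).split())
    return pipe(prompt, guidance_scale=7).images[0]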

Style guide YAML (recipe lock)

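brand: ACME
# Quoted so the version stays a string instead of the float 1.2.
version: "1.2"
style_reference: cdn://acme/style/v2.jpg
sref_weight: 200
character_reference: cdn://acme/mascot/v3.png
cref_weight: 80
stylize: 100
# Quoted because YAML 1.1 parsers (e.g. PyYAML) read a bare 16:9 as the base-60 integer 969.
aspect_ratio: "16:9"
negative_prompt: "blurry, low quality, watermark, deformed"
loras:
  - name: brand-char
    weight: 0.8
ip_adapter_scale: 0.6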
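
A locked recipe is only useful if a change to it is noticed. One way to sketch that, assuming the recipe has already been parsed into a plain dict, is to tag every generated image with a short fingerprint of the recipe. `recipe_fingerprint` is a hypothetical helper built on the standard library only.

```python
import hashlib
import json


def recipe_fingerprint(recipe: dict) -> str:
    """Return a short, stable hash of a recipe dict.

    Keys are sorted before hashing, so two recipes with the same content
    produce the same fingerprint regardless of key order.
    """
    canonical = json.dumps(recipe, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


recipe = {
    "brand": "ACME",
    "version": "1.2",
    "sref_weight": 200,
    "cref_weight": 80,
    "loras": [{"name": "brand-char", "weight": 0.8}],
}
tag = recipe_fingerprint(recipe)
```

Writing `tag` into the output filename or image metadata makes it possible to tell which images were produced before and after a recipe edit.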

Quality check (auto)

import torch
from transformers import CLIPProcessor, CLIPModel

# Load CLIP once at import time instead of on every call.
_model = CLIPModel.from_pretrained('openai/clip-vit-base-patch32')
_proc = CLIPProcessor.from_pretrained('openai/clip-vit-base-patch32')

def brand_consistency_check(reference, generated, threshold=0.7):
    """Measure the CLIP image-embedding cosine similarity between two PIL images."""
    inputs = _proc(images=[reference, generated], return_tensors='pt')
    with torch.no_grad():  # inference only, no gradients needed
        embeds = _model.get_image_features(**inputs)
    sim = torch.cosine_similarity(embeds[0:1], embeds[1:2]).item()
    return sim, sim >= threshold
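
For a whole batch, the same thresholding idea can be applied to embeddings that were computed once and cached. The sketch below assumes embeddings are plain lists of floats keyed by image name; `cosine` and `flag_drift` are hypothetical helpers, and the 0.7 default simply mirrors the threshold used above.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_drift(
    reference: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    threshold: float = 0.7,
) -> List[str]:
    """Return names of candidates whose similarity to the reference is below threshold."""
    return [name for name, vec in candidates.items() if cosine(reference, vec) < threshold]


drifted = flag_drift(
    [1.0, 0.0],
    {"on_brand": [1.0, 0.0], "close": [2.0, 1.0], "off_brand": [0.0, 1.0]},
)
```

The flagged names can then be routed to human review rather than rejected outright, since a single similarity score is a coarse signal.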

🤔 Decision criteria

Situation	Tool
Quick brand iteration	Midjourney `--sref`
Full control	SD + ComfyUI + IP-Adapter
Single character	LoRA (5-10 images)
Single face	InstantID / PhotoMaker
Specific object	Omni Reference / Dreambooth
Multiple variations	LoRA + prompt template
Studio production	LoRA + ControlNet pose

Default: sref / IP-Adapter as the baseline. For a character, use a LoRA. For a face, use InstantID.
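
If the decision table needs to be applied inside a pipeline script, it can be encoded as a lookup. This is only a sketch: `TOOL_BY_SITUATION` and `recommend_tool` are hypothetical names, the entries are transcribed from the table, and unknown situations fall back to the sref / IP-Adapter baseline.

```python
# Entries transcribed from the decision table; keys are lowercased situations.
TOOL_BY_SITUATION = {
    "quick brand iteration": "Midjourney --sref",
    "full control": "SD + ComfyUI + IP-Adapter",
    "single character": "LoRA (5-10 images)",
    "single face": "InstantID / PhotoMaker",
    "specific object": "Omni Reference / Dreambooth",
    "multiple variations": "LoRA + prompt template",
    "studio production": "LoRA + ControlNet pose",
}

# Baseline used when no table row matches.
DEFAULT_TOOL = "sref / IP-Adapter"


def recommend_tool(situation: str) -> str:
    """Return the table's tool for a situation, or the baseline if it is unknown."""
    return TOOL_BY_SITUATION.get(situation.strip().lower(), DEFAULT_TOOL)
```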

🔗 Graph

Parent: AI-Image-Generation · Branding · Marketing
Variants: Style-Reference · Character-Reference · IP-Adapter · ControlNet
Applications: LoRA · InstantID · Dreambooth · PhotoMaker · Midjourney
Adjacent: CFG Scale (Classifier-Free Guidance Scale) · Stable-Diffusion · Flux · Authenticity · Arts

🤖 LLM usage

When: brand image campaigns, product catalogs, character series, storyboards.
When not: one-off unique images, random creative exploration.

❌ Anti-patterns

All references at max weight at the same time: visual chaos.
No reference set: drift.
Stylize too high (product): the product gets distorted.
Mix multiple LoRAs without testing: conflicts.
No quality check (CLIP): silent drift.
Undocumented recipe: no reproducibility.
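
Several of these anti-patterns can be caught mechanically before a batch is started. The sketch below lints a recipe dict; the key names follow the style guide recipe in this note, the maximum weights (1000 and 100) and the stylize limit of 50 for product shots come from the parameter list and best practices above, and `lint_recipe` itself is a hypothetical helper.

```python
from typing import List


def lint_recipe(recipe: dict, product_shot: bool = False) -> List[str]:
    """Return human-readable warnings for known anti-patterns in a recipe."""
    warnings = []

    # All reference weights at their maximum at the same time.
    if recipe.get("sref_weight") == 1000 and recipe.get("cref_weight") == 100:
        warnings.append("all reference weights at maximum")

    # No reference image of any kind.
    if not recipe.get("style_reference") and not recipe.get("character_reference"):
        warnings.append("no reference set")

    # High stylize on a product shot.
    if product_shot and recipe.get("stylize", 0) > 50:
        warnings.append("stylize above 50 for a product shot")

    # More than one LoRA stacked together.
    if len(recipe.get("loras", [])) > 1:
        warnings.append("multiple LoRAs: test the combination first")

    return warnings
```

An empty list means none of the checked anti-patterns were found; it does not replace the CLIP check or human review.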

🧪 Verification / duplicates

Verified (Midjourney docs, IP-Adapter paper, InstantID).
Confidence: B.
Related: CFG Scale (Classifier-Free Guidance Scale) · AI-Image-Generation · LoRA · Authenticity · Stable-Diffusion.

🕓 Changelog

Date	Change
2026-04-30	Auto-mapped
2026-05-08	Phase 1
2026-05-10	Manual cleanup: sref/cref/oref + LoRA + IP-Adapter + SDXL / Midjourney code
