"Locking in the generation of a visual identity." A prompt alone is not enough: it takes a combination of reference images (sref/cref/oref), LoRA, and IP-Adapter. Essential for marketing campaigns, product lines, and character series. Modern methods can even learn an identity from a single image.
📖 Key points
Dimensions
Visual style (sref): color, lighting, texture.
Character (cref): a person's identity.
Object (oref / IP-Adapter): a specific product.
Composition: layout, camera angle.
Typography: font, logo.
Mood: emotion, atmosphere.
Midjourney reference parameters
--sref (Style Reference): takes the style from an image or moodboard.
--cref (Character Reference): keeps a character's identity.
--oref (Omni Reference, V7): keeps the form of a specific object.
--sw (style weight): 0-1000.
--cw (character weight): 0-100.
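A minimal sketch of how the parameters above could be assembled and range-checked before a prompt is pasted into Midjourney. The helper name `build_midjourney_prompt` and its arguments are hypothetical; only the flags and weight ranges come from the list above.

```python
def build_midjourney_prompt(text, sref=None, cref=None, sw=None, cw=None,
                            ar=None, stylize=None):
    """Assemble a Midjourney prompt string (hypothetical helper).

    Validates the weight ranges listed in this note:
    --sw in 0-1000, --cw in 0-100.
    """
    if sw is not None and not 0 <= sw <= 1000:
        raise ValueError("--sw must be between 0 and 1000")
    if cw is not None and not 0 <= cw <= 100:
        raise ValueError("--cw must be between 0 and 100")
    if sw is not None and sref is None:
        raise ValueError("--sw needs a --sref image")
    if cw is not None and cref is None:
        raise ValueError("--cw needs a --cref image")

    parts = [text.strip()]
    flags = (("--sref", sref), ("--cref", cref), ("--sw", sw),
             ("--cw", cw), ("--ar", ar), ("--stylize", stylize))
    for flag, value in flags:
        if value is not None:
            parts.append(f"{flag} {value}")
    return " ".join(parts)


print(build_midjourney_prompt(
    "A futuristic city at night",
    sref="https://my-cdn/style1.jpg", sw=200, ar="16:9",
))
```

Failing early on an out-of-range weight is cheaper than finding out after a batch has been queued.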
Stable Diffusion / Flux tools
IP-Adapter
Image prompt → conditioning.
Supports SDXL / Flux.
Works for face / object / style.
ControlNet
Guides generation with pose, depth, or edge maps.
Controls the character's pose.
LoRA (custom)
Learns a specific identity.
Needs only 5-10 images.
Portable (around 50 MB).
Textual Inversion / DreamBooth
Fine-tunes a token embedding (Textual Inversion) or the model itself (DreamBooth).
More expensive, but high quality.
InstantID / PhotoMaker
Instant clone from a single face image.
No fine-tuning required.
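Why a LoRA stays portable can be sketched with simple arithmetic: each adapted linear layer of shape (d_in, d_out) only adds a rank × d_in and a d_out × rank matrix. The layer count and width below are hypothetical round numbers chosen for illustration, not the dimensions of any particular model, so the resulting size is not a claim about a real checkpoint.

```python
def lora_param_count(rank, layer_shapes):
    """Number of parameters LoRA adds: rank * (d_in + d_out) per adapted layer."""
    return sum(rank * (d_in + d_out) for d_in, d_out in layer_shapes)


def lora_size_mb(rank, layer_shapes, bytes_per_param=2):
    """Approximate file size in MB (2 bytes per parameter assumes fp16)."""
    return lora_param_count(rank, layer_shapes) * bytes_per_param / 1e6


# Hypothetical example: 100 square projection layers of width 1280.
shapes = [(1280, 1280)] * 100
print(lora_param_count(16, shapes))  # 4096000
print(lora_size_mb(16, shapes))      # 8.192
```

Size grows linearly with rank and with the number of adapted layers, which is why raising `r` or widening `target_modules` makes the file larger.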
Best practices
Reference set first: pick 3-5 brand-safe images.
Single style reference: several style references at once confuse the result.
Low stylize (--stylize 0-50): keeps the product clear.
Don't mix everything: combine sref + cref + oref with care.
Iterate from draft: start with a rough draft, then refine.
Document the recipe: so results are reproducible.
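One possible way to "document the recipe" is to store every setting that influenced an image as a small JSON record next to the output. The `GenerationRecipe` class and its field names are assumptions made for this sketch, not part of any tool mentioned here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class GenerationRecipe:
    """Hypothetical record of everything needed to reproduce one image."""
    prompt: str
    model: str
    seed: int
    style_reference: Optional[str] = None
    params: Dict[str, float] = field(default_factory=dict)

    def to_json(self) -> str:
        # sort_keys keeps diffs stable when recipes are version-controlled
        return json.dumps(asdict(self), sort_keys=True, indent=2)

    @classmethod
    def from_json(cls, text: str) -> "GenerationRecipe":
        return cls(**json.loads(text))


recipe = GenerationRecipe(
    prompt="a product photo, studio lighting",
    model="stabilityai/stable-diffusion-xl-base-1.0",
    seed=1234,
    style_reference="brand_style.jpg",
    params={"ip_adapter_scale": 0.6, "guidance_scale": 7.0},
)
restored = GenerationRecipe.from_json(recipe.to_json())
print(restored == recipe)  # True
```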
Modern workflow
Phase 1: gather brand assets (logo, color palette, style guide).
Phase 2: select reference images.
Phase 3: set up LoRA / IP-Adapter / sref.
Phase 4: batch generation.
Phase 5: human selection + manual refinement.
Phase 6: brand approval.
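Phases 4 and 5 can be sketched as a job manifest: every subject × seed combination becomes one job that a reviewer later accepts or rejects. The function name, the default template, and the status value are hypothetical; the image generation call itself is left out.

```python
import itertools


def plan_batch(subjects, seeds,
               template="{subject}, minimalist, white background"):
    """Build a list of generation jobs (hypothetical manifest format)."""
    jobs = []
    pairs = itertools.product(subjects, seeds)
    for index, (subject, seed) in enumerate(pairs):
        jobs.append({
            "id": f"job-{index:03d}",
            "prompt": template.format(subject=subject),
            "seed": seed,
            "status": "pending_review",  # updated during human selection
        })
    return jobs


for job in plan_batch(["ceramic mug", "tote bag"], seeds=[1, 2, 3]):
    print(job["id"], job["seed"], job["prompt"])
```

Fixing the seeds in the manifest means a rejected image can be regenerated with a changed prompt while everything else stays constant.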
Use cases
Marketing campaign: ad sets.
Product line: catalogs.
Character series: mascots, graphic novels.
E-commerce: the same model from different angles.
Storyboard: film pre-visualization.
Game assets: NPC variations.
💻 Patterns
Midjourney sref + cref
/imagine A futuristic city at night, neon reflections, rain --sref https://my-cdn/style1.jpg --cref https://my-cdn/character.jpg --sw 200 --cw 80 --ar 16:9 --stylize 100
IP-Adapter (Diffusers)
from diffusers import StableDiffusionXLPipeline
from PIL import Image
import torch

pipe = StableDiffusionXLPipeline.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0',
    torch_dtype=torch.float16,
).to('cuda')

# Load the IP-Adapter weights and set how strongly the reference image steers the result
pipe.load_ip_adapter(
    'h94/IP-Adapter',
    subfolder='sdxl_models',
    weight_name='ip-adapter_sdxl.bin',
)
pipe.set_ip_adapter_scale(0.6)

ref_image = Image.open('brand_style.jpg')
result = pipe(
    prompt='a product photo, studio lighting',
    ip_adapter_image=ref_image,
    num_inference_steps=30,
    guidance_scale=7,
).images[0]
LoRA training (Kohya / Diffusers)
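from diffusers import DDPMScheduler, AutoencoderKL, UNet2DConditionModel, StableDiffusionXLPipeline
from diffusers.utils import convert_state_dict_to_diffusers
from peft import LoraConfig
from peft.utils import get_peft_model_state_dict

# 5-10 images of the brand character
training_data = ['brand_char_01.jpg', ..., 'brand_char_10.jpg']

# LoRA config: adapt only the attention projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=['to_q', 'to_k', 'to_v', 'to_out.0'],
    init_lora_weights='gaussian',
)

# Training (simplified): attach the adapter to the SDXL UNet
unet = UNet2DConditionModel.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0', subfolder='unet',
)
unet.add_adapter(lora_config)

# ... train loop (uses the scheduler and VAE imported above) ...

# Save only the LoRA weights, not the full UNet, so the file stays small
lora_state_dict = convert_state_dict_to_diffusers(get_peft_model_state_dict(unet))
StableDiffusionXLPipeline.save_lora_weights(
    './brand-character-lora', unet_lora_layers=lora_state_dict,
)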
Character consistency (multi-shot)
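# Generate the LoRA-trained character in a variety of scenes.
# The <lora:name:weight> prompt tag is AUTOMATIC1111 WebUI syntax and is not parsed by Diffusers,
# so the LoRA is loaded explicitly and its strength is passed via cross_attention_kwargs.
# Assumes `pipe` is an SDXL pipeline and the LoRA weights were saved to ./brand-character-lora.
pipe.load_lora_weights('./brand-character-lora')

prompts = [
    'portrait of mascot, smiling, office background',
    'mascot waving, beach background, sunset',
    'mascot at desk, laptop, focused',
]
results = [
    pipe(p, num_inference_steps=30, cross_attention_kwargs={'scale': 0.8}).images[0]
    for p in prompts
]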
InstantID (face cloning)
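# Sketch following the InstantID reference repository: the pipeline class comes from that repo's
# pipeline_stable_diffusion_xl_instantid.py, not from the diffusers package itself.
# The ./checkpoints paths are assumed local downloads of the InstantX/InstantID weights.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers.models import ControlNetModel
from insightface.app import FaceAnalysis
from pipeline_stable_diffusion_xl_instantid import StableDiffusionXLInstantIDPipeline, draw_kps

controlnet = ControlNetModel.from_pretrained('./checkpoints/ControlNetModel', torch_dtype=torch.float16)
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0',
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to('cuda')
pipe.load_ip_adapter_instantid('./checkpoints/ip-adapter.bin')

# Extract the face embedding and keypoints with InsightFace
app = FaceAnalysis(name='antelopev2', root='./', providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))
face = Image.open('brand_ambassador.jpg').convert('RGB')
face_info = app.get(cv2.cvtColor(np.array(face), cv2.COLOR_RGB2BGR))[0]
faceid_embeds = face_info['embedding']
face_kps = draw_kps(face, face_info['kps'])

result = pipe(
    prompt='in a luxury hotel, evening',
    image_embeds=faceid_embeds,
    image=face_kps,
    num_inference_steps=30,
).images[0]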
Brand prompt template
BRAND_STYLE="""
{subject},
brand: ACME corp,
style: minimalist, white background, soft natural light,
color palette: navy blue, off-white, warm gold accent,
composition: rule of thirds, centered subject,
typography (if any): sans-serif, geometric,
quality: 4k, professional photography
"""

def generate_brand(subject):
    # Fill the brand template with the subject and generate one image
    return pipe(BRAND_STYLE.format(subject=subject), guidance_scale=7).images[0]
Brand consistency check (CLIP)
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

def brand_consistency_check(reference, generated, threshold=0.7):
    """Measure CLIP image-embedding similarity between two PIL images."""
    model = CLIPModel.from_pretrained('openai/clip-vit-base-patch32')
    proc = CLIPProcessor.from_pretrained('openai/clip-vit-base-patch32')
    inputs = proc(images=[reference, generated], return_tensors='pt')
    with torch.no_grad():  # inference only, no gradients needed
        embeds = model.get_image_features(**inputs)
    sim = torch.cosine_similarity(embeds[0:1], embeds[1:2]).item()
    return sim, sim >= threshold
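The same threshold idea extends to a whole batch once embeddings have been computed. The sketch below assumes embeddings are already available as plain lists of floats (for example exported from CLIP) and uses only the standard library; `filter_on_brand` and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cannot compare a zero-length embedding")
    return dot / norm


def filter_on_brand(reference, candidates, threshold=0.7):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, cosine_similarity(reference, emb))
              for name, emb in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Toy 2-D vectors standing in for real image embeddings
reference = [1.0, 0.0]
candidates = {"shot_a": [1.0, 0.0], "shot_b": [0.0, 1.0], "shot_c": [1.0, 1.0]}
for name, score in filter_on_brand(reference, candidates):
    print(name, round(score, 3))
```

The 0.7 default mirrors the threshold used above; a suitable value for a real brand would need to be calibrated on approved and rejected images.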
🤔 Decision criteria
| Situation | Tool |
|---|---|
| Quick brand iteration | Midjourney --sref |
| Full control | SD + ComfyUI + IP-Adapter |
| Single character | LoRA (5-10 images) |
| Single face | InstantID / PhotoMaker |
| Specific object | Omni Reference / DreamBooth |
| Multiple variations | LoRA + prompt template |
| Studio production | LoRA + ControlNet pose |
Default: sref / IP-Adapter as the baseline. Character = LoRA. Face = InstantID.
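The decision criteria can also be kept as a small lookup so that scripts and teammates pick tools the same way. This is only a sketch: the dictionary mirrors the situations listed above, and the function name and fallback behavior are assumptions.

```python
# Situation -> tool, copied from the decision criteria above
TOOL_BY_SITUATION = {
    "quick brand iteration": "Midjourney --sref",
    "full control": "SD + ComfyUI + IP-Adapter",
    "single character": "LoRA (5-10 images)",
    "single face": "InstantID / PhotoMaker",
    "specific object": "Omni Reference / DreamBooth",
    "multiple variations": "LoRA + prompt template",
    "studio production": "LoRA + ControlNet pose",
}

DEFAULT_TOOL = "sref / IP-Adapter"  # the baseline named in the default rule


def recommend_tool(situation):
    """Look up a tool for a situation, falling back to the baseline."""
    return TOOL_BY_SITUATION.get(situation.strip().lower(), DEFAULT_TOOL)


print(recommend_tool("Single face"))
print(recommend_tool("something else"))
```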