---
id: wiki-2026-0508-이미지-생성-및-제어-파이프라인
title: 이미지 생성 및 제어 파이프라인
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Image Generation Pipeline, Controlled Diffusion Pipeline, ControlNet Pipeline]
duplicate_of: none
source_trust_level: A
confidence_score: 0.92
verification_status: applied
tags: [diffusion, image-gen, controlnet, flux, comfyui]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: PyTorch/diffusers/ComfyUI
---

# Image Generation and Control Pipeline

## In One Line

> **"Control is a stack of conditioning."** A 2026 image generation pipeline layers its conditioning as base model (FLUX.1 / SDXL / SD3.5) → control adapter (ControlNet / IP-Adapter / T2I-Adapter) → LoRA → refiner. ComfyUI makes this explicit as a node graph; diffusers abstracts it behind pipeline classes.

## Core

### Pipeline stages

- **Prompt encoding**: T5 + CLIP text encoders for dual conditioning (FLUX.1, SD3.5); SDXL uses two CLIP encoders instead
- **Latent init**: pure noise, or the encoded source image for img2img
- **Conditioning injection**: ControlNet (structure), IP-Adapter (style reference), LoRA (concept)
- **Sampling**: Euler / DPM-Solver++ / flow matching, 20-50 steps
- **Decoding**: VAE → pixel space, optional refiner

### Control modalities

- **Structure**: canny, depth, pose, segmentation (spatial constraints)
- **Identity**: IP-Adapter Face, InstantID, PuLID (face preservation)
- **Style**: IP-Adapter Style, style-LoRA (reference style)
- **Concept**: textual inversion, custom LoRA (a specific subject)

### Applications

1. Batch generation for product photography (sku × pose × bg).
2. Game asset pipelines: consistency across concept → portrait → animation pose.
3. UI/UX prototyping: wireframe-to-mockup conversion.

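The sku × pose × bg batch in the first application can be laid out as an explicit job grid before any model is loaded. The following is a minimal sketch, not part of any library: `RenderJob`, `stable_seed`, `build_jobs`, and the sample SKU, pose, and background names are all hypothetical. Each seed is derived from the job's own identifiers, so re-running the grid reproduces the same seed for every cell.

```python
import hashlib
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RenderJob:
    """One cell of the sku x pose x background grid (hypothetical structure)."""
    sku: str
    pose: str
    background: str
    seed: int


def stable_seed(*parts: str) -> int:
    """Derive a reproducible 32-bit seed from the job's identifying strings."""
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_jobs(skus, poses, backgrounds):
    """Enumerate the full grid, one job per (sku, pose, background) combination."""
    return [
        RenderJob(sku, pose, bg, stable_seed(sku, pose, bg))
        for sku, pose, bg in product(skus, poses, backgrounds)
    ]


# Sample values are placeholders, not real catalogue data.
jobs = build_jobs(["mug-01", "mug-02"], ["front", "angled"], ["studio", "kitchen"])
print(len(jobs))  # 2 * 2 * 2 = 8
```

Each `job.seed` can then be passed to `torch.Generator("cuda").manual_seed(job.seed)` when the job is rendered, which keeps a failed cell re-runnable in isolation.
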
## 💻 Patterns

### diffusers FLUX + ControlNet

```python
import torch
from diffusers import FluxControlNetPipeline, FluxControlNetModel
from diffusers.utils import load_image

# Precomputed Canny edge map; the path is a placeholder.
canny_image = load_image("canny_edges.png")

controlnet = FluxControlNetModel.from_pretrained(
    "InstantX/FLUX.1-dev-Controlnet-Canny",
    torch_dtype=torch.bfloat16,
)
pipe = FluxControlNetPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    controlnet=controlnet,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    prompt="cyberpunk samurai, neon rain",
    control_image=canny_image,
    controlnet_conditioning_scale=0.7,
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
```

### Multi-ControlNet stacking

```python
import torch
from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel

# Load the ControlNets in the same dtype as the pipeline to avoid a dtype mismatch.
cn_pose = ControlNetModel.from_pretrained(
    "xinsir/controlnet-openpose-sdxl-1.0", torch_dtype=torch.float16
)
cn_depth = ControlNetModel.from_pretrained(
    "diffusers/controlnet-depth-sdxl-1.0", torch_dtype=torch.float16
)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=[cn_pose, cn_depth],
    torch_dtype=torch.float16,
).to("cuda")

# pose_img and depth_img are preloaded PIL control images,
# in the same order as the controlnet list.
result = pipe(
    prompt="warrior pose, mountain backdrop",
    image=[pose_img, depth_img],
    controlnet_conditioning_scale=[0.8, 0.5],
    num_inference_steps=30,
).images[0]
```

### IP-Adapter style transfer

```python
# Assumes `pipe` is an SDXL text-to-image pipeline; a ControlNet pipeline
# would also need its control `image` argument in the call below.
pipe.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="sdxl_models",
    weight_name="ip-adapter-plus_sdxl_vit-h.safetensors",
    # The vit-h weights expect the ViT-H image encoder, which lives outside sdxl_models.
    image_encoder_folder="models/image_encoder",
)
pipe.set_ip_adapter_scale(0.6)

# style_reference is a preloaded PIL image.
out = pipe(
    prompt="portrait of a knight",
    ip_adapter_image=style_reference,
    num_inference_steps=30,
).images[0]
```

### LoRA composition

```python
# Named adapters require the PEFT backend (pip install peft).
pipe.load_lora_weights("lora_pack/", weight_name="anime_style.safetensors", adapter_name="anime")
pipe.load_lora_weights("lora_pack/", weight_name="my_character.safetensors", adapter_name="char")

# Blend both adapters with per-adapter weights.
pipe.set_adapters(["anime", "char"], adapter_weights=[0.7, 0.9])

img = pipe(prompt="my character in anime style, school uniform").images[0]
```

### Img2img refinement

```python
import torch
from diffusers import AutoPipelineForImage2Image

refiner = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# prompt and base_image come from the base generation pass.
# A low strength keeps the composition and only polishes detail.
refined = refiner(
    prompt=prompt,
    image=base_image,
    strength=0.3,
    num_inference_steps=20,
).images[0]
```

### ComfyUI API workflow

```python
import json
import urllib.request

# The workflow must be exported from ComfyUI in API format.
with open("workflows/portrait_pipeline.json") as f:
    workflow = json.load(f)

# Node IDs "6" and "12" are specific to this workflow file.
workflow["6"]["inputs"]["text"] = "cyberpunk samurai"
workflow["12"]["inputs"]["seed"] = 12345

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    # The JSON response includes the queued prompt_id.
    print(json.loads(resp.read()))
```

### Batch pipeline with caching

```python
from functools import lru_cache

import torch


@lru_cache(maxsize=8)
def encode_prompt(prompt: str):
    # Assumes `pipe` is a FLUX pipeline, whose encode_prompt returns
    # (prompt_embeds, pooled_prompt_embeds, text_ids).
    # Cached tensors stay on the GPU until they are evicted.
    with torch.no_grad():
        return pipe.encode_prompt(prompt=prompt, prompt_2=None, device="cuda")


def generate_batch(prompts: list[str], control_imgs: list, seeds: list[int]):
    results = []
    for p, c, s in zip(prompts, control_imgs, seeds):
        embeds = encode_prompt(p)
        # One generator per image keeps each result reproducible from its seed.
        gen = torch.Generator("cuda").manual_seed(s)
        img = pipe(
            prompt_embeds=embeds[0],
            pooled_prompt_embeds=embeds[1],
            control_image=c,
            generator=gen,
        ).images[0]
        results.append(img)
    return results
```

## Decision criteria

| Situation | Pipeline |
|---|---|
| Highest fidelity, slow | FLUX.1-dev + ControlNet + refiner |
| Real-time / interactive | SDXL Turbo / FLUX Schnell, 4-8 steps |
| Face consistency | InstantID / PuLID + IP-Adapter Face |
| Style consistency batch | Style-LoRA + fixed seed offset |
| Local-only (Apple Silicon) | MLX + SDXL or DrawThings, FLUX.1 quantized |

**Default**: FLUX.1-dev + 1 ControlNet (canny/depth) + IP-Adapter, 28 steps, guidance 3.5.

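The decision table and default above can also be kept as data, so that batch scripts resolve a pipeline choice the same way every time. This is a minimal sketch under stated assumptions: `PipelineChoice`, `DECISIONS`, `choose`, and the row keys are hypothetical names, and only the step counts and guidance value that the table and default actually state are filled in.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PipelineChoice:
    """One row of the decision table (hypothetical structure)."""
    components: Tuple[str, ...]
    steps: Optional[Tuple[int, int]] = None  # (min, max); None where the table gives no number
    guidance: Optional[float] = None


# The stated default: FLUX.1-dev + 1 ControlNet + IP-Adapter, 28 steps, guidance 3.5.
DEFAULT = PipelineChoice(
    components=("FLUX.1-dev", "ControlNet (canny/depth)", "IP-Adapter"),
    steps=(28, 28),
    guidance=3.5,
)

# Keys are hypothetical labels for the table rows; values mirror the table text.
DECISIONS = {
    "highest_fidelity": PipelineChoice(("FLUX.1-dev", "ControlNet", "refiner")),
    "realtime": PipelineChoice(("SDXL Turbo / FLUX Schnell",), steps=(4, 8)),
    "face_consistency": PipelineChoice(("InstantID / PuLID", "IP-Adapter Face")),
    "style_batch": PipelineChoice(("Style-LoRA", "fixed seed offset")),
    "local_apple_silicon": PipelineChoice(("MLX + SDXL or DrawThings", "FLUX.1 quantized")),
}


def choose(situation: str) -> PipelineChoice:
    """Return the table row for a situation, falling back to the default."""
    return DECISIONS.get(situation, DEFAULT)


print(choose("realtime").steps)  # (4, 8)
```

Falling back to `DEFAULT` for unknown keys is a design choice of this sketch; raising a `KeyError` instead would surface typos in situation names earlier.
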
## 🔗 Graph

- Parent: [[AI 이미지 생성 (AI Image Generation)]] · [[Diffusion_Models]]
- Variants: [[초상화 및 애니메이션 스타일 제어]] · [[ComfyUI]]
- Applications: [[AI 이미지 생성 및 편집 워크플로우 (AI Image Generation & Editing Workflow)]] · [[AI 이미지 품질 최적화 및 디버깅 (Image Quality Optimization & Debugging)]]
- Adjacent: [[ControlNet]] · [[LoRA]] · [[FLUX]]

## 🤖 LLM Usage

**When to use**: prompt rewriting, caption extraction from control images, workflow JSON generation, error diagnosis.

**When not to use**: the inner forward pass of the VAE/UNet, which is deterministic and not a job for an LLM.

## ❌ Anti-patterns

- **Conditioning over-stack**: 5+ controls applied at once conflict with each other and produce blurry output.
- **CFG too high (>7 on FLUX)**: oversaturated, plastic-looking results.
- **LoRA stacking without weight tuning**: incompatible concepts blend into each other.
- **Missing seed control**: a random seed on every batch loses reproducibility.
- **VAE mismatch**: using a VAE other than the model's own causes a color shift.

## 🧪 Verification / Duplicates

- Verified against diffusers 0.30+, ComfyUI 2026-04, and the FLUX.1 model card.
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: image gen pipeline + control modalities |