---
id: wiki-2026-0508-스테이블-디퓨전을-이용한-오픈소스-기반-정밀-이미지-합성-
title: Stable Diffusion-Based Precise Image Compositing and Anatomical Error Correction Pipeline
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [SD Anatomy Fix Pipeline, ControlNet Anatomy, AfterDetailer, ADetailer]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [stable-diffusion, controlnet, adetailer, comfyui, anatomy, inpaint]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: ComfyUI/diffusers/SDXL/FLUX
---
# Stable Diffusion-Based Precise Compositing + Anatomical Error Correction
## In one line
> **"At its core, anatomy fixing is a loop of detection plus inpainting."** Defects in hands, faces, and text are inevitable in raw SDXL/FLUX base output. Combining ADetailer's auto-detection, ControlNet's pose/depth lock, and a second-pass inpaint yields a reliable, production-quality pipeline. As of 2026, the standard open-source workflow is ComfyUI's graph plus its node API.
## Key points
### Problem space
- **Six-fingered hands / fused digits**: chronic in SD/SDXL. FLUX.1 dev improves considerably but is not perfect.
- **Face degradation at low pixel counts**: small faces lose detail, so ADetailer crops the face and inpaints it at high resolution.
- **Illegible text**: weak in SDXL, strong in FLUX (though still flawed).
- **Eye asymmetry**: mismatched gaze direction or pupil size.
### Tools (open source)
- **ComfyUI**: node-based workflows, stored as reproducible JSON.
- **A1111 / Forge / reForge**: web UIs, with the ADetailer extension as the default detailer.
- **ControlNet**: conditioning on pose, depth, canny, OpenPose, or MediaPipeFace.
- **ADetailer**: auto-detects faces and hands, then inpaints them.
- **MeshGraphormer / DWPose**: hand pose estimation feeding the ControlNet hand_refiner.
- **IP-Adapter FaceID Plus v2**: face consistency.
### Pipeline structure
1. **Base generation** (SDXL/FLUX): high-level composition.
2. **ADetailer face pass**: face crop → upscale → inpaint with the same prompt.
3. **ADetailer hand pass**: DWPose detection → ControlNet hand_refiner → inpaint.
4. **Manual touch-up**: mask and inpaint any remaining defects.
5. **Upscale**: Real-ESRGAN / SUPIR / FLUX-Upscale.
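
The five stages above can be read as a chain of image-to-image functions applied in order. A minimal sketch of that orchestration, assuming each stage is a callable that takes and returns an image; `run_pipeline` and the stage names are illustrative and not part of ComfyUI or diffusers:

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")
Stage = Tuple[str, Callable[[T], T]]

def run_pipeline(image: T, stages: List[Stage], log: List[str]) -> T:
    """Apply each named stage in order and record the order in `log`."""
    for name, fn in stages:
        image = fn(image)
        log.append(name)
    return image

# Stand-in stages: strings replace real images so the control flow is visible.
demo_stages = [
    ("face_pass", lambda im: im + "+face"),
    ("hand_pass", lambda im: im + "+hand"),
    ("touch_up", lambda im: im + "+touchup"),
    ("upscale", lambda im: im + "+upscale"),
]
demo_log: List[str] = []
demo_out = run_pipeline("base", demo_stages, demo_log)
```

Keeping the stages in a plain list makes it easy to drop the hand pass or reorder the upscale step when comparing variants.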
### Applications
1. Character art (anime, realistic portrait).
2. Fashion editorial (pose-precise).
3. Comic / manga panel.
4. Game asset (consistent character sheets).
## 💻 Patterns
### Pattern 1 — ADetailer pipeline (diffusers)
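```python
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLInpaintPipeline
from PIL import Image, ImageDraw, ImageFilter
from ultralytics import YOLO
import torch

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
inpaint = StableDiffusionXLInpaintPipeline.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1", torch_dtype=torch.float16
).to("cuda")
# Face detector weights as distributed with ADetailer (assumed to be available locally).
face_det = YOLO("face_yolov8n.pt")

def make_face_mask(crop, margin=0.1, blur=16):
    """Illustrative helper (not a library API): feathered ellipse covering the crop."""
    w, h = crop.size
    mask = Image.new("L", (w, h), 0)
    ImageDraw.Draw(mask).ellipse(
        (w * margin, h * margin, w * (1 - margin), h * (1 - margin)), fill=255)
    return mask.filter(ImageFilter.GaussianBlur(blur))

def adetailer_face(img, prompt):
    boxes = face_det(img)[0].boxes.xyxy.cpu().numpy()
    for x1, y1, x2, y2 in boxes.astype(int).tolist():
        # Crop the face and upscale it so the inpaint pass works at full resolution.
        crop = img.crop((x1, y1, x2, y2)).resize((1024, 1024))
        mask = make_face_mask(crop)
        fixed = inpaint(prompt=prompt, image=crop, mask_image=mask,
                        num_inference_steps=25, strength=0.5).images[0]
        # Scale the repaired crop back down and paste it over the original box.
        img.paste(fixed.resize((x2 - x1, y2 - y1)), (x1, y1))
    return img
```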
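
Detector boxes tend to sit tight around the face, which leaves the inpaint pass little surrounding context when the crop is taken directly from the box. A minimal sketch of padding a box before cropping, clamped to the image bounds; `expand_box` and the 0.25 padding ratio are illustrative assumptions, not ADetailer's actual defaults:

```python
def expand_box(box, img_size, pad_ratio=0.25):
    """Grow an (x1, y1, x2, y2) box by pad_ratio of its size, clamped to the image."""
    x1, y1, x2, y2 = box
    width, height = img_size  # PIL order: (width, height)
    pad_x = int((x2 - x1) * pad_ratio)
    pad_y = int((y2 - y1) * pad_ratio)
    return (max(0, x1 - pad_x), max(0, y1 - pad_y),
            min(width, x2 + pad_x), min(height, y2 + pad_y))

padded = expand_box((100, 100, 200, 300), (1024, 1024))
clamped = expand_box((0, 0, 40, 40), (50, 50))
```

The padded box would be used for both the crop and the paste-back, so the repaired region blends into unchanged pixels at its border.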
### Pattern 2 — Hand fix with DWPose + ControlNet
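```python
import torch
from controlnet_aux import DWposeDetector
from diffusers import StableDiffusionXLControlNetInpaintPipeline, ControlNetModel
from PIL import ImageFilter

# Loader calls and checkpoint IDs are kept from the original note; verify them against your
# installed versions. HandRefiner's ControlNet targets SD 1.5, so confirm SDXL compatibility.
dwpose = DWposeDetector.from_pretrained("yzd-v/DWPose")
hand_cn = ControlNetModel.from_pretrained("hr16/ControlNet-HandRefiner-pruned",
                                          torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetInpaintPipeline.from_pretrained(  # built once, reused per call
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=hand_cn, torch_dtype=torch.float16
).to("cuda")

def hand_mask_from_dwpose(pose, grow=31):
    """Illustrative helper (not a library API): dilate the drawn keypoints into a mask."""
    return pose.convert("L").point(lambda v: 255 if v > 0 else 0).filter(ImageFilter.MaxFilter(grow))

def fix_hands(img, prompt):
    pose = dwpose(img, hand_only=True).resize(img.size)  # hand-only control map
    mask = hand_mask_from_dwpose(pose)
    return pipe(prompt=prompt, image=img, mask_image=mask,
                control_image=pose, num_inference_steps=30,
                controlnet_conditioning_scale=0.9).images[0]
```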
### Pattern 3 — ControlNet OpenPose lock
```python
import torch
from controlnet_aux import OpenposeDetector
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image

reference_image = load_image("reference.png")  # placeholder path for the pose reference
op = OpenposeDetector.from_pretrained("lllyasviel/ControlNet")
# Extract body, hand, and face keypoints (older controlnet_aux versions use hand_and_face=True).
pose_map = op(reference_image, include_hand=True, include_face=True)
cn = ControlNetModel.from_pretrained("xinsir/controlnet-openpose-sdxl-1.0",
                                     torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=cn, torch_dtype=torch.float16
).to("cuda")
out = pipe(prompt="cinematic shot of a woman dancing", image=pose_map,
           num_inference_steps=30, guidance_scale=6.5).images[0]
```
### Pattern 4 — Multi-ControlNet (depth + canny + pose)
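```python
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline

# Passing a list of ControlNets makes the pipeline wrap them in a MultiControlNetModel.
controlnets = [
    ControlNetModel.from_pretrained("xinsir/controlnet-depth-sdxl-1.0", torch_dtype=torch.float16),
    ControlNetModel.from_pretrained("xinsir/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16),
    ControlNetModel.from_pretrained("xinsir/controlnet-openpose-sdxl-1.0", torch_dtype=torch.float16),
]
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnets, torch_dtype=torch.float16
).to("cuda")
# p is the prompt; the control images must be precomputed and listed in the same order as `controlnets`.
result = pipe(prompt=p,
              image=[depth_map, canny_map, pose_map],
              controlnet_conditioning_scale=[0.6, 0.4, 0.9]).images[0]
```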
### Pattern 5 — Differential diffusion (soft mask)
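```python
# Per-pixel soft mask: 0.0 keeps the original pixel, 1.0 regenerates it fully.
import numpy as np
from PIL import Image
from scipy.ndimage import gaussian_filter

def soft_mask(face_box, img_size, edge_blur=24):
    # img_size is (height, width), i.e. NumPy order, not PIL's (width, height).
    m = np.zeros(img_size, dtype=np.float32)
    x1, y1, x2, y2 = (int(v) for v in face_box)
    m[y1:y2, x1:x2] = 1.0
    # Blur the box edges so the change strength falls off smoothly.
    blurred = gaussian_filter(m, sigma=edge_blur)
    return Image.fromarray((blurred * 255).astype(np.uint8))

# Pass the result to a differential-diffusion implementation as its change map;
# the parameter name and the black/white convention depend on that implementation.
```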
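
The effect of a soft mask can be previewed without a diffusion model by compositing a regenerated image over the original with the mask as per-pixel alpha. A minimal NumPy sketch, assuming both images are same-sized H×W×3 `uint8` arrays and the mask holds values in [0, 1]; `blend_with_soft_mask` is an illustrative name. This is plain alpha compositing after generation, which only approximates differential diffusion, where the mask steers the denoising itself:

```python
import numpy as np

def blend_with_soft_mask(original, regenerated, mask):
    """Per-pixel mix: mask 0.0 keeps `original`, mask 1.0 takes `regenerated`."""
    alpha = np.clip(mask.astype(np.float32), 0.0, 1.0)[..., None]  # add channel axis
    out = alpha * regenerated.astype(np.float32) + (1.0 - alpha) * original.astype(np.float32)
    return np.rint(out).astype(np.uint8)

original = np.zeros((2, 2, 3), dtype=np.uint8)
regenerated = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.array([[0.0, 0.5], [1.0, 0.25]], dtype=np.float32)
blended = blend_with_soft_mask(original, regenerated, mask)
```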
### Pattern 6 — Face restore (CodeFormer / GFPGAN)
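```python
# Assumes a wrapper module exposing this CodeFormer interface (not the official repo's script).
from codeformer import CodeFormer
cf = CodeFormer(weight_path="codeformer.pth", device="cuda")
# w = fidelity weight: lower favors restoration quality, higher preserves input identity.
restored = cf.enhance(img_np, w=0.7)  # img_np: image as a NumPy array
```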
### Pattern 7 — ComfyUI workflow JSON snippet
Abridged API-format graph: the nodes referenced as 4 to 7 (model, latent, and prompt sources) are not shown, and a real FaceDetailer node needs more inputs than listed here.
```json
{
  "1": {"class_type": "KSampler",
        "inputs": {"steps": 30, "cfg": 6.5, "sampler_name": "dpmpp_2m_sde",
                   "scheduler": "karras", "model": ["4", 0], "positive": ["6", 0],
                   "negative": ["7", 0], "latent_image": ["5", 0]}},
  "20": {"class_type": "FaceDetailer",
         "inputs": {"image": ["1", 0], "model": ["4", 0],
                    "bbox_detector": "face_yolov8m.pt",
                    "wildcard": "perfect symmetric eyes, sharp focus",
                    "guide_size": 768, "max_size": 1024, "steps": 20}}
}
```
## Decision criteria
| Situation | Approach |
|---|---|
| Anime portrait, hand defects | A1111 + ADetailer (face_yolov8n + hand_yolov8n) |
| Realistic full body | SDXL + Multi-ControlNet (pose+depth) |
| Reproducible production | ComfyUI workflow JSON + Git |
| Maximum quality, slow | FLUX.1 dev + Differential Diffusion |
| Real-time iteration | SDXL Lightning / Turbo |
**Default**: SDXL base + ADetailer face/hand pass + one manual inpaint round.
## 🔗 Graph
- Parent: [[AI 이미지 생성 (AI Image Generation)]] · [[Stable Diffusion]]
- Variants: [[ControlNet]] · [[ADetailer]] · [[IP-Adapter]]
- Applications: [[사후 편집 (Post-editing)]] · [[Brand Consistency Maintenance|Character Consistency]]
- Adjacent: [[FLUX]] · [[ComfyUI]]
## 🤖 LLM usage
**When**: authoring and debugging ComfyUI workflow JSON; systematically generating prompts and negative prompts.
**When not**: fine pixel-level inpainting decisions, which require visual judgment.
## ❌ Anti-patterns
- **Single-pass at full res**: faces and hands are starved of detail. A second-pass crop + upscale is mandatory.
- **Wrong ControlNet combination**: pose and canny both at high weight over-constrain the image, and the composition collapses.
- **SDXL hands without hand_refiner**: success rate under 5%.
- **ADetailer denoise 1.0**: destroys identity. The sweet spot is 0.4 to 0.6.
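
Several of these anti-patterns can be caught before a run with a simple settings check. A sketch under stated assumptions: the settings dictionary layout, the `lint_settings` name, and the 0.7 cutoff for a "high" ControlNet weight are all hypothetical; only the 0.4 to 0.6 denoise range comes from the list above:

```python
def lint_settings(settings):
    """Return warnings for the anti-patterns that can be read off the settings."""
    warnings = []
    denoise = settings.get("detailer_denoise")
    if denoise is not None and not (0.4 <= denoise <= 0.6):
        warnings.append(f"detailer_denoise={denoise} is outside 0.4-0.6")
    weights = settings.get("controlnet_weights", {})
    high = sorted(name for name, w in weights.items() if w >= 0.7)  # assumed cutoff
    if "pose" in high and "canny" in high:
        warnings.append("pose and canny are both weighted high; composition may collapse")
    if not settings.get("second_pass", False):
        warnings.append("no second-pass detail crop; faces and hands may lack detail")
    return warnings

good = lint_settings({"detailer_denoise": 0.5, "second_pass": True,
                      "controlnet_weights": {"pose": 0.9, "canny": 0.4}})
bad = lint_settings({"detailer_denoise": 1.0,
                     "controlnet_weights": {"pose": 0.9, "canny": 0.8}})
```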
## 🧪 Verification / duplicates
- Verified (Mikubill/sd-webui-controlnet, Bing-su/adetailer; HuggingFace diffusers docs).
- Trust level: A.
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — ADetailer + ControlNet hand_refiner pipeline |