---
id: wiki-2026-0508-샘플링-스텝-sampling-steps
title: 샘플링 스텝 (Sampling Steps)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Sampling Steps, Diffusion Steps, Inference Steps, num_inference_steps]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [diffusion, sampling, sdxl, flux, inference, image-generation]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: diffusers/ComfyUI
---

# Sampling Steps

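## One-liner
> **"Each step is one footstep on the path from noise to image."** In a diffusion model, `num_inference_steps` is the number of discretization points used to solve the reverse-diffusion ODE. Too few steps are fast but muddy; more steps give a sharper image at higher cost, and past a certain threshold the extra steps add almost nothing. As of 2026, with modern samplers (DPM++ 2M Karras, Euler for FLUX), the sweet spot for non-distilled models has settled at roughly 12–30 steps, while Lightning/Turbo-distilled models run in 1–8.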

## Key points

### Definitions
- **Sampling step**: one iteration of reverse diffusion, i.e. one point in the discretization of t = T → 0.
- **Sampler / scheduler**: the noise schedule between steps plus the update algorithm (Euler, DPM, UniPC, LMS, DDIM, DPM++ SDE).
- **CFG (guidance scale)**: the weighting between the conditional and unconditional predictions, applied at every step.
- **Sigma schedule**: the curve that maps t to the noise level sigma (linear, karras, exponential).
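
To make "discretization of t = T → 0" concrete, here is a minimal sketch of two common ways to pick the timesteps a sampler visits. The function names and the default of 1000 training timesteps are assumptions for illustration; the `trailing` variant is loosely modelled on the `timestep_spacing="trailing"` option used in Pattern 2, not copied from any library.

```python
def timesteps_linspace(num_inference_steps, num_train_timesteps=1000):
    """Evenly spaced timesteps from T-1 down to 0, both endpoints included."""
    last = num_train_timesteps - 1
    if num_inference_steps == 1:
        return [last]
    stride = last / (num_inference_steps - 1)
    return [round(last - i * stride) for i in range(num_inference_steps)]


def timesteps_trailing(num_inference_steps, num_train_timesteps=1000):
    """'Trailing' spacing: starts at T-1 and walks down in equal strides.

    The final timestep is above 0, which tends to suit very low step counts.
    """
    stride = num_train_timesteps / num_inference_steps
    return [round(num_train_timesteps - i * stride) - 1
            for i in range(num_inference_steps)]


if __name__ == "__main__":
    print(timesteps_linspace(5))
    print(timesteps_trailing(4))
```

Fewer steps mean larger jumps between consecutive timesteps, which is where the speed versus quality trade-off comes from.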

### Sampler families (2026 state)
- **Euler / Euler a**: simple, fast, a good baseline (15–25 steps).
- **DPM++ 2M Karras**: the SDXL community default (20–30 steps) and one of the quality leaders.
- **DPM++ 3M SDE**: strong on fine detail (28–40 steps).
- **UniPC**: fast convergence (10–20 steps).
- **DDIM**: deterministic, ControlNet-compatible.
- **LCM / Turbo / Lightning**: distilled models, 1–8 steps.
- **FLUX Euler / fm_euler**: flow-matching formulation; 25–30 steps by default.
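
The UI-style sampler names above do not match diffusers class names one to one, so a small lookup table can help when moving between ComfyUI-style names and code. This is a sketch: `SAMPLER_TO_SCHEDULER`, `resolve_sampler` and `apply_sampler` are hypothetical helpers, and DPM++ 3M SDE and the FLUX flow-matching scheduler are deliberately left out because their mapping depends on the pipeline and library version.

```python
# UI-style sampler name -> (diffusers scheduler class name, extra config kwargs)
SAMPLER_TO_SCHEDULER = {
    "Euler": ("EulerDiscreteScheduler", {}),
    "Euler a": ("EulerAncestralDiscreteScheduler", {}),
    "DPM++ 2M": ("DPMSolverMultistepScheduler", {}),
    "DPM++ 2M Karras": ("DPMSolverMultistepScheduler", {"use_karras_sigmas": True}),
    "UniPC": ("UniPCMultistepScheduler", {}),
    "DDIM": ("DDIMScheduler", {}),
    "LCM": ("LCMScheduler", {}),
}


def resolve_sampler(name):
    """Return (class_name, kwargs) for a sampler name, or raise ValueError."""
    try:
        class_name, kwargs = SAMPLER_TO_SCHEDULER[name]
    except KeyError:
        known = ", ".join(sorted(SAMPLER_TO_SCHEDULER))
        raise ValueError(f"unknown sampler {name!r}; known: {known}") from None
    return class_name, dict(kwargs)  # copy so callers cannot mutate the table


def apply_sampler(pipe, name):
    """Swap the scheduler on an already-loaded pipeline (requires diffusers)."""
    import diffusers  # imported lazily so the table works without diffusers

    class_name, kwargs = resolve_sampler(name)
    scheduler_cls = getattr(diffusers, class_name)
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config, **kwargs)
    return pipe
```

Usage would look like `apply_sampler(pipe, "DPM++ 2M Karras")` on an SDXL pipeline such as the one in Pattern 1.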

### Step-count trade-offs
- 5–10: muddy, over-smoothed output (fine for distilled models).
- 15–20: the production sweet spot (most SDXL / FLUX dev work).
- 25–30: detail-critical scenes.
- 40+: diminishing returns; the extra steps are almost pointless.
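
The diminishing-returns pattern can be reproduced on a toy problem. The sketch below assumes 1-D data distributed as N(0, 1), for which the ideal denoiser is x / (1 + sigma²) and the sampling ODE reduces to dx/dsigma = x·sigma / (1 + sigma²) with a closed-form solution. It integrates that ODE with plain Euler steps on a linear sigma grid and reports the error against the exact answer. It is an illustration of discretization error only, not a measurement of image quality, and all names here are made up for the example.

```python
import math


def euler_solve(num_steps, sigma_max=10.0, x_start=1.0):
    """Euler-integrate dx/dsigma = x*sigma/(1+sigma^2) from sigma_max down to 0."""
    h = sigma_max / num_steps
    x = x_start
    for k in range(num_steps):
        sigma = sigma_max - k * h
        x -= h * x * sigma / (1.0 + sigma ** 2)  # one step toward lower noise
    return x


def exact_solution(sigma_max=10.0, x_start=1.0):
    """Closed form at sigma = 0: x_start / sqrt(1 + sigma_max^2)."""
    return x_start / math.sqrt(1.0 + sigma_max ** 2)


def sweep_errors(step_counts=(5, 10, 20, 40, 80)):
    """Absolute discretization error for each step count."""
    exact = exact_solution()
    return {n: abs(euler_solve(n) - exact) for n in step_counts}


if __name__ == "__main__":
    for n, err in sweep_errors().items():
        print(f"steps={n:3d} error={err:.5f}")
```

Each doubling of the step count removes less error than the previous one, while the cost doubles every time.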

### Deciding factors in practice
1. Latency budget (real-time vs batch).
2. Sampler style (Karras vs exponential).
3. Whether distillation is used.
4. The noise floor of ControlNet / IP-Adapter.
## 💻 Patterns

### Pattern 1 — diffusers basic
```python
from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler
import torch

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, algorithm_type="dpmsolver++", use_karras_sigmas=True
)

img = pipe(prompt="cinematic portrait, dramatic light",
           num_inference_steps=25, guidance_scale=6.5).images[0]
```

### Pattern 2 — Turbo / Lightning (low-step)
```python
import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")

img = pipe(prompt="a wizard casting a fireball",
           num_inference_steps=4, guidance_scale=0.0).images[0]
```

### Pattern 3 — FLUX (flow-matching)
```python
import torch
from diffusers import FluxPipeline
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
img = pipe(prompt="forest stream at golden hour",
           num_inference_steps=28, guidance_scale=3.5,
           max_sequence_length=512).images[0]
```

### Pattern 4 — UniPC (fast convergence)
```python
from diffusers import UniPCMultistepScheduler
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
img = pipe(prompt="..", num_inference_steps=12, guidance_scale=6.0).images[0]
```

### Pattern 5 — sweep test
```python
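import time
import torch

# Assumes `pipe` is already loaded (e.g. Pattern 1). The first iteration also
# pays one-off warm-up cost, so treat its timing as an upper bound.
prompt = "a stoic samurai standing in falling snow"
for steps in (8, 12, 16, 20, 25, 30, 40):
    # Re-seed every run so that only the step count changes between images.
    generator = torch.Generator("cuda").manual_seed(42)
    t0 = time.perf_counter()
    out = pipe(prompt=prompt, num_inference_steps=steps,
               guidance_scale=6.5, generator=generator).images[0]
    out.save(f"sweep_{steps:02d}.png")
    print(f"steps={steps} time={time.perf_counter() - t0:.2f}s")
```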

### Pattern 6 — Karras sigmas custom
```python
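import torch

def karras_sigmas(n, sigma_min=0.029, sigma_max=14.6, rho=7.0):
    ramp = torch.linspace(0, 1, n)
    min_inv = sigma_min ** (1/rho); max_inv = sigma_max ** (1/rho)
    return (max_inv + ramp * (min_inv - max_inv)) ** rho  # sigma_max down to sigma_min

# The pipeline re-runs scheduler.set_timesteps() on every call, so pass the
# schedule to the call itself. Assumptions to verify for your diffusers version:
# the pipeline call accepts `sigmas`, the active scheduler supports them
# (e.g. EulerDiscreteScheduler), and it expects the final 0.0 to be included.
sigmas = karras_sigmas(20).tolist() + [0.0]
img = pipe(prompt="..", sigmas=sigmas, guidance_scale=6.5).images[0]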
```
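
For reference, the schedule computed by `karras_sigmas` can be written as a single formula. With $n$ sigmas indexed $i = 0, \dots, n-1$:

$$
\sigma_i = \left( \sigma_{\max}^{1/\rho} + \frac{i}{n-1}\left( \sigma_{\min}^{1/\rho} - \sigma_{\max}^{1/\rho} \right) \right)^{\rho}
$$

so that $\sigma_0 = \sigma_{\max}$ and $\sigma_{n-1} = \sigma_{\min}$. A larger $\rho$ (the code uses 7) places more of the steps at low noise levels, where fine detail is resolved.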

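### Pattern 7 — coordinating ControlNet with the step count
```python
# Assumes `pipe` is a ControlNet pipeline (e.g. StableDiffusionXLControlNetPipeline),
# `p` is the prompt, and `control_img` is the conditioning image.
# Start/end of ControlNet conditioning, as fractions of the total steps.
img = pipe(prompt=p, image=control_img, num_inference_steps=25,
           controlnet_conditioning_scale=0.9,
           control_guidance_start=0.0,   # apply from the first step
           control_guidance_end=0.7      # stop at 70%; the last 30% runs unconstrained
           ).images[0]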
```

### Pattern 8 — early-exit (latent preview)
```python
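# Inspect the mid-process latent; `pipe` and the prompt `p` come from an earlier pattern
def callback(pipe, step, timestep, callback_kwargs):
    if step == 5:
        latents = callback_kwargs["latents"]
        # decode preview, show, etc.
        # To actually stop early, recent diffusers releases let you set
        # pipe._interrupt = True here (check support for your pipeline).
    return callback_kwargs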

pipe(prompt=p, num_inference_steps=20,
     callback_on_step_end=callback,
     callback_on_step_end_tensor_inputs=["latents"])
```

## Decision criteria
| Scenario | Steps | Sampler |
|---|---|---|
| SDXL turbo / lightning | 1–4 | Euler |
| LCM | 4–8 | LCM |
| SDXL general | 20–28 | DPM++ 2M Karras |
| SDXL detail-critical | 28–35 | DPM++ 3M SDE |
| FLUX dev | 25–30 | flow-match Euler |
| FLUX schnell (distilled) | 4 | Euler |
| ControlNet inpaint | 25–30 | DPM++ / UniPC |

**Defaults**: SDXL → 25 steps, DPM++ 2M Karras, CFG 6.5. FLUX dev → 28 steps, CFG 3.5.
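
The table can also live in code as a preset lookup, so that a requested step count is clamped into the recommended range for the scenario. This is a sketch: the scenario keys and the `clamp_steps` helper are hypothetical names, and the ranges are simply the ones from the table.

```python
# Scenario -> recommended step range (inclusive) and sampler, taken from the table
PRESETS = {
    "sdxl-turbo": {"steps": (1, 4), "sampler": "Euler"},
    "lcm": {"steps": (4, 8), "sampler": "LCM"},
    "sdxl": {"steps": (20, 28), "sampler": "DPM++ 2M Karras"},
    "sdxl-detail": {"steps": (28, 35), "sampler": "DPM++ 3M SDE"},
    "flux-dev": {"steps": (25, 30), "sampler": "flow-match Euler"},
    "flux-schnell": {"steps": (4, 4), "sampler": "Euler"},
    "controlnet-inpaint": {"steps": (25, 30), "sampler": "DPM++ / UniPC"},
}


def clamp_steps(scenario, requested_steps):
    """Clamp a requested step count into the scenario's recommended range."""
    if scenario not in PRESETS:
        raise ValueError(f"unknown scenario {scenario!r}")
    low, high = PRESETS[scenario]["steps"]
    return max(low, min(high, requested_steps))


if __name__ == "__main__":
    print(clamp_steps("sdxl", 100))        # far too many steps for SDXL
    print(clamp_steps("sdxl-turbo", 30))   # distilled model, keep it low
```

Clamping rather than rejecting keeps batch jobs running while still avoiding the high-step anti-patterns.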

## 🔗 Graph
- Parent: [[Stable Diffusion]] · [[Diffusion Models]]
- Variant: [[CFG 스케일(Classifier-Free Guidance Scale)]]
- Application: [[AI 이미지 생성 (AI Image Generation)]] · [[사후 편집 (Post-editing)]]
- Adjacent: [[FLUX]] · [[ComfyUI]]

## 🤖 LLM usage
**When to use**: generating sweep configs and drafting sampler comparison tables.
**When not to use**: as the final judge of visual quality; that requires inspecting the image grid.

## ❌ Anti-patterns
- **100 steps**: about 4× the cost of a 25-step run for almost no quality gain.
- **Distilled model + high steps**: running SDXL Turbo at 30 steps over-burns the image.
- **Picking a sampler at random**: ignores the interaction between prompt and sampler.
- **Changing CFG or steps in isolation**: the two are coupled; a high CFG needs more steps.
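
The CFG and step coupling is easier to reason about with the guidance update written out. Below is a minimal sketch of the standard classifier-free guidance combination on plain Python lists; `cfg_combine` is a hypothetical helper, and real pipelines apply the same arithmetic to tensors.

```python
def cfg_combine(noise_uncond, noise_cond, guidance_scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond), elementwise."""
    if len(noise_uncond) != len(noise_cond):
        raise ValueError("predictions must have the same length")
    return [u + guidance_scale * (c - u)
            for u, c in zip(noise_uncond, noise_cond)]


if __name__ == "__main__":
    uncond = [0.0, 1.0]
    cond = [1.0, 3.0]
    for scale in (1.0, 3.5, 6.5):
        print(scale, cfg_combine(uncond, cond, scale))
```

A scale of 1 returns the conditional prediction unchanged, and larger scales extrapolate further past it. Bigger per-step updates are one plausible reason a high CFG tends to want more, smaller steps; treat that as intuition rather than a proven rule.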

## 🧪 Verification / duplicates
- Verified (k-diffusion repo, diffusers schedulers docs, ComfyUI manager).
- Trust level A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — sampler family + step decision matrix |