---
id: wiki-2026-0508-prompt-weight
title: Prompt Weight
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Prompt Weighting, Attention Weighting, Token Emphasis, Prompt Strength]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [prompt-engineering, generative-ai, stable-diffusion, midjourney, image-gen]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: diffusers / ComfyUI / Automatic1111 / Midjourney
---
# Prompt Weight
## One-line summary
> **"Emphasize or de-emphasize specific tokens in a prompt: the `(word:1.3)` syntax in Stable Diffusion, `::` and `--iw` in Midjourney, and attention scaling under the hood."** The prompt-weight syntax introduced by AUTOMATIC1111 (2022) became the community standard. As of 2026, FLUX, SD3.5, SDXL Turbo, and Midjourney v7 all support some form of weighting, but T5-encoded models use a different syntax.
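## Key points
### Syntax (Stable Diffusion / A1111 / ComfyUI)
- `(word)`: weight ×1.1.
- `((word))`: weight ×1.21 (1.1 × 1.1).
- `(word:1.3)`: explicit weight ×1.3.
- `[word]`: weight ÷1.1.
- `(word:0.5)`: explicit weight ×0.5. In A1111, `[word:0.5]` is prompt editing (the word is added after 50% of the steps), not a weight.
- `(red hair:1.4) (blue eyes:0.8)`: phrase-level weights.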
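The implicit forms above compound by a factor of 1.1 per nesting level. A minimal sketch of that arithmetic follows; `implicit_weight` is a hypothetical helper, it covers only the bare `(word)` and `[word]` forms, and it does not handle explicit `(word:1.3)` weights.

```python
def implicit_weight(token):
    """Return the multiplier implied by wrapping parentheses or brackets.

    Each "(...)" layer multiplies by 1.1, each "[...]" layer divides by 1.1.
    Explicit weights such as "(word:1.3)" are out of scope for this sketch.
    """
    weight = 1.0
    s = token.strip()
    while len(s) >= 2:
        if s[0] == "(" and s[-1] == ")":
            weight *= 1.1
        elif s[0] == "[" and s[-1] == "]":
            weight /= 1.1
        else:
            break
        s = s[1:-1]  # peel one layer and look again
    return round(weight, 4)


print(implicit_weight("((word))"))  # two layers of emphasis
print(implicit_weight("[word]"))    # one layer of de-emphasis
```

Once nesting goes past two levels, an explicit `(word:1.3)` weight is usually easier to read than counting parentheses.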
### Syntax (Midjourney v7)
- `cat dog`: equal weight.
- `cat::2 dog::1`: double-colon multi-prompt with weights.
- `--iw 0.5`: image weight (reference image vs text prompt).
- `--s 250`: stylize strength.
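The `::` form can be read as a list of (text, weight) segments. The sketch below shows one way to split such a prompt for inspection; `parse_multiprompt` and `relative_shares` are hypothetical helpers, not Midjourney's parser. Treating the weights as relative shares is an assumption, and trailing parameters such as `--ar` are assumed to have been stripped beforehand.

```python
import re

# "::" optionally followed by a number; a bare "::" means weight 1.
_SEPARATOR = re.compile(r"::(-?\d+(?:\.\d+)?)?")


def parse_multiprompt(prompt):
    """Split "cat::2 dog::1" into [("cat", 2.0), ("dog", 1.0)]."""
    parts, pos = [], 0
    for m in _SEPARATOR.finditer(prompt):
        text = prompt[pos:m.start()].strip()
        weight = float(m.group(1)) if m.group(1) else 1.0
        if text:
            parts.append((text, weight))
        pos = m.end()
    tail = prompt[pos:].strip()
    if tail:  # final segment without an explicit weight
        parts.append((tail, 1.0))
    return parts


def relative_shares(parts):
    """Normalize weights so they sum to 1 (assumes a positive total)."""
    total = sum(w for _, w in parts)
    return [(text, w / total) for text, w in parts]


print(parse_multiprompt("cat::2 dog::1"))
print(relative_shares(parse_multiprompt("cat::2 dog::1")))
```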
### Syntax (FLUX / T5-encoded)
- T5 understands natural language; the `(word:1.3)` syntax is mostly ignored.
- Use **emphasis via wording**: "very prominent X", "subtle hint of Y".
- Some front ends (Forge, ComfyUI) still parse weights via re-prompting.
### Mechanism (under the hood)
- CLIP/T5 text encoder → token embeddings.
- A1111: weight w → multiply token embedding by w (post-encoding rescale).
- Compel library: more sophisticated — interpolates between conditioning vectors.
- Cross-attention scaling: alternative — scale K/V at attention layer.
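The post-encoding rescale described above can be sketched with plain Python lists standing in for the encoder output. `apply_token_weights` is a hypothetical name. The optional `restore_mean` step, which rescales the result so the overall mean matches the unweighted embeddings, reflects how A1111 is generally understood to behave and should be treated as an assumption here.

```python
def apply_token_weights(embeddings, weights, restore_mean=False):
    """Multiply each token's embedding vector by its weight.

    embeddings: list of token vectors (each a list of floats)
    weights:    one float per token
    """
    if len(embeddings) != len(weights):
        raise ValueError("need exactly one weight per token")
    scaled = [[x * w for x in vec] for vec, w in zip(embeddings, weights)]
    if restore_mean:
        before = [x for vec in embeddings for x in vec]
        after = [x for vec in scaled for x in vec]
        mean_before = sum(before) / len(before)
        mean_after = sum(after) / len(after)
        if mean_after != 0:
            # Rescale everything so the global mean is unchanged.
            factor = mean_before / mean_after
            scaled = [[x * factor for x in vec] for vec in scaled]
    return scaled


tokens = [[1.0, 2.0], [3.0, 4.0]]  # two toy token embeddings
print(apply_token_weights(tokens, [1.0, 2.0]))
```

In a real pipeline the same operation would be a single broadcasted tensor multiply on the encoder output.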
### Best practices
- Stay between 0.5 and 1.5; weights above 1.5 tend to cause distortion or oversaturation.
- Negative prompts often more effective than `[word]` syntax.
- Long prompts: weight the critical 3-5 tokens, leave rest at 1.0.
- For T5 models, use natural-language emphasis instead.
## 💻 Patterns
### diffusers + Compel (programmatic weighting)
```python
from diffusers import StableDiffusionXLPipeline
from compel import Compel, ReturnedEmbeddingsType
import torch

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# SDXL has two text encoders; only the second supplies the pooled embedding.
compel = Compel(
    tokenizer=[pipe.tokenizer, pipe.tokenizer_2],
    text_encoder=[pipe.text_encoder, pipe.text_encoder_2],
    returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
    requires_pooled=[False, True],
)

# Compel writes weights as (text)1.4, not A1111's (text:1.4).
prompt = "a (red)1.4 sports car on a (sunny)0.7 beach, cinematic"
conditioning, pooled = compel(prompt)
image = pipe(prompt_embeds=conditioning, pooled_prompt_embeds=pooled).images[0]
```
### A1111-style parsing (manual)
```python
import re

# Simplified: handles explicit (text:1.3) groups only, not nesting or [brackets].
_WEIGHTED = re.compile(r"\(([^():]+):(\d+(?:\.\d+)?)\)")


def parse_weighted(prompt):
    """Return a list of (text, weight) tuples; unweighted text gets 1.0."""
    parts, last = [], 0
    for m in _WEIGHTED.finditer(prompt):
        if m.start() > last:
            # Plain text between two weighted groups.
            parts.append((prompt[last:m.start()], 1.0))
        parts.append((m.group(1), float(m.group(2))))
        last = m.end()
    if last < len(prompt):
        parts.append((prompt[last:], 1.0))
    return parts
```
### Cross-attention scaling (Hugging Face)
```python
# Scale one prompt token's contribution to cross-attention by a constant factor.
from diffusers.models.attention_processor import AttnProcessor


class WeightedAttn(AttnProcessor):
    def __init__(self, token_idx, scale):
        super().__init__()
        self.token_idx, self.scale = token_idx, scale

    def __call__(self, attn, hidden_states, encoder_hidden_states=None,
                 attention_mask=None, **kwargs):
        # Self-attention layers pass no encoder states; leave them untouched.
        if encoder_hidden_states is not None:
            # Clone so the shared prompt embedding is not modified in place.
            encoder_hidden_states = encoder_hidden_states.clone()
            encoder_hidden_states[:, self.token_idx] *= self.scale
        return super().__call__(attn, hidden_states, encoder_hidden_states,
                                attention_mask, **kwargs)
```
### Midjourney prompt
```
masterpiece anime girl::3 cyberpunk city background::1 neon lights::0.5
--ar 16:9 --s 500 --v 7
```
### FLUX-style natural-language emphasis (no syntax)
```python
# Bad (FLUX ignores): "(red hair:1.5) girl"
# Good: "girl with strikingly vivid red hair, the red is the most prominent color in the image"
```
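When an existing prompt library is written in A1111 syntax, the weights can be rewritten as wording before the prompt is sent to a T5-encoded model. The sketch below is one hypothetical mapping: the thresholds and the phrases "very prominent" and "subtle hint of" are illustrative choices, not an official FLUX convention.

```python
import re

_WEIGHTED = re.compile(r"\(([^():]+):(\d+(?:\.\d+)?)\)")


def to_natural_language(prompt, strong=1.2, weak=0.8):
    """Replace (text:weight) groups with plain-language emphasis."""
    def reword(m):
        text, weight = m.group(1).strip(), float(m.group(2))
        if weight >= strong:
            return f"very prominent {text}"
        if weight <= weak:
            return f"subtle hint of {text}"
        return text  # near 1.0: drop the syntax, keep the words
    return _WEIGHTED.sub(reword, prompt)


print(to_natural_language("girl with (red hair:1.5) and (freckles:0.6)"))
```

The output usually still benefits from a manual pass, since a fixed phrase rarely reads as naturally as a sentence written for the image.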
### Prompt-blending (interpolate two prompts)
```python
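# Reuses `pipe` and `compel` from the diffusers + Compel example above.
c1, pooled1 = compel("a cat in a forest")
c2, pooled2 = compel("a robot in a city")
# Both prompts fit in one 77-token chunk, so the tensors share a shape.
mixed = (c1 + c2) / 2  # plain torch tensor arithmetic
# Averaging the pooled embeddings as well is an assumption of this sketch.
pooled = (pooled1 + pooled2) / 2
image = pipe(prompt_embeds=mixed, pooled_prompt_embeds=pooled).images[0]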
```
### Step-conditional weighting (`[from:to:step]`)
```
[cat:dog:0.5] in a field
# 0-50% steps: "cat", 50-100%: "dog"
# Useful for changing subject mid-denoising
```
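The switch can be made concrete with a small resolver that returns the prompt in effect at a given step. This is a sketch of the semantics, not A1111's parser: `prompt_at_step` is a hypothetical helper, it handles only flat `[from:to:when]` groups, and it assumes a `when` below 1 is a fraction of the total steps while larger values are absolute step counts.

```python
import re

_EDIT = re.compile(r"\[([^\[\]:]*):([^\[\]:]*):(\d*\.?\d+)\]")


def prompt_at_step(prompt, step, total_steps):
    """Resolve every [from:to:when] group for a zero-based step index."""
    def pick(m):
        before, after, when = m.group(1), m.group(2), float(m.group(3))
        switch_step = when * total_steps if when < 1 else when
        return before if step < switch_step else after
    return _EDIT.sub(pick, prompt)


for step in (0, 9, 10, 19):
    print(step, prompt_at_step("[cat:dog:0.5] in a field", step, 20))
```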
## Decision criteria
| Situation | Approach |
|---|---|
| SDXL / SD1.5 / SD2.1 | A1111 `(word:1.3)` syntax, or Compel's `(word)1.3` in diffusers |
| FLUX / SD3.5 (T5) | Natural-language emphasis |
| Midjourney v7 | `::weight` syntax |
| Subject + style mix | Multi-prompt with `::` or compel blends |
| Subtle adjustment | 0.8-1.2 range |
| Strong push | 1.3-1.5; rarely above |
| Suppress concept | Negative prompt (preferred) over `[word]` |
**Default**: Compel for programmatic SDXL work; A1111 syntax for casual use; natural language for FLUX.
## 🔗 Graph
- Parent: [[Prompt_Engineering|Prompt-Engineering]] · [[AI 이미지 생성 (AI Image Generation)|Image-Generation]]
- Variants: [[Negative Prompt]]
- Applications: [[Stable-Diffusion]] · [[FLUX]] · [[Midjourney]] · [[ComfyUI]]
- Adjacent: [[CLIP]] · [[Diffusion-Models]] · [[ControlNet]]
## 🤖 LLM usage
**When to use**: image generation pipelines, fine-grained subject/style control, automated prompt synthesis.
**When not to use**: text-only LLM prompts (GPT/Claude don't use this syntax; use emphasis words instead), T5-only models.
## ❌ Anti-patterns
- **Weight > 2.0**: saturated artifacts, deformed output.
- **Stacking parens** `(((((word)))))`: hard to read; use explicit `(word:1.6)`.
- **A1111 syntax on FLUX/T5**: silently ignored — switch to natural language.
- **Weighting every token**: dilutes effect; pick 2-4 priorities.
- **Forgetting negative prompt**: often the right tool for "not X".
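Several of these anti-patterns can be caught mechanically before a prompt is submitted. The sketch below is a hypothetical linter: `lint_prompt`, its default thresholds, and its messages are illustrative, and it only inspects explicit `(text:weight)` groups and runs of three or more opening parentheses.

```python
import re

_EXPLICIT = re.compile(r"\(([^():]+):(\d+(?:\.\d+)?)\)")
_STACKED = re.compile(r"\({3,}")


def lint_prompt(prompt, max_weight=2.0, max_weighted=4):
    """Return a list of warning strings; an empty list means no findings."""
    warnings = []
    weights = [float(w) for _, w in _EXPLICIT.findall(prompt)]
    if any(w > max_weight for w in weights):
        warnings.append("weight above limit")
    if _STACKED.search(prompt):
        warnings.append("stacked parentheses")
    if len(weights) > max_weighted:
        warnings.append("too many weighted tokens")
    return warnings


print(lint_prompt("(((cat))) (dog:2.5)"))
print(lint_prompt("a (red:1.3) car"))
```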
## 🧪 Verification / duplicates
- Verified (AUTOMATIC1111 wiki, Compel docs, Midjourney v7 docs 2024-2025, FLUX official guidance).
- Trust level: A.
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — canonical prompt-weight ref + FLUX/T5 caveat |