---
id: wiki-2026-0508-모델-매개변수-제어-model-parameter-contr
title: Model Parameter Control
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Model Parameter Control, Inference Parameters, Sampling Parameters]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [parameters, sampling, inference, llm, image-gen]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: openai-anthropic-vllm-comfyui
---

# Model Parameter Control

## In One Line
> **"Parameters are the dials of model behavior: temperature, top_p, top_k, and seed determine the character of each generation."** As of 2026, production LLM endpoints (Anthropic, OpenAI, vLLM, Ollama) expose sampling knobs, though the exact set varies by provider, and image generators expose cfg, steps, and scheduler. Together these are the core of quantitative control over output.

## Core

### LLM sampling parameters
- **temperature**: logit scaling. 0 = greedy, 1 = raw distribution, >1 = flatter distribution. The allowed range is provider-specific (for example [0, 2] on OpenAI and [0, 1] on Anthropic). Use 0 for deterministic tasks and 0.7~1.0 for creative ones.
- **top_p** (nucleus) [0, 1]: cumulative probability mass. 0.9 samples only from the smallest set of top tokens whose probabilities sum to 90%.
- **top_k**: keeps only the K highest logits. In vLLM, -1 disables it.
- **min_p** [0, 1]: threshold relative to the top token's probability. A modern alternative to top_p.
- **frequency_penalty** [-2, 2] / **presence_penalty** [-2, 2]: repetition control. The first scales with how often a token has appeared; the second applies once a token has appeared at all.
- **seed**: reproducibility. The same seed with temperature=0 is deterministic in most cases.
- **stop**: stop strings (`stop_sequences` on Anthropic). Controls turn boundaries in an agent loop.
- **max_tokens** / **max_completion_tokens**: output budget.

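
The filters in the list above can be sketched in pure Python. This is a minimal illustration, assuming the common order temperature → top_k → top_p → min_p; real inference engines differ in ordering and edge cases. The names `softmax`, `filter_probs`, and `sample_token` are hypothetical helpers, not any library's API.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract peak for stability
    total = sum(exps)
    return [e / total for e in exps]


def filter_probs(probs, top_k=0, top_p=1.0, min_p=0.0):
    """Zero out tokens rejected by top_k, top_p, or min_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order)
    if top_k > 0:  # keep only the K most likely tokens
        keep &= set(order[:top_k])
    if top_p < 1.0:  # keep the smallest prefix whose mass reaches top_p
        nucleus, mass = [], 0.0
        for i in order:
            nucleus.append(i)
            mass += probs[i]
            if mass >= top_p:
                break
        keep &= set(nucleus)
    if min_p > 0.0:  # keep tokens at least min_p times as likely as the top one
        threshold = min_p * probs[order[0]]
        keep &= {i for i in order if probs[i] >= threshold}
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


def sample_token(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0, rng=None):
    """Pick a token index; temperature=0 falls back to greedy argmax."""
    if temperature == 0.0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = filter_probs(softmax(logits, temperature), top_k, top_p, min_p)
    rng = rng or random.Random()
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Passing `rng=random.Random(42)` makes a draw repeatable within this sketch, which mirrors what a `seed` parameter aims for on a real endpoint.
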
### Image generation parameters (FLUX, SD3.5, Midjourney)
- **cfg / guidance_scale**: prompt adherence vs. creativity. FLUX 3.5~5.0, SD 5~9.
- **steps**: denoising steps. FLUX-dev 28, FLUX-schnell 4, SD3.5 28~40.
- **scheduler / sampler**: euler, dpmpp_2m, etc. A quality/speed tradeoff.
- **seed**: reproducible composition.
- **denoising_strength** (img2img): 0 = identical to the source, 1 = ignore the source.

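
Two of the knobs above reduce to simple arithmetic. The sketch below assumes the standard classifier-free guidance formula and a diffusers-style img2img step count; some distilled models consume the guidance value as a conditioning input instead of running two passes, so treat this as illustrative. `apply_cfg` and `img2img_steps` are hypothetical helper names.

```python
def apply_cfg(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: move the noise prediction from the
    unconditional one toward (and past) the prompt-conditioned one."""
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def img2img_steps(num_inference_steps, strength):
    """Denoising steps actually run for a given strength in [0, 1]."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(int(num_inference_steps * strength), num_inference_steps)
```

With a scale of 1 the result equals the conditioned prediction; larger scales extrapolate beyond it, which is why high cfg values tighten prompt adherence at the cost of variety.
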
### Applications
1. RAG answer extraction → temperature=0, top_p=1.
2. Brainstorming → temperature=0.9, presence_penalty=0.6.
3. Code completion → temperature=0.2, stop=["\n\n"].
4. Image variation → fix the seed and lower cfg.

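
The `stop=["\n\n"]` setting in the code-completion case can be pictured as a truncation applied to the generated text. A minimal sketch, assuming the stop string itself is excluded from the output (servers usually halt generation at that point rather than trimming afterwards); `truncate_at_stop` is a hypothetical helper.

```python
def truncate_at_stop(text, stop):
    """Cut text at the earliest occurrence of any stop string."""
    cut = len(text)
    for s in stop:
        if not s:
            continue  # ignore empty stop strings
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```

For a completion such as a function body followed by unrelated code, a blank-line stop keeps only the first block.
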
## 💻 Patterns

### Anthropic Claude: deterministic extraction
```python
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
doc_text = "..."  # placeholder: the document to extract from

resp = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    temperature=0.0,  # near-deterministic; Anthropic accepts 0.0 to 1.0
    # top_p is left at its default: Anthropic advises adjusting
    # temperature or top_p, not both.
    system="Extract structured data. Output JSON only.",
    messages=[{"role": "user", "content": doc_text}],
)
print(resp.content[0].text)
```

### OpenAI Chat Completions: creative writing knobs
```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    # Sampling knobs need a model that accepts them. Reasoning models such as
    # gpt-5 may reject non-default temperature, top_p, or penalty values.
    model="gpt-4.1",
    temperature=0.9,
    top_p=0.95,
    presence_penalty=0.6,
    frequency_penalty=0.3,
    max_completion_tokens=2000,
    seed=42,  # best-effort reproducibility
    messages=[{"role": "user", "content": "Write a noir opening."}],
)
print(resp.choices[0].message.content)
```

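
The `presence_penalty` and `frequency_penalty` values in the request above act on logits before sampling. The sketch below follows the additive form described in OpenAI's API documentation: subtract the token's count times `frequency_penalty`, plus a flat `presence_penalty` once the token has appeared at all. `apply_penalties` is a hypothetical helper, and token IDs are assumed to be list indices.

```python
from collections import Counter


def apply_penalties(logits, generated_ids, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the logits of tokens that were already generated."""
    counts = Counter(generated_ids)
    out = list(logits)
    for tok, count in counts.items():
        # frequency term grows with each repeat; presence term is flat
        out[tok] -= count * frequency_penalty + presence_penalty
    return out
```

Because the frequency term grows without bound as a token repeats, large values can push common words below rare ones, which is the mechanism behind the degradation seen at high settings.
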
### vLLM: full sampling control (self-hosted Llama 3.3)
```python
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.3-70B-Instruct", tensor_parallel_size=4)

params = SamplingParams(
    temperature=0.7,
    top_p=0.9,
    top_k=50,
    min_p=0.05,  # modern alternative to top_p
    repetition_penalty=1.1,
    max_tokens=512,
    stop=["</answer>"],
    seed=2026,
    logprobs=5,  # return top-5 logprobs per token for debugging
)
outputs = llm.generate(["Explain mixture-of-experts."], params)
for out in outputs:
    print(out.outputs[0].text)
```

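
Unlike the additive OpenAI-style penalties, the `repetition_penalty` used above is multiplicative. A minimal sketch, assuming the CTRL-style rule found in Hugging Face transformers (divide positive logits, multiply negative ones); exact behavior in a given engine, such as whether prompt tokens count as seen, is an assumption here. `apply_repetition_penalty` is a hypothetical helper.

```python
def apply_repetition_penalty(logits, seen_ids, penalty=1.1):
    """Make every already-seen token less likely; penalty=1.0 is a no-op."""
    out = list(logits)
    for tok in set(seen_ids):
        if out[tok] > 0:
            out[tok] = out[tok] / penalty  # shrink positive logits
        else:
            out[tok] = out[tok] * penalty  # push negative logits further down
    return out
```

The penalty does not grow with the number of repeats, so values slightly above 1.0 (1.05~1.2) are the usual range.
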
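### MLX (Apple Silicon): local inference with seed
```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler
import mlx.core as mx

model, tok = load("mlx-community/Llama-3.3-70B-Instruct-4bit")
mx.random.seed(42)  # seeds MLX's global RNG, which the sampler draws from

# Recent mlx_lm releases take sampling settings through a sampler object
# rather than temp/top_p keyword arguments on generate().
sampler = make_sampler(temp=0.3, top_p=0.9)
text = generate(
    model, tok,
    prompt="Summarize:",
    max_tokens=256,
    sampler=sampler,
    verbose=False,
)
```
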
### FLUX.1-dev via diffusers: image gen knobs
```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
).to("cuda")

img = pipe(
    prompt="cinematic neo-tokyo alley, neon, rain",
    guidance_scale=3.5,  # FLUX prefers low CFG
    num_inference_steps=28,
    generator=torch.Generator("cuda").manual_seed(42),  # fixed seed for a reproducible composition
    width=1024, height=1024,
).images[0]
```

### ComfyUI API: programmatic SD3.5 with full control
```python
import requests

workflow = {
    "sampler": {
        "class_type": "KSampler",
        "inputs": {
            "seed": 42, "steps": 30, "cfg": 7.0,
            "sampler_name": "dpmpp_2m", "scheduler": "karras",
            "denoise": 1.0,
            # Each [node_id, output_index] pair wires in another node's output.
            "model": ["loader", 0],
            "positive": ["pos_clip", 0],
            "negative": ["neg_clip", 0],
            "latent_image": ["empty_latent", 0],
        },
    },
    # ... rest of graph
}
r = requests.post("http://localhost:8188/prompt", json={"prompt": workflow})
r.raise_for_status()  # fail loudly if the server rejects the graph
```

### Sweep parameters with Optuna for prompt+param tuning
```python
import optuna
from anthropic import Anthropic

client = Anthropic()


def load_eval():
    """Placeholder eval set; replace with your own list[(prompt, expected)]."""
    return [("What is 2 + 2? Reply with the number only.", "4")]


def grade(out, expected):
    """Placeholder grader: 1.0 if the expected string appears in the output."""
    return float(expected in out)


EVAL_SET = load_eval()


def objective(trial):
    # Anthropic accepts temperature in [0, 1], so the sweep stops at 1.0.
    temp = trial.suggest_float("temperature", 0.0, 1.0)
    tp = trial.suggest_float("top_p", 0.5, 1.0)
    score = 0.0
    for q, exp in EVAL_SET:
        out = client.messages.create(
            model="claude-opus-4-7",
            # Some Claude models may reject requests that set both temperature
            # and top_p; if so, sweep one parameter per study.
            max_tokens=512, temperature=temp, top_p=tp,
            messages=[{"role": "user", "content": q}],
        ).content[0].text
        score += grade(out, exp)
    return score / len(EVAL_SET)


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=40)
print(study.best_params)
```

## Decision criteria
| Task | temperature | top_p | Other |
|---|---|---|---|
| Extraction / classification | 0.0 | 1.0 | fixed seed |
| Code completion | 0.2 | 0.95 | stop tokens |
| Summarization | 0.3 | 0.9 | - |
| Q&A (RAG) | 0.0~0.3 | 1.0 | - |
| Brainstorming | 0.8~1.0 | 0.95 | presence_penalty 0.6 |
| Creative fiction | 0.9~1.1 | 0.95 | frequency_penalty 0.3 |

| Image model | cfg | steps | Other |
|---|---|---|---|
| FLUX | 3.5 | 28 | bf16 |
| SD3.5 | 7.0 | 30 | dpmpp_2m karras |

**Defaults**: temperature=0.7, top_p=0.9, seed=42 (debugging), max_tokens=task-budgeted.

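
The LLM rows of the table can be carried in code as a small preset lookup. This is a sketch, assuming the midpoint of each range where the table gives one (0.9 for brainstorming, 1.0 for creative fiction); `PRESETS`, `DEFAULT`, and `params_for` are hypothetical names.

```python
PRESETS = {
    "extraction": {"temperature": 0.0, "top_p": 1.0},
    "code_completion": {"temperature": 0.2, "top_p": 0.95},
    "summarization": {"temperature": 0.3, "top_p": 0.9},
    "brainstorming": {"temperature": 0.9, "top_p": 0.95, "presence_penalty": 0.6},
    "creative_fiction": {"temperature": 1.0, "top_p": 0.95, "frequency_penalty": 0.3},
}

# Fallback for tasks without a preset, taken from the defaults line.
DEFAULT = {"temperature": 0.7, "top_p": 0.9}


def params_for(task, **overrides):
    """Return a fresh parameter dict for a task, with optional overrides."""
    params = dict(PRESETS.get(task, DEFAULT))  # copy so presets stay untouched
    params.update(overrides)
    return params
```

A call such as `params_for("summarization", max_tokens=256)` yields a dict that can be unpacked into a client call, provided the target endpoint accepts every key in it.
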
## 🔗 Graph
- Parent: [[Parameter]]
- Variants: [[Sampling_Strategies]]
- Applications: [[Iterative Prompting]] · [[Midjourney]] · [[RAG]]
- Adjacent: [[Prompt_Engineering]]

## 🤖 LLM Usage
**When**: deterministic results are needed (RAG, extraction): temp=0. Creative output: temp 0.7+. Reproducing a bug: fix the seed.
**When not**: not every model guarantees strict determinism from a seed (especially on multi-GPU setups). Do not depend on the seed in production.

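
Since seed determinism is not guaranteed, it can be measured before it is relied on. A minimal sketch: call the same generation function several times and compare outputs. `is_reproducible` is a hypothetical helper, and `generate` stands for any callable that wraps a client call with fixed parameters and returns text.

```python
def is_reproducible(generate, prompt, runs=3):
    """Return True if repeated calls with the same prompt give identical text."""
    outputs = {generate(prompt) for _ in range(runs)}
    return len(outputs) == 1
```

A passing check on one machine says little about another serving setup, so it is best run against the deployment that will actually be used.
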
## ❌ Anti-patterns
- **temperature=0 + top_p<1**: redundant (greedy decoding already picks the top-1 token).
- **temperature 1.5+ in production**: hallucination and incoherence spike.
- **Fixed seed alone + temperature 0.7**: non-deterministic under batched inference.
- **max_tokens=4096 as a default**: cost blowup. Budget per task.
- **frequency_penalty 1.5+**: vocabulary collapse (common tokens get suppressed and the text degrades).

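
The anti-patterns above are mechanical enough to check before a request is sent. The sketch below is one way to do it, using the thresholds from the list; `lint_params` is a hypothetical helper, and the warning texts are illustrative.

```python
def lint_params(params):
    """Return a list of warnings for sampling settings that match an anti-pattern."""
    warnings = []
    temp = params.get("temperature", 1.0)
    if temp == 0 and params.get("top_p", 1.0) < 1.0:
        warnings.append("top_p < 1 is redundant when temperature is 0")
    if temp >= 1.5:
        warnings.append("temperature >= 1.5 risks incoherent output")
    if "seed" in params and temp > 0:
        warnings.append("a seed with temperature > 0 may still vary under batching")
    if params.get("max_tokens", 0) >= 4096:
        warnings.append("max_tokens >= 4096: confirm the task needs this budget")
    if params.get("frequency_penalty", 0.0) >= 1.5:
        warnings.append("frequency_penalty >= 1.5 can degrade vocabulary")
    return warnings
```

Running it in CI against stored request configs is a cheap way to catch a careless default before it reaches production.
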
## 🧪 Verification / Duplicates
- Verified (Anthropic Messages API, OpenAI Chat Completions, vLLM SamplingParams, diffusers FluxPipeline, Stability SD3.5 docs, ComfyUI API).
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: LLM/image sampling params + 7 working patterns |