feat: v2.2.63: minimize Korean typos (chat temperature setting + anti-glitch sampling)

- streamer.ts: add topP/topK/minP/repeatPenalty to the LM Studio SDK call; cutting low-probability wrong tokens suppresses Hangul syllable corruption (붕괴→붕점)
- Default chat temperature 0.7 → 0.3 (stabilizes analytical/work-style answers)
- New setting g1nation.chatTemperature, adjustable in the 'Advanced' section of the Settings panel (config.ts / settingsPanelProvider / settings-panel.html+js)

Includes chronicle records (ADR-0022, ADR-0023).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-22 19:09:01 +09:00
parent b0530db6f4
commit 49f941386f
20 changed files with 170 additions and 60 deletions
@@ -75,6 +75,15 @@ export class LMStudioStreamer implements IChatStreamer {
            const prediction = (model as any).respond(req.messages, {
                temperature: req.temperature,
                maxTokens: req.maxTokens ?? 4096,
+                // Glitch suppression: a small / quantized model samples wrong
+                // neighbour tokens (Korean syllable corruption like 붕괴→붕점,
+                // 핵심→핵점) when the distribution is left wide. A tight nucleus
+                // + top-k and a min-p floor cut the low-probability tail;
+                // repeatPenalty curbs stutter (것입니다서입니다).
+                topPSampling: 0.9,
+                topKSampling: 20,
+                minPSampling: 0.05,
+                repeatPenalty: 1.1,
                // Safety net: if our own token budgeting still underestimated and the prompt
                // exceeds the model's context window, decide whether the SDK should fail
                // loudly (stopAtLimit — default) or silently drop content.
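
Reviewer note: the comment in the hunk above describes how a tight nucleus, top-k, and a min-p floor cut the low-probability tail. The sketch below illustrates that idea on a toy distribution. It is not the LM Studio SDK or its inference engine: `filterCandidates`, `TokenProb`, `FilterOptions`, the filter order (top-k, then top-p, then min-p), and the sample probabilities are all assumptions made for illustration.

```typescript
// Hypothetical illustration only: the real sampler lives inside the inference
// engine and may apply these filters in a different order.
interface TokenProb {
    token: string;
    p: number; // probability of this token as the next token
}

interface FilterOptions {
    topK: number; // keep at most this many of the most probable tokens
    topP: number; // keep the smallest prefix whose cumulative mass reaches this
    minP: number; // drop tokens below minP * (probability of the top token)
}

function filterCandidates(candidates: TokenProb[], opts: FilterOptions): TokenProb[] {
    if (candidates.length === 0) return [];

    // Sort a copy from most to least probable.
    const sorted = [...candidates].sort((a, b) => b.p - a.p);

    // top-k: keep only the k most probable tokens (at least one).
    const topK = sorted.slice(0, Math.max(1, opts.topK));

    // top-p (nucleus): walk down the list until the cumulative mass reaches topP.
    const total = topK.reduce((sum, c) => sum + c.p, 0);
    const nucleus: TokenProb[] = [];
    let cumulative = 0;
    for (const c of topK) {
        nucleus.push(c);
        cumulative += c.p / total;
        if (cumulative >= opts.topP) break;
    }

    // min-p: drop anything far less likely than the best remaining token.
    const floor = opts.minP * nucleus[0].p;
    const kept = nucleus.filter((c) => c.p >= floor);

    // Renormalize so the surviving probabilities sum to 1.
    const keptTotal = kept.reduce((sum, c) => sum + c.p, 0);
    return kept.map((c) => ({ token: c.token, p: c.p / keptTotal }));
}

// Made-up next-syllable distribution after "붕": "괴" is the intended
// continuation, "점" is a low-probability glitch token.
const sample: TokenProb[] = [
    { token: "괴", p: 0.8 },
    { token: "락", p: 0.12 },
    { token: "소", p: 0.05 },
    { token: "점", p: 0.03 },
];

// Same values as the diff: topP 0.9, topK 20, minP 0.05.
console.log(filterCandidates(sample, { topK: 20, topP: 0.9, minP: 0.05 }));
```

With these made-up numbers the nucleus cutoff removes the two least likely tokens before sampling, so the glitch token can no longer be drawn regardless of temperature. `repeatPenalty` is not modelled here, since it depends on previously generated tokens rather than on the shape of a single distribution.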