chore: v2.2.73 - ASTRA-DEBUG log levels + webview CSP font-src fix

- Demote ASTRA-DEBUG normal-flow logs from console.error to logInfo/console.log (chatHandlers, extension, slashRouter): removes the false positives that DevTools showed as ERR
- Add an explicit CSP meta tag to the sidebar webview and allow data: in font-src (sidebar.html, sidebarProvider._getHtml): the VS Code outer iframe injects codicon.ttf as data:font/ttf, which the default CSP blocked, logging a violation warning on every prompt; this resolves it
- Includes accumulated updates to LM Studio / agent / context manager / tests

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 15:52:19 +09:00
parent 36db170844
commit 0712014fcb
43 changed files with 2417 additions and 977 deletions
@@ -2,7 +2,7 @@
  "name": "astra",
  "displayName": "Astra",
  "description": "The personal intelligence layer for Antigravity and VS Code. A private cognitive partner for deep project context, memory, and proactive strategic decision-making.",
-  "version": "2.2.64",
+  "version": "2.2.73",
  "publisher": "g1nation",
  "license": "MIT",
  "icon": "assets/icon.png",
@@ -342,6 +342,74 @@
          "default": true,
          "description": "Automatically load LM Studio models into memory when selected from the Astra sidebar."
        },
+        "g1nation.lmStudio.sampling.topP": {
+          "type": "number",
+          "default": 0.9,
+          "minimum": 0,
+          "maximum": 1,
+          "description": "Nucleus sampling cutoff. Small / quantized models often spew wrong-neighbour tokens (garbled Hangul, e.g. 붕괴→붕점) when the tail is wide. Lower (0.8–0.9) tightens; 1.0 disables. Applied to both SDK and REST paths."
+        },
+        "g1nation.lmStudio.sampling.topK": {
+          "type": "number",
+          "default": 20,
+          "minimum": 0,
+          "description": "Top-K sampling cutoff. 0 disables. Default 20 — tighter for small models, raise to 40–80 for large models that already sample well."
+        },
+        "g1nation.lmStudio.sampling.minP": {
+          "type": "number",
+          "default": 0.05,
+          "minimum": 0,
+          "maximum": 1,
+          "description": "Min-P floor — discards tokens with probability below this fraction of the top token. Good defence against rare-token glitches. 0 disables."
+        },
+        "g1nation.lmStudio.sampling.repeatPenalty": {
+          "type": "number",
+          "default": 1.1,
+          "minimum": 1,
+          "maximum": 2,
+          "description": "Repeat / frequency penalty to curb stutter (것입니다서입니다…). 1.0 disables. Values 1.05–1.2 are typical."
+        },
+        "g1nation.lmStudio.statsInBudget": {
+          "type": "boolean",
+          "default": true,
+          "description": "Show token/s and time-to-first-token from LM Studio prediction stats in the context-budget badge after each turn (SDK path only)."
+        },
+        "g1nation.lmStudio.draftModel": {
+          "type": "string",
+          "default": "",
+          "description": "[Speculative decoding] LM Studio model key of a small draft model (e.g. 'gemma-2b-it') used to accelerate the main model. Empty disables. 1.5–3x throughput on large models. The draft must be downloaded in LM Studio (load is automatic on first use)."
+        },
+        "g1nation.lmStudio.load.flashAttention": {
+          "type": "boolean",
+          "default": true,
+          "description": "[Load option] Enable Flash Attention when loading models. Faster generation + lower memory on compatible hardware, especially helpful for long contexts. Default: true."
+        },
+        "g1nation.lmStudio.load.gpuOffloadRatio": {
+          "type": "string",
+          "default": "max",
+          "description": "[Load option] How much of the model to offload to GPU. 'max' = all (default), 'off' = CPU only, or a number 0–1 (e.g. '0.5' = half). Numeric strings are parsed."
+        },
+        "g1nation.lmStudio.load.offloadKVCacheToGpu": {
+          "type": "boolean",
+          "default": true,
+          "description": "[Load option] Keep KV cache on GPU memory. Faster but requires VRAM headroom. Default: true."
+        },
+        "g1nation.lmStudio.load.keepModelInMemory": {
+          "type": "boolean",
+          "default": true,
+          "description": "[Load option] Prevent the model from being swapped out of system memory. Improves interactive responsiveness; raises RAM use. Default: true."
+        },
+        "g1nation.lmStudio.load.useFp16ForKVCache": {
+          "type": "boolean",
+          "default": false,
+          "description": "[Load option] Store KV cache in FP16 (halves cache memory). Tiny quality impact for most models — try if you run out of VRAM at long contexts. Default: false."
+        },
+        "g1nation.lmStudio.load.evalBatchSize": {
+          "type": "number",
+          "default": 0,
+          "minimum": 0,
+          "description": "[Load option] Token batch size during evaluation. 0 = engine default. Higher (512–1024) improves prefill speed on GPU at the cost of memory."
+        },
        "g1nation.localBrainPath": {
          "type": "string",
          "default": "",
@@ -484,6 +552,40 @@
          "default": true,
          "description": "Persist substantive Reflector critiques to the active brain as lesson cards under `lessons/auto-reflector/`. Future missions automatically retrieve these cards (via the existing Experience-Memory pipeline) and inject them as ‘[⚠ ACTIVE LESSONS — verify these BEFORE finalizing]’ guardrails into Planner/Researcher/Writer context. A repeated critique (similar title) bumps `occurrences` and escalates `severity` (low→medium→high) instead of duplicating the card, so recurring patterns get louder over time. Disable to keep critiques single-mission only."
        },
+        "g1nation.workflow.synthesizerEnabled": {
+          "type": "boolean",
+          "default": true,
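+          "markdownDescription": "Whether to run one more **Synthesizer** (final polish) pass as the last stage of the 5-stage pipeline. true (default): the Synthesizer takes the first draft produced by the Drafter and tidies the one-line intro, section flow, and conclusion into the final user-facing answer. Because its only input is the small draft, the context stays light and even small local models (≤4B) handle it comfortably. false: the Drafter output becomes the final answer as-is (the previous 4-stage behaviour)."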
+        },
+        "g1nation.workflow.multiAgentMode": {
+          "type": "string",
+          "enum": ["auto", "always", "off"],
+          "default": "auto",
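+          "markdownDescription": "Activation mode for the Multi-Agent (5-stage) pipeline.\n\n- `auto` (default): activates automatically when a small model (≤4B) is detected, the prompt is large (30%+ of the context), an explicit keyword is present (report (보고서) / review (리뷰) / in-depth analysis (심층 분석)…), or the user has turned on `multiAgentEnabled`. Short greetings and small talk are excluded.\n- `always`: uses the 5-stage pipeline for every request except greetings and small talk. If a small model's answers do not finish in one pass, this mode is the more stable choice.\n- `off`: uses only the existing keyword/length heuristics plus the manual `multiAgentEnabled` toggle (legacy behaviour)."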
+        },
+        "g1nation.workflow.autoCtxFractionThreshold": {
+          "type": "number",
+          "default": 0.30,
+          "minimum": 0.05,
+          "maximum": 0.95,
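+          "markdownDescription": "When `workflow.multiAgentMode = auto`, forces the 5-stage pipeline once the prompt tokens exceed this fraction (0–1) of the effective context window. Default 0.30: once a small model spends 30% or more of its context on input, trying to answer in one pass tends to end in an early EOS or truncation."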
+        },
+        "g1nation.liveStreamTokens": {
+          "type": "boolean",
+          "default": true,
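+          "markdownDescription": "Whether to stream model tokens into the chat bubble as soon as they arrive.\n\n- `true` (default): tokens are shown as they arrive, improving perceived TTFT. When generation finishes, `streamReplace` swaps in the sanitized final answer in one step, so control tokens can be visible only briefly.\n- `false`: tokens are accumulated internally only, and **only the final answer** is shown, all at once, after sanitizing (removing `<|channel|>thought` / `<think>` / `Thinking Process:` and similar). This fully prevents the model's control tokens from leaking onto the screen, even momentarily."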
+        },
+        "g1nation.outputFormat": {
+          "type": "string",
+          "enum": ["plain", "markdown"],
+          "default": "plain",
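+          "markdownDescription": "How the final answer is displayed.\n\n- `plain` (default): post-processing strips all markdown markers the model emits inadvertently (`##`, `**`, `__`, `> `, `* `, etc.). Section label text (e.g. `핵심 요약`) is kept but the header markers are removed, so the output reads as clean plain text. This blocks the problem of small local models leaking markers such as `## 다음 한 수` out of trained habit.\n- `markdown`: legacy behaviour. The model output is passed to the renderer as-is."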
+        },
+        "g1nation.chronicleAutoRecord": {
+          "type": "boolean",
+          "default": true,
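+          "markdownDescription": "Auto-record (Project Chronicle Auto-Record).\n\n- `true` (default): after every chat turn, meaningful conversations (type detected automatically: planning / decision / bug / development / discussion) are saved automatically to the active project's Chronicle folder.\n- `false`: auto-save is off. Manual recording (the record item under 도구 ▾, `/wiki`, etc.) remains available.\n\nThe `자동 기록` (auto-record) toggle in the sidebar **도구 ▾** (Tools) menu switches this immediately, with no need to open the settings panel."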
+        },
        "g1nation.company.intentClassifierModel": {
          "type": "string",
          "default": "",