feat(growth): Correction Loop, a pipeline in which a single correction grows the system in three places (v2.2.223)

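Self-evolving upgrade: a user correction is itself the Ground Truth. Nobody has to author an answer key by hand, and tag statistics no longer stop at a report; they change behavior on the next turn.

① Correction detection and tagging (correctionLoop.ts + agent.ts hook, fire-and-forget):
- Detects correction utterances such as "아니야" (no), "틀렸어" (that's wrong), or "~가 아니라" (not X but ...). Detection is conservative: the filler "아니" is excluded.
- LLM error classification (사실오류 factual error / 근거누락 missing evidence / 맥락누락 missing context / 추론오류 reasoning error / 지시불이행 instruction not followed / 형식오류 format error; heuristic fallback on failure) → saved under lessons/ as a lesson with error-tag frontmatter
- At the same time, a regression case is appended to .astra/eval/corrections.jsonl as {question, wrong answer, correction}

② Weekly growth cycle extension (step 1.5):
- Correction regression test: re-runs each corrected question with brain retrieval context → LLM-judge decides "same mistake repeated?" → growth/regression-report.md (at most 8 cases per cycle)
- Weakness profile: tag statistics for the last 60 days → growth/weakness-profile.json

③ Turning gaps into actions (memoryContext):
- GROUNDING weak while an agent scope is applied → one re-search over the whole brain (rescues the case where the scope hides the right document; adopted only when the grounding is stronger)
- Still weak → the knowledge gap is auto-registered in the learning queue as proposed (duplicates blocked by question hash, capped at 20 to prevent flooding, approval stays with a human, so Permission Based Learning is preserved)
- Weakness profile → injected as a [자기검토] (self-review) block (only tags seen at least twice): "You have recently received N corrections of type X: <per-type self-review instruction>"

Adds 25 tests (detection patterns, profile aggregation, queue registration, persistence, fallback classification).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>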
2026-06-11 19:28:46 +09:00
parent 67927b1d4e
commit 72faa07480
7 changed files with 650 additions and 10 deletions
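The conservative detection described in step ① of the commit message can be tried in isolation. The sketch below restates the two patterns from correctionLoop.ts in a standalone function; the name `detectCorrection` and the sample utterances are illustrative assumptions, not part of the commit.

```typescript
// Patterns restated from correctionLoop.ts in this commit.
const HEAD_RE = /^\s*(아니야|아니지|아닌데|틀렸|그게\s*아니|잘못\s*(알|됐|했)|땡|노노)/;
const ANY_RE = /(틀렸(어|네|잖|다)|사실(이|과)\s*(아니|다르|달라|다름)|정정(해|할게|하자)|잘못\s*된\s*정보|가\s*아니라\s*|이\s*아니라\s*|착각(했|하)|헛소리|지어내)/;

// Hypothetical standalone name; mirrors looksLikeCorrection.
function detectCorrection(prompt: string): boolean {
    const t = (prompt || '').trim();
    // Very short or very long inputs are not treated as clear corrections.
    if (t.length < 4 || t.length > 1500) return false;
    return HEAD_RE.test(t) || ANY_RE.test(t);
}

const samples: string[] = [
    '아니야, 그거 6월이야',      // "No, that's in June": leading correction marker
    '그건 5월이 아니라 6월이야',  // "not May but June": the "X이 아니라" pattern
    '아니 그거 말고 이거 해줘',   // filler "아니" followed by a new command
    '틀렸',                      // shorter than 4 characters
];
for (const s of samples) console.log(detectCorrection(s), s);
```

The first two samples are detected; the last two are not, which is the intended trade-off: a missed correction costs little, while a false positive becomes a noisy lesson file.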
@@ -20,6 +20,7 @@ import { SessionManager } from './core/session';
 import { AgentWorkflowManager } from './agents/AgentWorkflowManager';
 import { buildAstraModeArchitectureContext } from './lib/contextBuilders/astraModeArchitecture';
 import { isScheduleRequest, buildScheduleContext } from './lib/contextBuilders/scheduleContext';
+import { looksLikeCorrection, captureCorrection } from './intelligence/correctionLoop';
 import { shouldUseMultiAgentWorkflow } from './lib/contextBuilders/multiAgentRouting';
 import { buildThinkingPartnerResponseContract } from './lib/contextBuilders/thinkingPartnerContract';
 import { buildDroppedHistorySummary } from './lib/contextBuilders/droppedHistorySummary';
@@ -536,6 +537,28 @@ export class AgentExecutor {
                }
            }

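+            // [Correction Loop ①] If this utterance *corrects* the previous answer, capture it
+            // fire-and-forget: classify the error → tagged lesson + regression case (.astra/eval/corrections.jsonl).
+            // The correction itself becomes the Ground Truth that feeds the weekly regression test and the weakness profile.
+            // Does not block the turn's response (no await).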
+            if (prompt && loopDepth === 0 && activeBrain?.localBrainPath && looksLikeCorrection(prompt)) {
+                const visible = this.chatHistory.filter(m => !m.internal);
+                const lastAssistant = [...visible].reverse().find(m => m.role === 'assistant');
+                const lastUserIdx = lastAssistant ? visible.lastIndexOf(lastAssistant) - 1 : -1;
+                const priorQuestion = lastUserIdx >= 0 && visible[lastUserIdx]?.role === 'user' ? visible[lastUserIdx].content : '';
+                if (lastAssistant && priorQuestion) {
+                    void captureCorrection({
+                        brainPath: activeBrain.localBrainPath,
+                        question: priorQuestion,
+                        wrongAnswer: lastAssistant.content,
+                        correction: prompt,
+                        llm: { baseUrl: config.ollamaUrl, model: configDefaultModel },
+                    }).then(file => {
+                        if (file) logInfo('Correction Loop: 정정 캡처 완료.', { lesson: file });
+                    }).catch((e: any) => logError('Correction Loop 캡처 실패 (무시).', { error: e?.message ?? String(e) }));
+                }
+            }
+
            // 2. Setup History
            if (prompt !== null) {
                if (loopDepth === 0) {
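The hook above pairs the correction with the most recent visible assistant answer and the user message directly before it. A minimal sketch of that lookup, assuming a simplified `ChatMessage` shape and a hypothetical helper name `findCorrectionTarget` (neither is defined in this diff):

```typescript
// Assumed message shape; the real chatHistory type is not shown in this diff.
interface ChatMessage {
    role: 'user' | 'assistant' | 'system';
    content: string;
    internal?: boolean;
}

// Hypothetical helper mirroring the inline logic of the hook.
function findCorrectionTarget(
    messages: ChatMessage[],
): { question: string; wrongAnswer: string } | null {
    const visible = messages.filter(m => !m.internal);
    // Walk backwards to the most recent assistant message.
    let idx = -1;
    for (let i = visible.length - 1; i >= 0; i--) {
        if (visible[i].role === 'assistant') { idx = i; break; }
    }
    if (idx < 1) return null; // no assistant answer, or nothing before it
    const prev = visible[idx - 1];
    // Only capture when the message right before the answer is a non-empty user question.
    if (prev.role !== 'user' || !prev.content) return null;
    return { question: prev.content, wrongAnswer: visible[idx].content };
}

const chatHistory: ChatMessage[] = [
    { role: 'user', content: 'When is the launch?' },
    { role: 'assistant', content: 'The launch is in May.' },
    { role: 'system', content: '(tool trace)', internal: true },
];
console.log(findCorrectionTarget(chatHistory));
```

Because the hook runs before the new prompt is pushed onto the history, the correction itself is not yet in the list; it is passed separately as `correction`.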
@@ -33,6 +33,10 @@ import { runResearch, formatProposalMarkdown } from '../../intelligence/research
 import type { ExistingKnowledgeRef } from '../../intelligence/knowledgeValidation';
 import { loadQueue, saveQueue, mergeNeedsIntoQueue, formatQueueMarkdown, LEARNING_QUEUE_REL_PATH } from '../../intelligence/learningQueue';
 import { simpleChatCompletion } from '../../intelligence/llmCall';
+import {
+    loadCorrectionCases, computeWeaknessProfile, saveWeaknessProfile,
+    runRegressionCase, formatRegressionReport,
+} from '../../intelligence/correctionLoop';
 import { TelegramHttpClient } from '../../integrations/telegram/telegramClient';
 import { TELEGRAM_TOKEN_SECRET_KEY } from '../../extension/telegramCommands';

@@ -111,6 +115,49 @@ export async function runGrowthCycleOnce(context: vscode.ExtensionContext): Prom
        }
    } catch (e: any) { logError('성장 사이클: 검색 평가 실패.', { error: e?.message ?? String(e) }); }

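+    // (1.5) Correction Loop: correction regression test + weakness profile refresh.
+    //  a. Weakness profile: correction tag statistics for the last 60 days → weakness-profile.json
+    //     (memoryContext injects it as a self-review block from the next turn on, so statistics change behavior)
+    //  b. Regression: re-answer recently corrected questions with brain retrieval context and
+    //     let an LLM-judge decide "same mistake repeated?". The correction is the Ground Truth.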
+    try {
+        const cases = loadCorrectionCases(brain.localBrainPath);
+        if (cases.length > 0) {
+            saveWeaknessProfile(brain.localBrainPath, computeWeaknessProfile(cases, now.getTime()));
+            if (config.defaultModel && config.ollamaUrl) {
+                const MAX_REGRESSION_PER_CYCLE = 8;
+                const recent = cases.slice(-MAX_REGRESSION_PER_CYCLE);
+                const orchestrator = new RetrievalOrchestrator();
+                const allFiles = findBrainFiles(brain.localBrainPath);
+                getBrainTokenIndex(brain.localBrainPath, allFiles);
+                const llm = { baseUrl: config.ollamaUrl, model: config.defaultModel };
+                const answerFn = async (question: string): Promise<string> => {
+                    // Answer with brain evidence, as in a real chat (a bare answer without retrieval
+                    // would test the model's memorization rather than judge a regression).
+                    const refs = orchestrator.rankBrainForEval(question, brain, {
+                        limit: 5, chunkLevelRetrieval: config.chunkLevelRetrieval === true, chunkTargetChars: config.chunkTargetChars,
+                    }).slice(0, 5).map(r => {
+                        try { return `[${r.relativePath}]\n${fs.readFileSync(path.join(brain.localBrainPath, r.relativePath), 'utf8').slice(0, 1500)}`; }
+                        catch { return ''; }
+                    }).filter(Boolean).join('\n\n');
+                    return simpleChatCompletion(
+                        '두뇌 발췌를 근거로 간결히 답하라. 근거 없는 내용은 추정임을 명시하라.',
+                        `[두뇌 발췌]\n${refs || '(없음)'}\n\n[질문] ${question}`,
+                        { ...llm, temperature: 0.2, maxTokens: 700, timeoutMs: 120000 },
+                    );
+                };
+                const results = [];
+                for (const c of recent) results.push(await runRegressionCase(c, answerFn, llm));
+                fs.writeFileSync(
+                    path.join(growthDir, 'regression-report.md'),
+                    formatRegressionReport(results, { dateStr: now.toLocaleString() }), 'utf8',
+                );
+                const repeated = results.filter(r => r.repeated === true).length;
+                summary.push(`정정 회귀 ${results.length}건 중 재발 ${repeated}건${repeated > 0 ? ' ⚠️' : ''}`);
+            }
+        }
+    } catch (e: any) { logError('성장 사이클: 정정 회귀/약점 프로필 실패.', { error: e?.message ?? String(e) }); }
+
    // (2) 학습 큐 갱신 (Need Engine)
    let proposedCount = 0;
    try {
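The regression step relies on the judge returning one JSON object, and it distinguishes three outcomes: repeated, not repeated, and judgment failed. A minimal sketch of that tri-state parsing, assuming hypothetical helper names `parseJudgeVerdict` and `summarize` (the commit does this inline in runRegressionCase and in the cycle summary):

```typescript
// Hypothetical helper; mirrors the parsing inside runRegressionCase.
// Returns true (repeated), false (not repeated), or null (judgment failed).
function parseJudgeVerdict(raw: string): boolean | null {
    const m = raw.match(/\{[\s\S]*?\}/); // first {...} span, non-greedy
    if (!m) return null;
    try {
        const parsed = JSON.parse(m[0]);
        // Anything other than the literal `true` counts as "not repeated".
        return parsed?.repeated === true;
    } catch {
        return null; // the span was not valid JSON
    }
}

// Hypothetical summary line, in English for illustration.
function summarize(verdicts: Array<boolean | null>): string {
    const repeated = verdicts.filter(v => v === true).length;
    return `${verdicts.length} regression cases, ${repeated} repeated`;
}

const verdicts = [
    parseJudgeVerdict('{"repeated":true,"note":"same date"}'),
    parseJudgeVerdict('verdict: {"repeated":false,"note":"fixed"}'),
    parseJudgeVerdict('no json here'),
];
console.log(summarize(verdicts));
```

One design consequence worth noting: a reply such as `{"repeated":"yes"}` is counted as a pass, so the report leans toward under-reporting repeats when the judge strays from the requested format.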
@@ -0,0 +1,379 @@
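+/**
+ * Correction Loop: a single pipeline in which one user correction grows the system in three places.
+ *
+ *   User correction ("아니야, 그거 6월이야" = "No, that's in June")
+ *     ① Detection (looksLikeCorrection) + LLM error classification (classifyCorrection)
+ *     ├→ Save a tagged lesson (lessons/, with error-tag frontmatter)
+ *     └→ Append a regression case (.astra/eval/corrections.jsonl, {question, wrong answer, correction})
+ *     ② The weekly growth cycle re-checks regressions ("same mistake repeated?") and turns tag statistics into a weakness profile
+ *     ③ The weakness profile (.astra/growth/weakness-profile.json) is injected into the system prompt automatically
+ *
+ * Design principles:
+ *  - Nobody authors an answer key by hand: the correction itself is the Ground Truth.
+ *  - The path from insight to action is mechanical: tag statistics do not stop at a report; they change the next turn's prompt.
+ *  - Transparency: the profile and the cases are plain files a person can open, edit, or delete (Permission Based Learning).
+ *  - Capture is fire-and-forget: it does not affect the response latency of the correction turn.
+ */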
+import * as fs from 'fs';
+import * as path from 'path';
+import { simpleChatCompletion } from './llmCall';
+import { loadQueue, saveQueue } from './learningQueue';
+
+// ── Types / constants ────────────────────────────────────────────────────────
+
+export const ERROR_TAGS = ['사실오류', '근거누락', '맥락누락', '추론오류', '지시불이행', '형식오류'] as const;
+export type ErrorTag = typeof ERROR_TAGS[number] | '기타';
+
+export interface CorrectionCase {
+    ts: string;             // ISO
+    errorTag: ErrorTag;
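+    /** The user question that produced the wrong answer. */
+    question: string;
+    /** Excerpt of the wrong answer (bounds storage cost). */
+    wrongAnswer: string;
+    /** The user's correction utterance: the Ground Truth. */
+    correction: string;
+    /** One-line gist extracted by the classifier LLM (doubles as the lesson title). */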
+    title: string;
+}
+
+export const CORRECTIONS_REL_PATH = path.join('.astra', 'eval', 'corrections.jsonl');
+export const WEAKNESS_PROFILE_REL_PATH = path.join('.astra', 'growth', 'weakness-profile.json');
+
+const MAX_FIELD_CHARS = 600;
+
+// ── ① Correction detection (conservative: false positives become lesson noise, so only clear signals) ──
+
+const CORRECTION_HEAD_RE = /^\s*(아니야|아니지|아닌데|틀렸|그게\s*아니|잘못\s*(알|됐|했)|땡|노노)/;
+const CORRECTION_ANY_RE = /(틀렸(어|네|잖|다)|사실(이|과)\s*(아니|다르|달라|다름)|정정(해|할게|하자)|잘못\s*된\s*정보|가\s*아니라\s*|이\s*아니라\s*|착각(했|하)|헛소리|지어내)/;
+
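+/**
+ * Whether the user utterance is a *correction* of the previous answer. To tell it apart from short
+ * commands ("아니 그거 말고 이거 해줘" = "no, not that, do this instead"), the caller guarantees that a previous assistant answer exists.
+ */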
+export function looksLikeCorrection(prompt: string): boolean {
+    const t = (prompt || '').trim();
+    if (t.length < 4 || t.length > 1500) return false;
+    return CORRECTION_HEAD_RE.test(t) || CORRECTION_ANY_RE.test(t);
+}
+
+// ── ① Error classification (LLM, heuristic fallback on failure) ─────────────
+
+const CLASSIFY_SYSTEM = [
+    '너는 AI 답변 오류 분류기다. 사용자가 AI 답변을 정정했다. 오류 유형을 하나만 고르고 한 줄 요지를 써라.',
+    '유형: 사실오류(틀린 사실/수치/날짜), 근거누락(출처 없이 단정), 맥락누락(대화/문서 맥락 놓침), 추론오류(논리 비약), 지시불이행(요구사항 무시), 형식오류(형식/언어 문제)',
+    '반드시 JSON 한 줄만 출력: {"tag":"<유형>","title":"<요지 40자 이내>"}',
+].join('\n');
+
+export async function classifyCorrection(
+    question: string,
+    wrongAnswer: string,
+    correction: string,
+    llm: { baseUrl: string; model: string },
+): Promise<{ tag: ErrorTag; title: string }> {
+    const fallback = (): { tag: ErrorTag; title: string } => ({
+        tag: /(출처|근거|소스)/.test(correction) ? '근거누락'
+            : /(아까|위에|전에|문서|말했)/.test(correction) ? '맥락누락'
+            : '사실오류',
+        title: correction.slice(0, 40).replace(/\n/g, ' '),
+    });
+    try {
+        const user = [
+            `[질문] ${question.slice(0, MAX_FIELD_CHARS)}`,
+            `[AI 답변 발췌] ${wrongAnswer.slice(0, MAX_FIELD_CHARS)}`,
+            `[사용자 정정] ${correction.slice(0, MAX_FIELD_CHARS)}`,
+        ].join('\n');
+        const raw = await simpleChatCompletion(CLASSIFY_SYSTEM, user, {
+            baseUrl: llm.baseUrl, model: llm.model, temperature: 0.1, maxTokens: 120, timeoutMs: 30000,
+        });
+        const m = raw.match(/\{[\s\S]*?\}/);
+        if (!m) return fallback();
+        const parsed = JSON.parse(m[0]);
+        const tag = (ERROR_TAGS as readonly string[]).includes(parsed?.tag) ? parsed.tag as ErrorTag : '기타';
+        const title = String(parsed?.title || '').trim().slice(0, 60) || fallback().title;
+        return { tag, title };
+    } catch {
+        return fallback();
+    }
+}
+
+// ── ① Persistence: regression cases + tagged lessons ────────────────────────
+
+export function appendCorrectionCase(brainPath: string, c: CorrectionCase): boolean {
+    try {
+        const file = path.join(brainPath, CORRECTIONS_REL_PATH);
+        fs.mkdirSync(path.dirname(file), { recursive: true });
+        const row = {
+            ...c,
+            question: c.question.slice(0, MAX_FIELD_CHARS),
+            wrongAnswer: c.wrongAnswer.slice(0, MAX_FIELD_CHARS),
+            correction: c.correction.slice(0, MAX_FIELD_CHARS),
+        };
+        fs.appendFileSync(file, JSON.stringify(row) + '\n', 'utf8');
+        return true;
+    } catch {
+        return false;
+    }
+}
+
+export function loadCorrectionCases(brainPath: string, limit = 200): CorrectionCase[] {
+    try {
+        const file = path.join(brainPath, CORRECTIONS_REL_PATH);
+        if (!fs.existsSync(file)) return [];
+        const out: CorrectionCase[] = [];
+        for (const line of fs.readFileSync(file, 'utf8').split('\n')) {
+            const t = line.trim();
+            if (!t || t.startsWith('//')) continue;
+            try {
+                const o = JSON.parse(t);
+                if (o && typeof o.question === 'string' && typeof o.correction === 'string') out.push(o);
+            } catch { /* skip bad line */ }
+        }
+        return out.slice(-limit);
+    } catch {
+        return [];
+    }
+}
+
+/** Correction lesson card: same structure as the existing lessons/ template, plus error-tag frontmatter. */
+export function correctionLessonMarkdown(c: CorrectionCase, today: string): string {
+    const safeTitle = c.title.replace(/\n/g, ' ').trim() || '사용자 정정';
+    return [
+        '---',
+        'type: lesson',
+        `title: ${safeTitle}`,
+        `error-tag: ${c.errorTag}`,
+        'applies-to: []',
+        'severity: medium',
+        'source: user-correction',
+        'occurrences: 1',
+        `last-seen: ${today}`,
+        '---',
+        '',
+        `# Lesson: ${safeTitle}`,
+        '',
+        '## Situation',
+        `사용자 질문: ${c.question}`,
+        '',
+        '## Mistake / Risk',
+        `[${c.errorTag}] AI 답변: ${c.wrongAnswer}`,
+        '',
+        '## Fix',
+        `사용자 정정 (Ground Truth): ${c.correction}`,
+        '',
+        '## Prevention Checklist',
+        `- 같은 질문 유형에서 [${c.errorTag}] 재발 여부 확인 — 주간 회귀 테스트 대상`,
+        '',
+    ].join('\n');
+}
+
+/**
+ * Captures one correction: classify → save lesson + append regression case. Meant for fire-and-forget use
+ * (failures are logged; they never block the user's turn). Returns the saved lesson path, or null.
+ */
+export async function captureCorrection(opts: {
+    brainPath: string;
+    question: string;
+    wrongAnswer: string;
+    correction: string;
+    llm: { baseUrl: string; model: string };
+}): Promise<string | null> {
+    const { tag, title } = await classifyCorrection(opts.question, opts.wrongAnswer, opts.correction, opts.llm);
+    const now = new Date();
+    const c: CorrectionCase = {
+        ts: now.toISOString(),
+        errorTag: tag,
+        question: opts.question,
+        wrongAnswer: opts.wrongAnswer,
+        correction: opts.correction,
+        title,
+    };
+    appendCorrectionCase(opts.brainPath, c);
+    try {
+        const dir = path.join(opts.brainPath, 'lessons');
+        fs.mkdirSync(dir, { recursive: true });
+        const ymd = now.toISOString().slice(0, 10);
+        const slug = title.toLowerCase().replace(/[^a-z0-9가-힣]+/g, '-').replace(/^-+|-+$/g, '').slice(0, 50) || 'correction';
+        const file = path.join(dir, `${ymd}-correction-${slug}.md`);
+        fs.writeFileSync(file, correctionLessonMarkdown(c, ymd), 'utf8');
+        return file;
+    } catch {
+        return null;
+    }
+}
+
+// ── ②③ Weakness profile: turns tag statistics into next-turn behavior ───────
+
+export interface WeaknessProfile {
+    updatedAt: string;
+    totalCases: number;
+    /** Per-tag counts within the recent window (descending). */
+    tagCounts: Array<{ tag: string; count: number; example: string }>;
+}
+
+/** Builds the weakness profile from correction cases in the last windowDays (called weekly by the growth cycle). */
+export function computeWeaknessProfile(cases: CorrectionCase[], nowMs: number, windowDays = 60): WeaknessProfile {
+    const cutoff = nowMs - windowDays * 86_400_000;
+    const recent = cases.filter(c => {
+        const t = Date.parse(c.ts);
+        return Number.isFinite(t) && t >= cutoff;
+    });
+    const byTag = new Map<string, { count: number; example: string }>();
+    for (const c of recent) {
+        const cur = byTag.get(c.errorTag) || { count: 0, example: '' };
+        cur.count++;
+        cur.example = c.title; // use the latest case title as the example
+        byTag.set(c.errorTag, cur);
+    }
+    return {
+        updatedAt: new Date(nowMs).toISOString(),
+        totalCases: recent.length,
+        tagCounts: Array.from(byTag.entries())
+            .map(([tag, v]) => ({ tag, count: v.count, example: v.example }))
+            .sort((a, b) => b.count - a.count),
+    };
+}
+
+export function saveWeaknessProfile(brainPath: string, profile: WeaknessProfile): boolean {
+    try {
+        const file = path.join(brainPath, WEAKNESS_PROFILE_REL_PATH);
+        fs.mkdirSync(path.dirname(file), { recursive: true });
+        fs.writeFileSync(file, JSON.stringify(profile, null, 2) + '\n', 'utf8');
+        return true;
+    } catch {
+        return false;
+    }
+}
+
+export function loadWeaknessProfile(brainPath: string): WeaknessProfile | null {
+    try {
+        const file = path.join(brainPath, WEAKNESS_PROFILE_REL_PATH);
+        if (!fs.existsSync(file)) return null;
+        const o = JSON.parse(fs.readFileSync(file, 'utf8'));
+        return o && Array.isArray(o.tagCounts) ? o as WeaknessProfile : null;
+    } catch {
+        return null;
+    }
+}
+
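+/**
+ * Weakness profile → self-review block for the system prompt. Injected only for tags seen at least twice
+ * (so one-off mistakes do not clutter the prompt), and for at most the top two tags. Returns '' when there is no profile.
+ */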
+export function buildSelfReviewBlock(profile: WeaknessProfile | null): string {
+    if (!profile) return '';
+    const significant = profile.tagCounts.filter(t => t.count >= 2).slice(0, 2);
+    if (significant.length === 0) return '';
+    const lines = ['[자기검토 — 최근 정정 통계 기반]'];
+    for (const t of significant) {
+        lines.push(`- 너는 최근 "${t.tag}" 정정을 ${t.count}회 받았다 (예: ${t.example}). ${SELF_CHECK_BY_TAG[t.tag] || '같은 유형의 실수가 없는지 답하기 전 재확인하라.'}`);
+    }
+    return lines.join('\n');
+}
+
+const SELF_CHECK_BY_TAG: Record<string, string> = {
+    '사실오류': '수치·날짜·고유명사는 두뇌 근거가 없으면 단정하지 말고 "확인 필요"로 표시하라.',
+    '근거누락': '주장마다 근거 문서를 인용하고, 인용할 수 없으면 추정임을 명시하라.',
+    '맥락누락': '답하기 전 직전 대화와 제공된 문서에서 관련 맥락을 다시 확인하라.',
+    '추론오류': '결론 전에 추론 단계를 명시적으로 나열하고 비약이 없는지 점검하라.',
+    '지시불이행': '답하기 전 사용자 요구사항 목록을 만들고 각각 충족했는지 확인하라.',
+    '형식오류': '요구된 출력 형식(언어·구조·길이)을 답변 전에 재확인하라.',
+};
+
+// ── ③ Knowledge gap → auto-registered in the learning queue as proposed (Need Engine link) ──
+
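+/**
+ * Registers a question judged GROUNDING-weak in the learning queue as proposed. The same question
+ * is registered only once (duplicates blocked by hash id; never re-registered whatever its status, done/rejected included).
+ * Approval stays with a human (Permission Based Learning). Returns true if newly registered.
+ */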
+export function registerKnowledgeGap(brainPath: string, question: string, topScore: number): boolean {
+    const q = (question || '').trim();
+    if (q.length < 10) return false;
+    try {
+        const norm = q.toLowerCase().replace(/\s+/g, ' ').slice(0, 200);
+        let h = 5381;
+        for (let i = 0; i < norm.length; i++) h = ((h << 5) + h + norm.charCodeAt(i)) | 0;
+        const id = `gap-${(h >>> 0).toString(36)}`;
+        const queue = loadQueue(brainPath);
+        if (queue.some(item => item.id === id)) return false;
+        // Flood guard: once 20 gap proposals have piled up, stop registering until a person clears them.
+        if (queue.filter(item => item.id.startsWith('gap-') && item.status === 'proposed').length >= 20) return false;
+        const nowIso = new Date().toISOString();
+        queue.push({
+            id,
+            topic: `지식 공백: ${q.slice(0, 80)}`,
+            priority: 40,
+            reason: `대화 중 GROUNDING 약함 자동 감지 (두뇌 최고 점수 ${topScore.toFixed(2)})`,
+            status: 'proposed',
+            createdAt: nowIso,
+            updatedAt: nowIso,
+        });
+        saveQueue(brainPath, queue);
+        return true;
+    } catch {
+        return false;
+    }
+}
+
+// ── ② Weekly regression test: "is the same mistake repeated on a corrected question?" ──
+
+export interface RegressionResult {
+    question: string;
+    errorTag: ErrorTag;
+    repeated: boolean | null; // null = judgment failed
+    note: string;
+}
+
+const REGRESSION_JUDGE_SYSTEM = [
+    '너는 회귀 판정기다. 과거에 사용자가 AI 답변의 오류를 정정했다.',
+    '같은 질문에 대한 AI 의 *새 답변*이 같은 오류를 반복하는지 판정하라.',
+    '반드시 JSON 한 줄만 출력: {"repeated":true|false,"note":"<근거 30자>"}',
+].join('\n');
+
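+/**
+ * Re-checks one regression case. answerFn is injected by the caller (growth cycle: an LLM call that
+ * includes brain retrieval context). The LLM-judge decides whether the error recurs, relative to the correction.
+ */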
+export async function runRegressionCase(
+    c: CorrectionCase,
+    answerFn: (question: string) => Promise<string>,
+    llm: { baseUrl: string; model: string },
+): Promise<RegressionResult> {
+    try {
+        const newAnswer = (await answerFn(c.question)).slice(0, 1500);
+        const user = [
+            `[질문] ${c.question}`,
+            `[과거 오류 (${c.errorTag})] ${c.wrongAnswer}`,
+            `[사용자 정정 (Ground Truth)] ${c.correction}`,
+            `[새 답변] ${newAnswer}`,
+        ].join('\n');
+        const raw = await simpleChatCompletion(REGRESSION_JUDGE_SYSTEM, user, {
+            baseUrl: llm.baseUrl, model: llm.model, temperature: 0.1, maxTokens: 100, timeoutMs: 60000,
+        });
+        const m = raw.match(/\{[\s\S]*?\}/);
+        if (!m) return { question: c.question, errorTag: c.errorTag, repeated: null, note: '판정 파싱 실패' };
+        const parsed = JSON.parse(m[0]);
+        return {
+            question: c.question,
+            errorTag: c.errorTag,
+            repeated: parsed?.repeated === true,
+            note: String(parsed?.note || '').slice(0, 60),
+        };
+    } catch (e: any) {
+        return { question: c.question, errorTag: c.errorTag, repeated: null, note: `실패: ${(e?.message ?? e)}`.slice(0, 60) };
+    }
+}
+
+export function formatRegressionReport(results: RegressionResult[], meta: { dateStr: string }): string {
+    const lines = [`# 정정 회귀 리포트 — ${meta.dateStr}`, ''];
+    lines.push('과거 사용자 정정(Ground Truth)을 같은 질문으로 재검사한 결과.');
+    lines.push('');
+    lines.push('| 결과 | 유형 | 질문 | 비고 |');
+    lines.push('|---|---|---|---|');
+    for (const r of results) {
+        const mark = r.repeated === true ? '❌ 재발' : r.repeated === false ? '✅ 통과' : '⚠️ 판정불가';
+        lines.push(`| ${mark} | ${r.errorTag} | ${r.question.slice(0, 60).replace(/\|/g, '/')} | ${r.note.replace(/\|/g, '/')} |`);
+    }
+    return lines.join('\n');
+}
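To make the path from tag statistics to prompt text concrete, the sketch below condenses computeWeaknessProfile and buildSelfReviewBlock into two small functions and runs them on made-up cases. The names `countTags` and `selfReviewLines`, the English line wording, and the sample data are illustrative assumptions; only the 60-day window, the "at least twice" threshold, and the top-two cap come from the code above.

```typescript
interface Case { ts: string; errorTag: string; title: string }
interface TagCount { tag: string; count: number; example: string }

const DAY_MS = 24 * 60 * 60 * 1000;

// Count error tags among cases inside the window, most frequent first.
function countTags(cases: Case[], nowMs: number, windowDays = 60): TagCount[] {
    const cutoff = nowMs - windowDays * DAY_MS;
    const byTag = new Map<string, TagCount>();
    for (const c of cases) {
        const t = Date.parse(c.ts);
        if (!Number.isFinite(t) || t < cutoff) continue; // unparseable or outside the window
        const cur = byTag.get(c.errorTag) ?? { tag: c.errorTag, count: 0, example: '' };
        cur.count++;
        cur.example = c.title; // the latest case title becomes the example
        byTag.set(c.errorTag, cur);
    }
    return Array.from(byTag.values()).sort((a, b) => b.count - a.count);
}

// Keep only tags seen at least twice, and at most the top two.
function selfReviewLines(counts: TagCount[]): string[] {
    return counts
        .filter(t => t.count >= 2)
        .slice(0, 2)
        .map(t => `- "${t.tag}" corrected ${t.count} times recently (e.g. ${t.example})`);
}

const nowMs = Date.parse('2026-06-11T00:00:00Z');
const cases: Case[] = [
    { ts: '2026-03-01T00:00:00Z', errorTag: '사실오류', title: 'old case' },      // outside 60 days
    { ts: '2026-05-20T00:00:00Z', errorTag: '사실오류', title: 'launch month' },
    { ts: '2026-06-02T00:00:00Z', errorTag: '사실오류', title: 'budget figure' },
    { ts: '2026-06-05T00:00:00Z', errorTag: '근거누락', title: 'no source cited' },
];
const counts = countTags(cases, nowMs);
const lines = selfReviewLines(counts);
console.log(lines.join('\n'));
```

Here only 사실오류 (factual error) reaches the threshold, so the single 근거누락 (missing evidence) case stays out of the prompt until it recurs.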
@@ -8,6 +8,7 @@ import type { RetrievalOrchestrator } from '../../retrieval';
 import { buildLessonChecklistBlock } from '../../retrieval/lessonHelpers';
 import { embedQuery, embedTexts } from '../../retrieval/embeddings';
 import { backfillBrainEmbeddings, backfillBrainChunkEmbeddings } from '../../retrieval/brainIndex';
+import { loadWeaknessProfile, buildSelfReviewBlock, registerKnowledgeGap } from '../../intelligence/correctionLoop';
 import { resolveScopeForAgent } from '../../skills/agentKnowledgeMap';
 import {
    resolveKnowledgeMix,
@@ -196,7 +197,7 @@ export async function buildMemoryContext(deps: MemoryContextDeps): Promise<strin
        : undefined;

    // Call the Unified RAG Pipeline.
-    const result = deps.retrievalOrchestrator.retrieve(deps.currentPrompt, {
+    const retrieveOpts = {
        brain: deps.activeBrain,
        memoryManager: deps.memoryManager,
        workspacePath,
@@ -216,7 +217,20 @@ export async function buildMemoryContext(deps: MemoryContextDeps): Promise<strin
        hierarchicalReweightEnabled: config.hierarchicalReweightEnabled !== false,
        chunkLevelRetrieval: config.chunkLevelRetrieval === true,
        chunkTargetChars: config.chunkTargetChars,
-    });
+    };
+    let result = deps.retrievalOrchestrator.retrieve(deps.currentPrompt, retrieveOpts);
+
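+    // [Correction Loop ③-a] If grounding is weak while an agent scope is applied, re-search the whole brain
+    // once to rescue the case where the scope hides the right document. Adopted only when the retry is no longer weak.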
+    if (scope.folders.length > 0 && assessGrounding(result).level === 'weak') {
+        try {
+            const retry = deps.retrievalOrchestrator.retrieve(deps.currentPrompt, { ...retrieveOpts, scopeFolders: [] });
+            if (assessGrounding(retry).level !== 'weak') {
+                retry.fusionLog.push('Grounding rescue: scope 해제 재검색 채택 (약함 → ' + assessGrounding(retry).level + ')');
+                result = retry;
+            }
+        } catch { /* if the re-search fails, proceed with the original result */ }
+    }

    // Semantic Re-rank (LLM, async): changes only the *order* of selectedChunks. Within the chunks that
    // passed the token budget, re-sorts by intent fit to exploit LLM attention bias.
@@ -380,8 +394,22 @@ export async function buildMemoryContext(deps: MemoryContextDeps): Promise<strin
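    // [Global confidence] Assess retrieval grounding strength and inject an answer policy with it. Extends
    // /meet's "when unsure, flag instead of asserting" principle to every conversation, structurally reducing
    // 'plausible wrong answers' that assert despite weak evidence.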
-    const groundingBlock = buildGroundingBlock(result);
-    return [groundingBlock, lessonBlock, memoryBlock].filter(Boolean).join('\n\n');
+    const grounding = assessGrounding(result);
+    let groundingBlock = buildGroundingBlock(grounding);
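+    // [Correction Loop ③-b] Still weak after relaxing the scope → register as a knowledge gap in the learning
+    // queue as proposed (once; after human approval the weekly cycle's Research Agent handles it). Turns a gap
+    // from a label into an action: a mechanical implementation of "knowing what you don't know".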
+    if (grounding.level === 'weak') {
+        try {
+            if (registerKnowledgeGap(deps.activeBrain.localBrainPath, deps.currentPrompt, grounding.top)) {
+                groundingBlock += '\n→ 이 질문은 지식 공백으로 학습 큐(proposed)에 등록되었다. 사용자에게 "이 주제는 두뇌에 근거가 부족해 학습 후보로 등록해 두었다"고 한 줄로 알려라.';
+            }
+        } catch { /* a failed queue registration must not block the turn */ }
+    }
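+    // [Correction Loop ③-c] Weakness profile → self-review block. Recent correction statistics directly change
+    // the next turn's behavior (only tags seen at least twice, so one-off mistakes do not clutter the prompt).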
+    const selfReviewBlock = buildSelfReviewBlock(loadWeaknessProfile(deps.activeBrain.localBrainPath));
+    return [selfReviewBlock, groundingBlock, lessonBlock, memoryBlock].filter(Boolean).join('\n\n');
 }

 /**
@@ -391,15 +419,20 @@ export async function buildMemoryContext(deps: MemoryContextDeps): Promise<strin
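 *  - weak (top < 0.25 or zero brain chunks): put "⚠️ 두뇌 근거 약함" (weak brain grounding) on the first answer line + no assertions.
 * Scores are normalized to 0~1. The thresholds are initial values and can be tuned later with a golden set.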
 */
-function buildGroundingBlock(result: { selectedChunks: Array<{ source: string; score: number }> }): string {
+interface GroundingAssessment { level: 'strong' | 'moderate' | 'weak'; top: number; count: number }
+
+function assessGrounding(result: { selectedChunks: Array<{ source: string; score: number }> }): GroundingAssessment {
    const brainChunks = result.selectedChunks.filter((c) => c.source === 'brain-trace' || c.source === 'brain-memory');
    const top = brainChunks.length ? Math.max(...brainChunks.map((c) => c.score || 0)) : 0;
-    let level: 'strong' | 'moderate' | 'weak';
+    let level: GroundingAssessment['level'];
    if (brainChunks.length === 0 || top < 0.25) level = 'weak';
    else if (top >= 0.5 && brainChunks.length >= 2) level = 'strong';
    else level = 'moderate';
+    return { level, top, count: brainChunks.length };
+}

-    const lines = [`[GROUNDING] 이번 질의의 두뇌 근거 강도: ${level === 'strong' ? '강함' : level === 'moderate' ? '보통' : '약함'} (두뇌 청크 ${brainChunks.length}개, 최고 점수 ${top.toFixed(2)})`];
+function buildGroundingBlock({ level, top, count }: GroundingAssessment): string {
+    const lines = [`[GROUNDING] 이번 질의의 두뇌 근거 강도: ${level === 'strong' ? '강함' : level === 'moderate' ? '보통' : '약함'} (두뇌 청크 ${count}개, 최고 점수 ${top.toFixed(2)})`];
    if (level === 'weak') {
        lines.push('→ 답변 첫 줄에 "⚠️ 두뇌 근거 약함 — 일반 지식 기반 추정입니다." 를 표기하고, 단정 대신 "가능성/추정" 표현을 사용하라. 확실하지 않은 세부 수치·고유명사는 만들지 말 것.');
    } else if (level === 'strong') {