feat: Self-Evolving Digital Employee OS P0~P6 + calendar conflict gate

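Reliability core (P1~P2):
- Requirement Graph: injects required elements per task type (meeting minutes / market research / work research / scheduling) + coverage hook
- Confidence Engine (0~100, deterministic) / Escalation Engine (review requests) / Epistemic Guard (three-way classification: unknown / estimated / certain)
- Provenance: citationTrace now carries source modification dates and staleness warnings
- Critic Loop: one LLM review pass, only on turns that raise a problem signal, plus a remediation card

Growth loop (P3):
- Gap Detector (Requirement - Knowledge) / Need Engine (30/25/20/15/10 formula) / Knowledge Inventory
- Learning Queue (merges as proposed only; approval is human-only) / Decision Journal / Reflection records
- Elements missed repeatedly (3+ times) are automatically emphasized in the next turn's checklist (T5 loop)

Knowledge operations (P4) + memory (P5) + learning execution (P6):
- Knowledge Validation + Belief Revision (rejects duplicates; recommends update/add on conflict)
- Knowledge Decay (per-domain half-life audit) / Knowledge Debt (blocked x impact)
- Organizational Memory (.astra/organization.md, always injected)
- Research Agent (approved queue -> research brief + draft with estimate labels + Validation gate -> proposals/)
- Skill Score (first-half vs second-half trend) + Success Pattern DB (auto-stores turns with all elements met and confidence 90+)

Parallel track:
- Calendar conflict gate: conflictCheck + structured event cache + blocking of create_calendar_event (force only after user approval)
- Task Eval Harness: command that auto-scores the meeting-minutes golden set + commands for growth report / learning queue / decay audit

17 new modules (src/intelligence/), 5 VS Code commands, 11 settings, +89 tests (508 passing in total).
Design doc: docs/SELF_EVOLVING_OS_MASTER_PLAN.md

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>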
2026-06-11 13:42:09 +09:00
parent cbc2558550
commit 2afd1ac589
41 changed files with 4364 additions and 2 deletions
@@ -0,0 +1,38 @@
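+# Per-Task Golden Set Templates (Self-Evolving OS Phase 0 / Track 0-3)
+
+Golden set templates for measuring the quality of ASTRA's work outputs.
+They are separate from the existing retrieval golden set (`<brain>/.astra/eval/golden.jsonl`, used for retrieval recall evaluation)
+and evaluate the **work output itself** (meeting minutes / market research / work research).
+
+## How to Use
+
+1. Copy each `.golden.jsonl` template into `.astra/eval/tasks/` in the active brain
+2. Fill each one with 5~10 records from real past work (one line = one JSON record; lines starting with `//` are comments)
+3. The Phase 3 Self Evaluation module reads these golden sets and scores them automatically (extending the evalHarness pattern)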
+
+## Record Format
+
+```jsonc
+{
+  "id": "mm-001",                      // task-type abbreviation + serial number
+  "query": "the request exactly as the user entered it",
+  "input": "raw material (original meeting notes, background on the research topic, etc.)",
+  "expectedElements": ["참석자", "결정사항"], // elements that must be present (must match the labels in requirementGraph)
+  "reference": "full text of a model output, or a list of its key points",
+  "notes": "scoring caveats (optional)"
+}
+```
+
+`expectedElements` must match the element labels of `DEFAULT_TASK_REQUIREMENTS` in
+`src/intelligence/requirementGraph.ts`, so that the coverage check and
+Self Evaluation share the same vocabulary.
+
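+## Scoring Criteria (used by Self Evaluation)
+
+| Item | Scale |
+|------|------|
+| Required-element fulfillment rate | fraction of expectedElements covered (deterministic) |
+| Accuracy | 1~10 (against reference) |
+| Logic | 1~10 |
+| Readability | 1~10 |
+| Factual errors | count (target: 0) |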
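The record format and the deterministic fulfillment rate described in this template README can be sketched as follows. This is a minimal sketch, not the repository's scorer: `parseGoldenJsonl` and `elementCoverage` are hypothetical names, and the verbatim substring match is an assumption (the commit describes the real coverage check as regex-based).

```typescript
// Hypothetical sketch: field names follow the record format in the README;
// the substring matcher is an assumption, not ASTRA's actual checker.
interface GoldenRecord {
  id: string;
  query: string;
  input: string;
  expectedElements: string[];
  reference: string;
  notes?: string;
}

/** Parses golden-set text: one JSON record per line, `//` lines are comments. */
function parseGoldenJsonl(text: string): GoldenRecord[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && line.indexOf("//") !== 0)
    .map((line) => JSON.parse(line) as GoldenRecord);
}

/** Fraction of expectedElements whose label appears verbatim in the output. */
function elementCoverage(record: GoldenRecord, output: string): number {
  if (record.expectedElements.length === 0) return 1;
  const hits = record.expectedElements.filter((label) => output.indexOf(label) !== -1);
  return hits.length / record.expectedElements.length;
}

const sample = [
  "// comment line, skipped by the parser",
  '{"id":"mm-001","query":"q","input":"i","expectedElements":["참석자","결정사항","기한"],"reference":"r"}',
].join("\n");

const records = parseGoldenJsonl(sample);
// The sample output mentions two of the three expected labels.
const coverage = elementCoverage(records[0], "참석자: 김OO / 결정사항: 출시일 확정");
console.log(records.length, coverage.toFixed(2));
```

Keeping the labels identical to the requirementGraph labels is what makes a check like this meaningful; a translated or paraphrased label would never match.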
@@ -0,0 +1,2 @@
+// Market research golden set: replace with real past research work (5~10 records recommended). Below is one record showing the format.
+{"id":"mr-001","query":"국내 로봇청소기 시장조사 해줘","input":"신제품 기획 검토용. 프리미엄 라인 진입 여부 판단 목적.","expectedElements":["시장 규모","성장률","경쟁사","가격","고객 니즈","트렌드","출처"],"reference":"시장 규모(금액·수치+출처), 연 성장률, 주요 경쟁사와 포지션, 가격대 분포, 고객 페인 포인트, 최근 트렌드, 모든 핵심 수치에 출처 명시","notes":"수치에 출처가 없으면 '(확인 필요)' 표기했는지 확인 — 환각 수치는 실격"}
@@ -0,0 +1,2 @@
+// Meeting minutes golden set: replace with real past meeting data (5~10 records recommended). Below is one record showing the format.
+{"id":"mm-001","query":"오늘 주간회의 내용 회의록으로 정리해줘","input":"6/9 주간회의 메모: 김OO 이OO 박OO 참석. 신제품 출시일 7월 15일로 확정. 김OO이 6/20까지 견적서 발송하기로. 마케팅 예산은 다음 회의에서 재논의.","expectedElements":["참석자","결정사항","액션 아이템","담당자","기한"],"reference":"참석자: 김OO, 이OO, 박OO / 결정사항: 신제품 출시일 7/15 확정, 마케팅 예산은 차기 회의 재논의 / 액션 아이템: 견적서 발송 (담당: 김OO, 기한: 6/20)","notes":"미결 항목(마케팅 예산)을 결정사항과 구분해 표기했는지 확인"}
@@ -0,0 +1,2 @@
+// Work research golden set: replace with real past research requests (5~10 records recommended). Below is one record showing the format.
+{"id":"wr-001","query":"MCP 프로토콜에 대해 조사해줘","input":"ASTRA에 외부 도구를 연결할 때 표준으로 쓸지 판단하기 위한 조사.","expectedElements":["조사 목적","핵심 요약","세부 내용","출처","시사점·다음 단계"],"reference":"목적 한 줄 → 3줄 요약 → 상세(아키텍처/생태계/한계) → 출처 → ASTRA 적용 시사점과 권장 다음 단계","notes":"모델 일반 지식과 검색 근거를 구분해 표기했는지 확인"}
@@ -0,0 +1,275 @@
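+# ASTRA Self-Evolving Digital Employee OS: Master Development Plan v1.1
+
+> Date: 2026-06-11
+> Base document: "Self-Evolving Digital Employee OS v1.0" design spec (co-designed by the user and an LLM)
+> Restructuring principle: **Trust-First**. Every module is developed with none left out, but the order is rearranged to trust → quality → growth loop → operations → advanced learning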
+
+---
+
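+## 1. Vision and Goals
+
+Make ASTRA an **AI digital employee that the user can rely on and trust**.
+
+- **Primary tasks**: meeting minutes, schedule management, market research, work research (to be expanded over time)
+- **Secondary tasks (minor)**: blog posts, Shorts/YouTube scripts, image prompts
+- **Base model**: Gemma 4 (local, LM Studio). The model itself is not changed; performance comes from the surrounding system
+  - Perceived-quality breakdown: model 20% + prompt 30% + RAG 30% + evaluation 20%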
+
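+### Five Conditions of Trust (the reason every module exists)
+
+| # | Condition | Responsible modules |
+|---|------|----------|
+| T1 | Says it does not know when it does not know | Anti-Hallucination Layer |
+| T2 | Can present its evidence and trace it back | Knowledge Provenance, Decision Journal |
+| T3 | Quality is consistent (no missing required elements) | Requirement Graph, Critic Agent |
+| T4 | Asks a human when it is not confident | Confidence Engine, Escalation Engine |
+| T5 | Does not repeat the same mistake | Failure Pattern DB, the four growth-loop modules |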
+
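+### End State
+
+```
+Perform task → self-evaluate → find shortfall → define learning need → learn → validate → capability gain → next task
+```
+
+The state in which this loop runs without human intervention (approval gates remain in place).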
+
+---
+
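+## 2. Non-Goals (Out of Scope; enforced as code guardrails)
+
+1. No generating its own goals
+2. No changing the user's objectives
+3. No unlimited autonomous learning
+4. No long-term memory writes without approval
+5. No external actions without approval
+6. No modifying its own code
+7. No self-replication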
+
+---
+
+## 3. Current Asset Map (design modules ↔ existing code)
+
+| Design module | Existing assets | Type of work |
+|---|---|---|
+| Anti-Hallucination | `src/retrieval/coveBlock.ts`, `src/agent/postHocSelfCheck.ts`, `src/agent/termValidator.ts` | Extend |
+| Knowledge Provenance | `src/retrieval/citationTrace.ts` | Extend |
+| Belief Revision / conflicts | `src/retrieval/conflictBlock.ts`, `src/core/conflict.ts` | Extend |
+| RAG + evaluation | `src/retrieval/chunker.ts`, `evalHarness.ts` + golden set | In progress (chunking improvements) |
+| Memory Layer | `src/memory/` (Episodic/LongTerm/ShortTerm/Procedural/Project + Extractor + distillation) | Mostly in place |
+| Worker Agent / orchestration | `src/agents/AgentWorkflowManager.ts`, `factory.ts` | In place |
+| Automatic guideline injection | `src/skills/skillInjectionService.ts`, `scopedBrainRetriever.ts` | In place |
+| Learning Queue infrastructure | `src/core/queue.ts`, `events.ts` | Reuse |
+| External tool integration | Datacollect MCP Bridge (:3002) | Reuse the pattern |
+| Intent clarification | `src/retrieval/intentClarification.ts` | Extend (basis for Task Analyzer) |
+
+**Must be built new**: Requirement Graph, Confidence Engine, Escalation Engine, Gap Detector, Need Engine, Self Evaluation, Learning Queue (logic), Knowledge Inventory, Failure/Success Pattern DB, Decision Journal, Skill Tree/Score, Knowledge Decay/Debt, Curiosity/Predictive/Experiment Engine, Research Agent, Goal Success Metrics, Growth Analytics, Organizational/User Memory (extension), Constitution Layer, calendar integration
+
+---
+
+## 4. Overall Architecture (Layers)
+
+```
+Constitution Layer      ── immutable rules (Goal Lock, Permission Learning, Human Override, Sandbox)
+  ↓
+Human Control Layer     ── 3 permission levels (simplified: execute / propose learning / store or modify knowledge = approval)
+  ↓
+Intelligence Layer      ── Task Analyzer, Requirement Graph, Knowledge Inventory,
+                           Gap Detector, Confidence Engine, Need Engine, Self-Awareness
+  ↓
+Execution Layer         ── Worker Agent, Critic Agent, Debate Loop, Reflection Engine
+  ↓
+Learning Layer          ── Learning Queue, Research Agent, Curiosity, Predictive, Experiment
+  ↓
+Knowledge Layer         ── KB, RAG, (Knowledge Graph: deferred), Provenance, Validation,
+                           Belief Revision, Decay, Debt
+  ↓
+Memory Layer            ── Episodic, Semantic, Long-Term, Organizational, User
+  ↓
+Growth Layer            ── Skill Tree, Skill Score, Failure/Success Pattern DB,
+                           Decision Journal, Growth Analytics
+Cross-cutting           ── Anti-Hallucination, Escalation Engine, Goal Success Metrics, KPI
+```
+
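+> Rationale for simplifying the permission scheme: this is a single-user environment. Levels 0~5 of the original design target multi-person organizations, so they are collapsed to three levels, while the internal enum keeps 0~5 to allow future expansion.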
+
+---
+
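+## 5. Module Specifications (complete; nothing omitted)
+
+Legend: **[New]** built from scratch / **[Extend]** extends existing code / **[Reuse]** uses existing code as-is / **[Deferred]** re-evaluated after passing the gate
+
+### Track 0: Preparation (Phase 0)
+
+| ID | Module/Task | Description | Done when |
+|----|----------|------|----------|
+| 0-1 | Finalize serving environment | Confirm the LM Studio + Gemma 4 baseline (`src/lmstudio/`) | Model and endpoint documented |
+| 0-2 | Finalize vector store | Keep the existing `embeddings.ts`/`brainIndex.ts` (no replacement) | Decision recorded |
+| 0-3 | **Per-task golden sets** [New] | 5~10 records each of input + expected output for meeting minutes, market research, and work research. Reuses the evalHarness pattern | Golden set files + scoring criteria exist |
+| 0-4 | Data inventory | Locate past meeting minutes, research outputs, and feedback; list the guideline documents (blog v4.1, E-E-A-T, banned expressions, etc.) | Inventory document |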
+
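+### Track 1: Reliability Core (Phase 2)
+
+| ID | Module | Responsibility | Input → Output | Done when |
+|----|------|------|------------|----------|
+| 1-1 | **Confidence Engine** [New] | Compute a confidence score per output | answer + evidence → 0~100 score, 4 bands (90+/70~89/50~69/<50) | A score <50 automatically triggers additional research |
+| 1-2 | **Escalation Engine** [New] | Decide whether human intervention is needed | confidence, impact, missing information, rule conflicts → request review / proceed on its own | Always asks when confidence is low and impact is high |
+| 1-3 | **Anti-Hallucination hardening** [Extend: coveBlock, postHocSelfCheck] | Enforce the three-way classification unknown / estimated / needs verification | evidence-grade label on every output | Zero unsupported assertions (on the golden set) |
+| 1-4 | **Provenance extension** [Extend: citationTrace] | Metadata: source, collection date, validation date, confidence | knowledge item → `{source, collected_at, validated_at, confidence}` | Any given conclusion can be traced back to its source |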
+
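+### Track 2: Task Quality (Phase 1) ★First to start
+
+| ID | Module | Responsibility | Input → Output | Done when |
+|----|------|------|------------|----------|
+| 2-1 | **Requirement Graph** [New] | Define required elements per task type | task type → required-element checklist | Three types registered: meeting minutes (attendees / decisions / action items / owners / deadlines), market research (market size / growth rate / competitors / pricing / needs / trends), work research (defined with the user) |
+| 2-2 | **Task Analyzer** [Extend: intentClarification] | Analyze the request | user request → task type, success criteria, deliverable, constraints | ≥90% task-type classification accuracy on golden set requests |
+| 2-3 | **Critic Agent + Debate Loop** [New + Reuse: AgentWorkflowManager] | Automatic review before submission | draft → critique → rewrite → re-review | Missing requirements are filled in by the system itself before submission |
+| 2-4 | **Reflection Engine** [New] | Post-task retrospective | completed task → record of shortfalls, causes, needed information | Reflection records are loaded into the Failure Pattern DB |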
+
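+### Track 3: Growth Loop (Phase 3) ★Validation gate
+
+| ID | Module | Responsibility | Input → Output | Done when |
+|----|------|------|------------|----------|
+| 3-1 | **Knowledge Inventory** [New] | Assess what knowledge is held | domain → held / insufficient / none | Main task domains covered |
+| 3-2 | **Gap Detector** [New] | Gap = Requirement − Knowledge | task + inventory → missing knowledge, impact, urgency | Real shortfalls are detected as Gaps |
+| 3-3 | **Need Engine** [New] | Compute learning priority | Need = information gap×30% + failure rate×25% + frequency×20% + confidence gap×15% + feedback×10% | Priority list generated automatically |
+| 3-4 | **Self Evaluation** [New] | Automatic scoring of outputs | output + golden set criteria → scores (accuracy / logic / readability / satisfaction 1~10 + factual error count) | Correlation between golden set scores and human evaluation confirmed |
+| 3-5 | **Learning Queue** [Extend: core/queue] | Learning backlog | Need list → priority queue (e.g. GA4 / High / conversion-rate analysis failed) | Enqueue, consume, and approve flow works |
+| 3-6 | **Failure Pattern DB** [New] | Track repeated mistakes | reflections and evaluations → pattern + count (e.g. action items missed N times) | A recurrence of the same mistake increments the count and is reflected in the prompt |
+| 3-7 | **Decision Journal** [New] | Record the basis for decisions | conclusions, information selection, searches → reason log | "Why was this decided?" can be answered 3 months later |
+| 3-8 | **Self-Awareness question set** [New] | Five built-in questions | what is unknown and why, impact, whether and when learning is needed | Reflected in Gap/Need computation |
+
+> **Gate G1**: build minimal versions of the four modules 3-2, 3-3, 3-4, 3-5 (Gap→Need→SelfEval→Queue), then run a **2-week real-use validation**. If the loop actually runs (shortfall found → learning item created → approved → applied), proceed to Phase 4 and beyond. If it is shaky, reinforce and re-validate. Track 7 (advanced learning) does not start before this gate.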
+
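+### Track 4: Knowledge Operations (Phase 4)
+
+| ID | Module | Responsibility | Done when |
+|----|------|------|----------|
+| 4-1 | **Knowledge Validation** [New] | Validate source reliability, freshness, relevance, duplication, and conflicts | Passing validation is mandatory before storage |
+| 4-2 | **Belief Revision** [Extend: conflictBlock] | Decide Add/Update/Retire on conflict | Conflicting knowledge auto-classified + approval flow |
+| 4-3 | **Knowledge Decay** [New] | Per-domain decay (defaults: AI 30 days / SEO 90 days / trends 180 days; cycles to be redefined for the task domains) | Unused, stale, or low-confidence knowledge is automatically deprioritized or archived |
+| 4-4 | **Knowledge Debt** [New] | Manage how many tasks a knowledge gap blocks, and its impact | Debt dashboard (e.g. GA4: Blocked 17, Impact 9) |
+| 4-5 | Knowledge Graph **[Deferred]** | Manage relationships between knowledge items | Re-evaluate the need after G1 passes and RAG is stable, then decide whether to start |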
+
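+### Track 5: Memory and Context (Phase 5)
+
+| ID | Module | Responsibility | Done when |
+|----|------|------|----------|
+| 5-1 | **User Memory** [Extend: src/memory] | Preferences, feedback, work patterns (absorbs "Digital DNA" from the original design) | Patterns such as "values evidence, prefers tables" show up in outputs |
+| 5-2 | **Organizational Memory** [New] | Work processes, rules, culture, preferred ways of working | Organization rules are automatically injected into the system prompt |
+| 5-3 | **Stronger Episodic use** [Extend: EpisodicMemory] | Automatically reference past meeting minutes and research history | Related past work is cited automatically on new tasks |
+| 5-4 | Semantic / Long-Term cleanup [Reuse] | Inspect and connect the existing modules | Wired to all layers |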
+
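+### Track 6: Schedule Management and Tool Integration (parallel track; can run alongside from Phase 1)
+
+| ID | Task | Description | Done when |
+|----|------|------|----------|
+| 6-1 | **Calendar MCP integration** [New] | Calendar read/write using the Datacollect Bridge pattern | Viewing, creating, and changing events works |
+| 6-2 | Scheduling Requirement Graph | Conflict detection, reminder rules, priority rules | Automatic warning on schedule conflicts |
+| 6-3 | External action approval gate | Applies non-goal 5 (no external actions without approval) | Write operations run only after approval |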
+
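+### Track 7: Advanced Learning and Growth (Phase 6~7, after passing G1)
+
+| ID | Module | Responsibility | Done when |
+|----|------|------|----------|
+| 7-1 | **Research Agent** [New] | Explore missing knowledge: search plan → collect → summarize | Learning Queue items are researched and summarized automatically |
+| 7-2 | **Skill Tree** [New] | Skill tree (e.g. SEO ├ Technical ├ Schema ├ Indexing └ CWV) | Trees defined for the main domains |
+| 7-3 | **Skill Score** [New] | Skill score 0~100 | Updated automatically from evaluation results |
+| 7-4 | **Success Pattern DB** [New] | Store and reuse success cases | Success patterns are injected into new tasks |
+| 7-5 | **Growth Analytics** [New] | Track growth (e.g. SEO 52→81) | Growth report by period |
+| 7-6 | **Curiosity Engine** [New] | Work patterns → learning candidates | Candidates are proposed to the Learning Queue |
+| 7-7 | **Predictive Learning** [New] | Learn ahead by predicting future demand (e.g. MCP/A2A) | Predicted candidates generated + approval flow |
+| 7-8 | **Experiment Engine** [New] | A/B test working methods (e.g. 5 searches vs 10) | One full cycle of experiment → result → method update completed |
+| 7-9 | **Goal Success Metrics** [New] | Evaluate that task completion ≠ goal achievement | Separately measure whether the output contributed to the user's goal |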
+
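+### Track 8: Governance (cross-cutting; applied incrementally from Phase 1)
+
+| ID | Module | Description | Done when |
+|----|------|------|----------|
+| 8-1 | **Goal Lock** | Perform only user-defined goals; creating, changing, or redefining goals is forbidden | System prompt + code guard |
+| 8-2 | **Permission Based Learning** | Knowledge is stored only after approval | Approval UI/flow works |
+| 8-3 | **Human Override** | Stop / delete / ignore / apply-immediately commands take precedence | Commands take effect immediately |
+| 8-4 | **Learning Sandbox** | Learning (search → Sandbox → validate → approve) is separated from operations | Unapproved knowledge is not used in production responses |
+| 8-5 | Permission scheme | Operated with 3 levels (internal enum keeps 0~5) | Per-level blocking of actions verified |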
+
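+### Track 9: Content (minor; handled in gaps between Phase 5 and 6)
+
+| ID | Task | Description | Done when |
+|----|------|------|----------|
+| 9-1 | Automatic guideline injection | Register blog guideline v4.1, personal-experience rules, E-E-A-T, and banned expressions in skillInjectionService | Applied automatically when generating posts |
+| 9-2 | Success content RAG | topic → retrieve past top-performing posts → extract patterns → feed as input | Pipeline works |
+| 9-3 | Content review | Reuse the Track 2 Critic (3 steps: write → review → revise) | Only reviewed versions are output |
+| 9-4 | Accumulate good/bad outputs | collect → analyze → turn into rules → reflect in prompts | Dataset + rules document |
+| 9-5 | CoT reasoning prompt | problem analysis → hypothesis → verification → final answer + self-check | Applied to reasoning-type queries |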
+
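+### Track 10: Engineering Breakdown (only the relevant scope, at the start of each Phase)
+
+44. DB/storage schema → 45. Agent state model → 46. Event model → 47. Queue structure → 48. API/message specification → 49. Per-module input/output JSON schemas
+
+> Do not design everything at once; break down **only that Phase's scope when the Phase starts** (avoids big-bang design).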
+
+---
+
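+## 6. Development Roadmap (Phase 0~8 + gates)
+
+| Phase | Scope (Track) | Deliverables | Gate |
+|-------|------------|--------|--------|
+| **P0** | Track 0 preparation | Golden sets, inventory, decision records | P1 cannot be scored without golden sets |
+| **P1** | Track 2 task quality + Track 8 basic guards | 3 Requirement Graphs, Task Analyzer, Critic Loop, Reflection | Start measuring the golden set pass rate |
+| **P2** | Track 1 reliability core | Confidence, Escalation, Anti-Hallucination hardening, Provenance | Asks questions when confidence is low |
+| **P3** | Track 3 growth loop | Gap/Need/SelfEval/Queue + Failure DB + Journal | **G1: 2-week real-use loop validation** |
+| **P4** | Track 4 knowledge operations | Validation, Belief Revision, Decay, Debt (+ Graph re-evaluation) | Knowledge conflicts and staleness handled automatically |
+| **P5** | Track 5 memory and context + Track 9 content | User/Org Memory, stronger Episodic use, content pipeline | Personalization confirmed in outputs |
+| **P6** | Track 7, first half | Research Agent, Skill Tree/Score, Success DB, Growth Analytics | One cycle of learning → skill update |
+| **P7** | Track 7, second half | Curiosity, Predictive, Experiment, Goal Success Metrics | One self-improvement experiment completed |
+| **P8** | Integration (original design Phase 9) | Full Self-Evolving operation: continuous growth, long-term learning, organizational adaptation, deeper expertise | KPI trends rising |
+| **Parallel** | Track 6 scheduling/calendar | Proceeds independently from P1 | Approval gate required |
+
+Mapping to the original design's Phases 1~9: orig P1→P0/P1, orig P2→P1/P3, orig P3→P2/P3, orig P4→P1/P3, orig P5→P3/P4/P6, orig P6→P6, orig P7→P7, orig P8→P5/P2/P7, orig P9→P8. **No original design item is omitted.**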
+
+---
+
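+## 7. KPI
+
+| Category | Metrics |
+|------|------|
+| Task | Task success rate (golden set and Requirement fulfillment rate), user satisfaction, rework rate |
+| Trust | Number of unsupported assertions, escalation appropriateness (too few / too many questions), source traceability rate |
+| Growth | Skill Score growth rate, Need Accuracy (how often learning priorities hit), failure pattern recurrence rate |
+| Learning | Post-learning performance improvement, knowledge utilization rate, share of incorrect knowledge |
+
+Measurement basis: evalHarness + per-task golden sets (P0 deliverable). **No improvement without measurement.**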
+
+---
+
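+## 8. Risks and Mitigations
+
+| Risk | Mitigation |
+|--------|------|
+| Big-bang development leaves the loop unvalidated | Enforce gate G1: no Track 7 work before it passes |
+| Knowledge piles up without bound | Knowledge Decay + Debt (Track 4) |
+| Contamination by incorrect knowledge | Validation + Provenance + Learning Sandbox |
+| Too many questions (escalation overuse) erode trust | Tune using the escalation appropriateness KPI |
+| Performance limits of local Gemma 4 | Triple compensation via prompts, RAG, and review; consider per-task model routing if needed |
+| Burden of single-person operation | Batch the approval flow (daily approval queue) |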
+
+---
+
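+## 9. Progress
+
+- [x] Design spec v1.0 (provided by the user)
+- [x] Master plan v1.1 (this document)
+- [x] P0: 8 meeting-minutes golden set records (`E:\Wiki\2nd\10_Wiki\Topics\.astra\eval\tasks\meeting-minutes.golden.jsonl`, based on D:/Meet transcripts; the reference fields are LLM drafts, so correction through user review is recommended) · 3 templates (`assets/eval-templates/tasks/`). Market research and work research golden sets are not built yet
+- [x] P1 (partial): Requirement Graph for 4 task types (`src/intelligence/requirementGraph.ts`) + coverage hook. Task Analyzer enhancement, Critic Loop, and Reflection not started
+- [x] P2: Confidence Engine (`confidenceEngine.ts`) / Escalation Engine (`escalationEngine.ts`) / Epistemic Guard (`epistemicGuardBlock.ts`) / Provenance (citationTrace extension). 2026-06-11, 32 tests
+- [x] P1 remainder: Critic Agent (`criticAgent.ts`, conditional 1-pass review: one LLM call, only on turns with missing coverage or confidence <70) + Reflection Engine (`reflectionStore.ts`, `<brain>/.astra/growth/reflections.jsonl`). 2026-06-11
+- [x] P3 (partial): Self Evaluation v1: Task Eval Harness (`taskEvalHarness.ts`) + command `g1nation.eval.tasks` (auto-scores the meeting-minutes golden set) + `g1nation.growth.report` (weekly confidence / miss-rate trend + top repeated mistakes). Failure Pattern v1: elements missed repeatedly (3+ times) are automatically emphasized in the Requirement Graph block, closing the T5 loop for the first time
+- [x] P3 complete (all 4 core modules implemented, 2026-06-11): Gap Detector (`gapDetector.ts`, per-turn Requirement−Knowledge), Need Engine (`needEngine.ts`, the design spec's 30/25/20/15/10 formula) + Knowledge Inventory v1 (held / insufficient / none), Learning Queue (`learningQueue.ts`, merges as proposed only; approval is human-only, in line with Permission Based Learning), Decision Journal v1 (the factors/usedSources fields of a reflection). Command: `g1nation.growth.learningQueue`
+- [ ] **Gate G1: 2-week real-use validation** (current position): while handling real task turns, confirm that ① Reflection records accumulate ② repeated-miss emphasis fires ③ Need is computed ④ the queue proposal → approval flow actually runs. No P6/P7 (advanced learning) work before passing
+- [ ] Next measurement: run `Astra: 업무 평가 실행` once in VS Code → establish the coverage baseline (the zero point of the growth chart)
+- [x] P4 (2026-06-11): Knowledge Validation + Belief Revision (`knowledgeValidation.ts`: rejects duplicates and recommends update/add on conflict; it only issues verdicts, and storage goes through the approval flow; wiring into the Research Agent awaits P6), Knowledge Decay (`knowledgeDecay.ts` + command `g1nation.knowledge.decayAudit`: per-domain half-life audit, non-invasive, report only), Knowledge Debt (`computeKnowledgeDebt` inside needEngine, integrated into the learning-needs report). Knowledge Graph deferred as planned
+- [x] P5 (partial, 2026-06-11): Organizational Memory (`orgMemoryBlock.ts`: `<brain>/.astra/organization.md` is always injected; the file is the UI). User Memory is handled by the existing LongTermMemory (judged to need no further development); Episodic use is covered by the existing 5-layer retrieval
+- [x] P6 (partial, 2026-06-11): Research Agent (`researchAgent.ts` + command `g1nation.research.runQueue`: approved queue item → research brief (LLM) + internal knowledge status (brain search) + draft with estimate labels + Validation gate → proposals/<id>.md, status automatically switched to in-progress. External evidence collection is directed to /research and /benchmark; because the Bridge has no general-purpose search API, this is an intentional human-in-the-loop point). Skill Score (`skillScore.ts`: confidence 50% + fulfillment rate 30% + non-escalation 20%, first-half vs second-half trend) + Success Pattern DB (auto-stores turns with all elements met and confidence 90+), integrated into the growth report
+- [ ] P6 remainder: Growth Analytics enhancement (period comparison charts), injecting success patterns into new turns (best-practice few-shot)
+- [ ] P7: Curiosity / Predictive / Experiment Engine, Goal Success Metrics: **after gate G1 passes and reflection data accumulates** (built without data, they would be empty engines)
+- [x] Parallel: calendar integration (2026-06-11): schedule conflict gate: `conflictCheck.ts` (detects overlap of time ranges / all-day events) + structured event cache (`calendar_cache.json`, generated together with the md on refresh) + blocking wired into the `<create_calendar_event>` action (on conflict, creation is held and the user is asked to confirm; `force="true"` only after user approval). Already in place: Google OAuth, event creation, ICS cache, Tasks API
+- [ ] Content track (minor): register guideline documents in `.agent/skills/` (user task); Critic reuse is already implemented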
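The 30/25/20/15/10 Need formula referenced in the plan (Track 3-3) can be illustrated with a short weighted-sum sketch. Only the weights come from the plan; the signal names, the [0, 1] input scale, and the functions `needScore` and `rankNeeds` are assumptions made for illustration and may differ from `needEngine.ts`.

```typescript
// Hypothetical sketch of the 30/25/20/15/10 Need formula. Signal names and the
// [0, 1] input scale are assumptions; only the weights come from the plan.
interface NeedSignals {
  infoGap: number;       // how often information was missing
  failureRate: number;   // share of failed turns
  frequency: number;     // how often the topic comes up
  confidenceGap: number; // 1 - average confidence
  feedback: number;      // strength of negative user feedback
}

const NEED_WEIGHTS: Record<keyof NeedSignals, number> = {
  infoGap: 30, failureRate: 25, frequency: 20, confidenceGap: 15, feedback: 10,
};

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

/** Weighted sum on a 0~100 scale; each input is clamped to [0, 1] first. */
function needScore(s: NeedSignals): number {
  let total = 0;
  for (const key of Object.keys(NEED_WEIGHTS) as Array<keyof NeedSignals>) {
    total += NEED_WEIGHTS[key] * clamp01(s[key]);
  }
  return total;
}

/** Sorts topics by descending need score. */
function rankNeeds(topics: Record<string, NeedSignals>): Array<[string, number]> {
  return Object.keys(topics)
    .map((topic): [string, number] => [topic, needScore(topics[topic])])
    .sort((a, b) => b[1] - a[1]);
}

const ranked = rankNeeds({
  seo: { infoGap: 0.2, failureRate: 0.1, frequency: 0.9, confidenceGap: 0.2, feedback: 0.1 },
  ga4: { infoGap: 1, failureRate: 0.8, frequency: 0.5, confidenceGap: 0.6, feedback: 0 },
});
console.log(ranked); // ga4 ranks above seo
```

Because the weights sum to 100, a topic that maxes out every signal scores exactly 100, which keeps the result on the same scale as the confidence and skill scores elsewhere in the plan.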
@@ -46,6 +46,26 @@
        "command": "g1nation.eval.retrieval",
        "title": "Astra: 검색 평가 실행 (recall@k / MRR)"
      },
+      {
+        "command": "g1nation.eval.tasks",
+        "title": "Astra: 업무 평가 실행 (회의록 골든셋)"
+      },
+      {
+        "command": "g1nation.growth.report",
+        "title": "Astra: 성장 리포트 (Reflection 추이)"
+      },
+      {
+        "command": "g1nation.growth.learningQueue",
+        "title": "Astra: 학습 큐 갱신 (Need Engine)"
+      },
+      {
+        "command": "g1nation.knowledge.decayAudit",
+        "title": "Astra: 지식 노후 점검 (Knowledge Decay)"
+      },
+      {
+        "command": "g1nation.research.runQueue",
+        "title": "Astra: 학습 실행 (Research Agent — 승인된 큐 항목)"
+      },
      {
        "command": "g1nation.embeddings.backfill",
        "title": "Astra: 두뇌 임베딩 전체 색인"
@@ -625,6 +645,46 @@
          "default": true,
          "description": "Chain-of-Verification (CoVe) — 답변 *작성 전* 그라운딩 체크리스트를 시스템 프롬프트에 주입해 모델이 self-verify 하도록. 할루시네이션 방지 + 출처 명확화. 기본 켜짐."
        },
+        "g1nation.requirementGraphEnabled": {
+          "type": "boolean",
+          "default": true,
+          "description": "Requirement Graph — 업무 유형(회의록/시장조사/업무조사/일정) 감지 시 필수 요소 체크리스트를 시스템 프롬프트에 주입. 필수 요소 누락 방지. 기본 켜짐."
+        },
+        "g1nation.requirementCoverageEnabled": {
+          "type": "boolean",
+          "default": true,
+          "description": "Requirement Coverage Check — 답변 완료 후 업무 필수 요소 커버리지를 결정론적(정규식)으로 검사, 누락 가능 요소를 footer 한 줄로 표시. LLM 호출 없음. 기본 켜짐."
+        },
+        "g1nation.epistemicGuardEnabled": {
+          "type": "boolean",
+          "default": true,
+          "description": "Epistemic Guard — 모름/추정/확실 3분류를 강제하는 시스템 프롬프트 블록. 검색 근거 없는 turn 에서 단정 금지 + 원자료 역질문 우선. 환각 방지. 기본 켜짐."
+        },
+        "g1nation.confidenceEngineEnabled": {
+          "type": "boolean",
+          "default": true,
+          "description": "Confidence Engine — 답변 확신도 0~100 을 검색 그라운딩·출처 인용·충돌·커버리지 신호로 결정론적 산출, 업무 답변 아래 footer 표시. LLM 호출 없음. 기본 켜짐."
+        },
+        "g1nation.escalationEnabled": {
+          "type": "boolean",
+          "default": true,
+          "description": "Escalation Engine — 확신도 낮음/출처 충돌/조사 출처 누락 시 footer 로 사람 검토를 명시적으로 요청. confidenceEngine 에 종속. 기본 켜짐."
+        },
+        "g1nation.criticLoopEnabled": {
+          "type": "boolean",
+          "default": true,
+          "description": "Critic Loop — 커버리지 누락 또는 확신도<70 인 업무 답변에만 LLM 검수 1회 실행, 발견 이슈와 보완 제안을 footer 카드로 표시. 깨끗한 답변에는 안 돌아 latency 영향 최소. 기본 켜짐."
+        },
+        "g1nation.reflectionEnabled": {
+          "type": "boolean",
+          "default": true,
+          "description": "Reflection — 업무 turn 회고(확신도·누락 요소·에스컬레이션)를 두뇌 .astra/growth/reflections.jsonl 에 기록. 반복 누락 요소는 다음 turn 의 필수 요소 체크리스트에 강조된다 (같은 실수 반복 방지). 기본 켜짐."
+        },
+        "g1nation.orgMemoryEnabled": {
+          "type": "boolean",
+          "default": true,
+          "description": "Organizational Memory — 두뇌 .astra/organization.md 의 조직 규칙·업무 방식·선호를 시스템 프롬프트에 항상 주입. 파일을 직접 편집하면 다음 turn 부터 반영. 파일 없으면 동작 안 함. 기본 켜짐."
+        },
        "g1nation.coveTopSourcesCount": {
          "type": "number",
          "default": 5,
@@ -300,12 +300,15 @@ export class AgentExecutor {
        dynamicBlocks: Map<string, string>;
        /** For self-check: (title, content) summaries of the selected chunks. Filled by memoryContext. */
        selfCheckSources: Array<{ title: string; excerpt: string }>;
+        /** Confidence Engine retrieval signals (Phase 2). Filled by memoryContext. */
+        confidenceSignals: import('./intelligence/confidenceEngine').RetrievalConfidenceSignals | null;
    } = {
        retrieval: null,
        lessons: [],
        knowledgeMix: null,
        dynamicBlocks: new Map(),
        selfCheckSources: [],
+        confidenceSignals: null,
    };

    /** Clears all per-turn state. Called on turn start, abort, and session load. */
@@ -315,6 +318,7 @@ export class AgentExecutor {
        this._turnCtx.knowledgeMix = null;
        this._turnCtx.dynamicBlocks.clear();
        this._turnCtx.selfCheckSources = [];
+        this._turnCtx.confidenceSignals = null;
    }

    private readonly options: AgentExecutorOptions;
@@ -1221,9 +1225,13 @@ export class AgentExecutor {
                    contextLength: ctxLimits.contextLength,
                    engine,
                    selfCheckSources: this._turnCtx.selfCheckSources,
+                    confidenceSignals: this._turnCtx.confidenceSignals,
                    callNonStreaming: (p) => this.callNonStreaming(p),
                    getAbortSignal: () => this.abortController?.signal,
                    getWebview: () => this.webview,
+                    getBrainPath: () => {
+                        try { return getActiveBrainProfile()?.localBrainPath; } catch { return undefined; }
+                    },
                });
            } else {
                this.webview.postMessage({ type: 'streamChunk', value: finalAssistantContent });
@@ -16,6 +16,29 @@ export async function applyCalendarActions(ctx: HandlerContext): Promise<void> {
            report.push(`❌ Calendar Event: title / start 누락`);
            continue;
        }
+        // ── Conflict gate (Self-Evolving OS Track 6-2/6-3): hold creation when the new event overlaps an existing one.
+        // force="true" is allowed only after user confirmation (Constitution: no external actions without approval).
+        try {
+            const { readCalendarEventsCache } = await import('../../features/calendar');
+            const { findScheduleConflicts, formatConflictReport } = await import('../../features/calendar/conflictCheck');
+            const existing = readCalendarEventsCache(ctx.context);
+            const conflicts = findScheduleConflicts(existing, {
+                startIso: attrs.start,
+                endIso: attrs.end,
+                durationMinutes: attrs.duration,
+                allDay: attrs.allDay,
+            });
+            if (conflicts.length > 0 && attrs.force !== true) {
+                const msg = formatConflictReport(conflicts);
+                report.push(`⚠️ Calendar Event 보류 — ${attrs.title}: 일정 충돌 ${conflicts.length}건`);
+                ctx.chatHistory.push({
+                    role: 'system',
+                    content: `[Calendar conflict — 생성 보류] "${attrs.title}" (${attrs.start})\n${msg}\n사용자에게 충돌 사실을 알리고 진행 여부를 물을 것.`,
+                    internal: true,
+                });
+                continue;
+            }
+        } catch { /* A failed conflict check must not block event creation; with no cache the check is skipped. */ }
        try {
            const { createCalendarEvent } = await import('../../features/calendar');
            const r = await createCalendarEvent(ctx.context, {
@@ -85,6 +85,8 @@ export function _parseCalEventAttrs(raw: string): {
    duration?: number;
    location?: string;
    allDay?: boolean;
+    /** Skip conflict detection and force creation. Must be set only after user confirmation (conflictCheck). */
+    force?: boolean;
 } {
    const out: any = {};
    // `-` 포함 키 (all-day) 지원 — 일부러 ATTR_RE 와 동일 패턴이지만 매번 fresh
@@ -110,6 +112,9 @@ export function _parseCalEventAttrs(raw: string): {
            case 'all-day':
                out.allDay = val === 'true' || val === '1' || val === 'yes';
                break;
+            case 'force':
+                out.force = val === 'true' || val === '1' || val === 'yes';
+                break;
        }
    }
    return out;
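The conflict gate wired into `applyCalendarActions` depends on `findScheduleConflicts`, whose body is not part of the hunks shown here. The following is a minimal sketch of one common way to detect such overlaps, using half-open time ranges. `CalEvent`, `toRange`, and `findOverlaps` are hypothetical names, and the 60-minute default duration and 24-hour all-day span are assumptions, not the behavior of `conflictCheck.ts`.

```typescript
// Hypothetical sketch: CalEvent, toRange, and findOverlaps are illustrative
// names, not the real exports of conflictCheck.ts.
interface CalEvent {
  title: string;
  startIso: string;
  endIso?: string;
  durationMinutes?: number;
  allDay?: boolean;
}

const MINUTE_MS = 60000;
const DAY_MS = 24 * 60 * MINUTE_MS;

/** Converts an event to a half-open [start, end) range in epoch milliseconds. */
function toRange(e: CalEvent): [number, number] {
  const start = Date.parse(e.startIso);
  if (e.allDay) return [start, start + DAY_MS]; // assumption: all-day spans 24h from start
  if (e.endIso) return [start, Date.parse(e.endIso)];
  return [start, start + (e.durationMinutes ?? 60) * MINUTE_MS]; // assumption: 60-minute default
}

/** Returns the existing events whose range overlaps the candidate's range. */
function findOverlaps(existing: CalEvent[], candidate: CalEvent): CalEvent[] {
  const [cs, ce] = toRange(candidate);
  return existing.filter((e) => {
    const [es, ee] = toRange(e);
    return cs < ee && es < ce; // half-open ranges: back-to-back events do not collide
  });
}

const existing: CalEvent[] = [
  { title: "Weekly sync", startIso: "2026-06-15T10:00:00+09:00", endIso: "2026-06-15T11:00:00+09:00" },
  { title: "Lunch", startIso: "2026-06-15T12:00:00+09:00", endIso: "2026-06-15T13:00:00+09:00" },
];

const overlapping = findOverlaps(existing, {
  title: "Vendor call", startIso: "2026-06-15T10:30:00+09:00", durationMinutes: 60,
});
const backToBack = findOverlaps(existing, {
  title: "Review", startIso: "2026-06-15T11:00:00+09:00", durationMinutes: 60,
});
console.log(overlapping.map((e) => e.title), backToBack.length);
```

Treating ranges as half-open means a meeting that starts exactly when another ends is not reported, which avoids asking the user to confirm every back-to-back booking.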
@@ -7,12 +7,22 @@
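 *  1. devilRebuttal: Devil Agent rebuttal card (silent skip when disabled)
 *  2. postHocSelfCheck: LLM call that verifies the answer (opt-in, default OFF)
 *  3. termValidator: deterministic check for forbidden glossary terms (default ON)
+ *  4. requirementCoverage: deterministic coverage check of required task elements (default ON)
+ *  5. confidenceEscalation: confidence scoring + human review request + Reflection recording (default ON)
+ *  6. criticLoop: one LLM review pass, only on task turns flagged by the deterministic checks (default ON)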
 */

 import type { PostAnswerHook, PostAnswerHookContext } from './types';
 import { maybeEmitDevilRebuttal as maybeEmitDevilRebuttalFn } from '../llm/devilRebuttal';
 import { postHocSelfCheck, formatSelfCheckFooter, DEFAULT_SELF_CHECK_OPTIONS } from '../postHocSelfCheck';
 import { validateTermUsage, formatTermValidatorFooter } from '../termValidator';
+import { checkRequirementCoverage, formatRequirementCoverageFooter, detectTaskType } from '../../intelligence/requirementGraph';
+import { extractAnswerSignals, computeConfidence, formatConfidenceFooter } from '../../intelligence/confidenceEngine';
+import { decideEscalation, formatEscalationFooter } from '../../intelligence/escalationEngine';
+import { runCriticReview, formatCriticFooter } from '../../intelligence/criticAgent';
+import { appendReflection } from '../../intelligence/reflectionStore';
+import { detectGaps } from '../../intelligence/gapDetector';
+import { appendSuccessPattern } from '../../intelligence/skillScore';
 import { getConfig } from '../../config';

 const devilRebuttalHook: PostAnswerHook = {
@@ -74,10 +84,147 @@ const termValidatorHook: PostAnswerHook = {
    },
 };

+const requirementCoverageHook: PostAnswerHook = {
+    id: 'requirement-coverage',
+    runAsync: false,
+    run(ctx: PostAnswerHookContext): void {
+        const cfg = getConfig();
+        if (cfg.requirementCoverageEnabled === false) return;
+        if (!ctx.userPrompt.trim() || !ctx.assistantAnswer.trim()) return;
+        const result = checkRequirementCoverage(ctx.userPrompt, ctx.assistantAnswer);
+        const footer = formatRequirementCoverageFooter(result);
+        if (footer) ctx.getWebview()?.postMessage({ type: 'streamChunk', value: footer });
+    },
+};
+
+const confidenceEscalationHook: PostAnswerHook = {
+    id: 'confidence-escalation',
+    runAsync: false,
+    run(ctx: PostAnswerHookContext): void {
+        const cfg = getConfig();
+        if (cfg.confidenceEngineEnabled === false) return;
+        if (!ctx.userPrompt.trim() || !ctx.assistantAnswer.trim()) return;
+
+        // Turns where retrieval did not run (casual chat, etc.) have null signals → conservative defaults (zero evidence).
+        const retrievalSignals = ctx.confidenceSignals ?? {
+            chunkCount: 0, topScore: 0, conflictCount: 0, ambiguityDetected: false,
+        };
+        const coverage = checkRequirementCoverage(ctx.userPrompt, ctx.assistantAnswer);
+        const answerSignals = extractAnswerSignals(
+            ctx.assistantAnswer,
+            coverage.ran ? coverage.missing.length : null,
+        );
+        const confidence = computeConfidence(retrievalSignals, answerSignals);
+
+        // Show the footer only on task-output turns; scoring small talk would just add noise.
+        // Exception: a 'very-low' confidence band is shown regardless of task type (T4).
+        const isTask = coverage.ran || coverage.taskId !== undefined;
+        if (!isTask && confidence.band !== 'very-low') return;
+
+        let footer = formatConfidenceFooter(confidence);
+        let escalated = false;
+        if (cfg.escalationEnabled !== false) {
+            const decision = decideEscalation({
+                confidence, coverage, conflictCount: retrievalSignals.conflictCount,
+            });
+            escalated = decision.escalate;
+            footer += formatEscalationFooter(decision);
+        }
+        if (footer) ctx.getWebview()?.postMessage({ type: 'streamChunk', value: footer });
+
+        // ── Reflection record (Track 2-4 / 3-6): append the deterministic retrospective of a task turn to
+        // <brain>/.astra/growth/reflections.jsonl. It is the source for growth trends and Failure Patterns.
+        if (cfg.reflectionEnabled !== false) {
+            const task = detectTaskType(ctx.userPrompt);
+            const brainPath = ctx.getBrainPath?.();
+            if (task && brainPath) {
+                // Gap Detector (Track 3-2): Requirement − Knowledge. Input to the Need Engine.
+                const gap = detectGaps({ coverage, signals: retrievalSignals, taskId: task.id });
+                const reflectionRecord = {
+                    ts: new Date().toISOString(),
+                    taskId: task.id,
+                    taskLabel: task.label,
+                    confidenceScore: confidence.score,
+                    confidenceBand: confidence.band,
+                    missing: coverage.ran ? coverage.missing : [],
+                    escalated,
+                    criticIssues: null, // Critic runs as a separate async hook; not aggregated in v1
+                    promptPreview: ctx.userPrompt.replace(/\s+/g, ' ').slice(0, 120),
+                    // Decision Journal v1 (Track 3-7): lets the basis of a decision be traced back.
+                    factors: confidence.factors.map((f) => `${f.label} (${f.delta > 0 ? '+' : ''}${f.delta})`),
+                    usedSources: (ctx.selfCheckSources || []).map((s) => s.title).slice(0, 5),
+                    // Gap signals.
+                    retrieval: { chunkCount: retrievalSignals.chunkCount, topScore: retrievalSignals.topScore },
+                    weakGrounding: gap.weakGrounding,
+                    gapSeverity: gap.severity,
+                };
+                appendReflection(brainPath, reflectionRecord);
+                // Success Pattern DB (Track 7-4): stores only turns with all elements met and confidence 90+.
+                appendSuccessPattern(brainPath, reflectionRecord);
+            }
+        }
+    },
+};
+
+const criticLoopHook: PostAnswerHook = {
+    id: 'critic-loop',
+    runAsync: true,
+    async run(ctx: PostAnswerHookContext): Promise<void> {
+        const cfg = getConfig();
+        if (cfg.criticLoopEnabled === false) return;
+        if (!ctx.userPrompt.trim() || !ctx.assistantAnswer.trim()) return;
+
+        // Gate: run one LLM review only on task turns where the deterministic checks flag a problem
+        // (protects local-model latency: clean answers skip the review).
+        const task = detectTaskType(ctx.userPrompt);
+        if (!task) return;
+        const coverage = checkRequirementCoverage(ctx.userPrompt, ctx.assistantAnswer);
+        const retrievalSignals = ctx.confidenceSignals ?? {
+            chunkCount: 0, topScore: 0, conflictCount: 0, ambiguityDetected: false,
+        };
+        const answerSignals = extractAnswerSignals(
+            ctx.assistantAnswer,
+            coverage.ran ? coverage.missing.length : null,
+        );
+        const confidence = computeConfidence(retrievalSignals, answerSignals);
+        const needsReview = (coverage.ran && coverage.missing.length > 0) || confidence.score < 70;
+        if (!needsReview) return;
+
+        const critique = await runCriticReview({
+            userPrompt: ctx.userPrompt,
+            draft: ctx.assistantAnswer,
+            requirement: task,
+            missingLabels: coverage.ran ? coverage.missing : [],
+            callLlm: async (system, user, maxTokens) => {
+                const r = await ctx.callNonStreaming({
+                    baseUrl: ctx.baseUrl,
+                    modelName: ctx.modelName,
+                    engine: ctx.engine,
+                    messages: [
+                        { role: 'system', content: system },
+                        { role: 'user', content: user },
+                    ],
+                    temperature: 0.2,
+                    maxTokens,
+                    contextLength: ctx.contextLength,
+                    signal: ctx.getAbortSignal(),
+                });
+                return r.text;
+            },
+        });
+        if (!critique) return; // LLM or parsing failure: skip silently, no effect on the main turn
+        const footer = formatCriticFooter(critique);
+        if (footer) ctx.getWebview()?.postMessage({ type: 'streamChunk', value: footer });
+    },
+};
+
 export const POST_ANSWER_HOOKS: PostAnswerHook[] = [
    devilRebuttalHook,
    postHocSelfCheckHook,
    termValidatorHook,
+    requirementCoverageHook,
+    confidenceEscalationHook,
+    criticLoopHook,
 ];

 /** Runs every hook safely: a throw in one hook does not block the others. */
@@ -27,12 +27,16 @@ export interface PostAnswerHookContext {
    engine: 'lmstudio' | 'ollama';
    /** Source previews for self-check. memoryContext fills these into turnCtx. */
    selfCheckSources: Array<{ title: string; excerpt: string }>;
+    /** Confidence Engine retrieval signals (Phase 2). Filled by memoryContext; null on turns where retrieval did not run. */
+    confidenceSignals?: import('../../intelligence/confidenceEngine').RetrievalConfidenceSignals | null;
    /** Called by the Devil Agent: non-streaming LLM. */
    callNonStreaming: (params: any) => Promise<{ text: string; stopReason?: string }>;
    /** Abort signal accessor. */
    getAbortSignal: () => AbortSignal | undefined;
    /** Webview accessor: posts hook results as streamChunk. Compatible with vscode.Webview and the lightweight Webview. */
    getWebview: () => PostMessageWebview | undefined;
+    /** Active brain path, used for Reflection records. Reflection is skipped when absent. */
+    getBrainPath?: () => string | undefined;
 }

 export interface PostAnswerHook {
@@ -100,6 +100,50 @@ export interface IAgentConfig {
     * Off by default because answers can become more academic and verbose.
     */
    coveStrictMode: boolean;
+    /**
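+     * Requirement Graph: when a task type (meeting minutes / market research / work research /
+     * scheduling) is detected, injects its required-element checklist into the system prompt
+     * to prevent omissions (trust condition T3). Default true. (Self-Evolving OS Phase 1 / Track 2-1)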
+     */
+    requirementGraphEnabled: boolean;
+    /**
+     * Requirement Coverage Check: after the answer completes, checks required-element coverage
+     * deterministically (regex) and lists possibly missing elements in a footer. No LLM call. Default true.
+     */
+    requirementCoverageEnabled: boolean;
+    /**
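+     * Epistemic Guard: a block that enforces the unknown / estimated / certain classification. On turns without
+     * retrieval grounding it forbids flat assertions and asks for source material first. Default true. (Phase 2 / Track 1-3)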
+     */
+    epistemicGuardEnabled: boolean;
+    /**
+     * Confidence Engine: computes a deterministic 0~100 confidence score and shows it in the footer of task turns.
+     * No LLM call. Default true. (Phase 2 / Track 1-1)
+     */
+    confidenceEngineEnabled: boolean;
+    /**
+     * Escalation Engine: requests human review in the footer on low confidence, source conflicts, or missing sources.
+     * Depends on confidenceEngine. Default true. (Phase 2 / Track 1-2)
+     */
+    escalationEnabled: boolean;
+    /**
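+     * Critic Loop: runs one LLM review only on task turns where a deterministic check (missing
+     * coverage or confidence < 70) flags a problem, and shows found issues and suggested fixes as a
+     * footer card. Clean answers skip it (latency protection). Default true. (Phase 1 / Track 2-3)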
+     */
+    criticLoopEnabled: boolean;
+    /**
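+     * Reflection: records a deterministic retrospective of each task turn (confidence, missing
+     * elements, escalation) to <brain>/.astra/growth/reflections.jsonl. It is the source for growth
+     * trends and repeated-mistake counts; repeatedly missed elements are highlighted in the
+     * Requirement Graph block. Default true. (Phase 1 / Track 2-4 + Phase 3 / Track 3-6)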
+     */
+    reflectionEnabled: boolean;
+    /**
+     * Organizational Memory: always injects the organization rules and working practices from
+     * <brain>/.astra/organization.md into the system prompt. No-op if the file is missing. Default true. (Phase 5 / Track 5-2)
+     */
+    orgMemoryEnabled: boolean;
    /**
     * Actionability: reweights retrieval results using "current work state" signals (recent slash
     * commands + open files). Adds an actionability boost to the TF-IDF match score so that the "context being worked on now"
@@ -452,6 +496,14 @@ export function getConfig(): IAgentConfig {
        coveEnabled: cfg.get<boolean>('coveEnabled', true),
        coveTopSourcesCount: Math.max(1, Math.min(15, cfg.get<number>('coveTopSourcesCount', 5))),
        coveStrictMode: cfg.get<boolean>('coveStrictMode', false),
+        requirementGraphEnabled: cfg.get<boolean>('requirementGraphEnabled', true),
+        requirementCoverageEnabled: cfg.get<boolean>('requirementCoverageEnabled', true),
+        epistemicGuardEnabled: cfg.get<boolean>('epistemicGuardEnabled', true),
+        confidenceEngineEnabled: cfg.get<boolean>('confidenceEngineEnabled', true),
+        escalationEnabled: cfg.get<boolean>('escalationEnabled', true),
+        criticLoopEnabled: cfg.get<boolean>('criticLoopEnabled', true),
+        reflectionEnabled: cfg.get<boolean>('reflectionEnabled', true),
+        orgMemoryEnabled: cfg.get<boolean>('orgMemoryEnabled', true),
        actionabilityEnabled: cfg.get<boolean>('actionabilityEnabled', true),
        distillationEnabled: cfg.get<boolean>('distillationEnabled', true),
        distillationAgeThresholdDays: Math.max(1, Math.min(365, cfg.get<number>('distillationAgeThresholdDays', 30))),
@@ -13,6 +13,22 @@ import {
    GOLDEN_TEMPLATE,
    GOLDEN_REL_JSONL,
 } from '../retrieval/evalHarness';
+import {
+    loadTaskGoldenSet,
+    runTaskEval,
+    formatTaskEvalReport,
+    TASK_GOLDEN_DIR,
+} from '../intelligence/taskEvalHarness';
+import { buildRequirementGraphBlock } from '../intelligence/requirementGraph';
+import { buildEpistemicGuardBlock } from '../intelligence/epistemicGuardBlock';
+import { simpleChatCompletion } from '../intelligence/llmCall';
+import { loadReflections, formatGrowthReport } from '../intelligence/reflectionStore';
+import { computeNeeds, knowledgeInventory, computeKnowledgeDebt, formatNeedsMarkdown } from '../intelligence/needEngine';
+import { auditKnowledgeDecay, formatDecayReport } from '../intelligence/knowledgeDecay';
+import { computeSkillScores, formatSkillScoresMarkdown, loadSuccessPatterns, formatSuccessPatternsMarkdown } from '../intelligence/skillScore';
+import { runResearch, formatProposalMarkdown } from '../intelligence/researchAgent';
+import type { ExistingKnowledgeRef } from '../intelligence/knowledgeValidation';
+import { loadQueue, saveQueue, mergeNeedsIntoQueue, formatQueueMarkdown, LEARNING_QUEUE_REL_PATH } from '../intelligence/learningQueue';

 /**
 * Retrieval evaluation command bundle (Phase 1-나).
@@ -25,6 +41,11 @@ export function registerEvalCommands(): vscode.Disposable[] {
    return [
        vscode.commands.registerCommand('g1nation.eval.retrieval', runRetrievalEvalCommand),
        vscode.commands.registerCommand('g1nation.embeddings.backfill', backfillEmbeddingsCommand),
+        vscode.commands.registerCommand('g1nation.eval.tasks', runTaskEvalCommand),
+        vscode.commands.registerCommand('g1nation.growth.report', growthReportCommand),
+        vscode.commands.registerCommand('g1nation.growth.learningQueue', learningQueueCommand),
+        vscode.commands.registerCommand('g1nation.knowledge.decayAudit', decayAuditCommand),
+        vscode.commands.registerCommand('g1nation.research.runQueue', researchRunQueueCommand),
    ];
 }

@@ -205,6 +226,278 @@ async function backfillEmbeddingsCommand(): Promise<void> {
    }
 }

+/**
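+ * Task evaluation (Self Evaluation v1, Phase 3 / Track 3-4): has the LLM write meeting minutes from
+ * each source in the meeting-minutes golden set and scores required-element coverage deterministically.
+ * Running the same golden set on every version shows growth as a score trend (same method as the retrieval eval).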
+ */
+async function runTaskEvalCommand(): Promise<void> {
+    try {
+        const brain = getActiveBrainProfile();
+        if (!brain?.localBrainPath || !fs.existsSync(brain.localBrainPath)) {
+            vscode.window.showErrorMessage('활성 두뇌 폴더를 찾을 수 없습니다. 먼저 두뇌를 추가/선택하세요.');
+            return;
+        }
+        const { records, parseErrors, sourcePath } = loadTaskGoldenSet(brain.localBrainPath, 'meeting-minutes');
+        if (records.length === 0) {
+            vscode.window.showWarningMessage(
+                `업무 골든셋이 없습니다: ${path.join(TASK_GOLDEN_DIR, 'meeting-minutes.golden.jsonl')}` +
+                (parseErrors ? ` (파싱 실패 ${parseErrors}줄)` : ''),
+            );
+            return;
+        }
+        const config = getConfig();
+        const model = config.defaultModel;
+        if (!model || !config.ollamaUrl) {
+            vscode.window.showErrorMessage('모델/엔진 설정이 없습니다 (defaultModel, ollamaUrl).');
+            return;
+        }
+
+        await vscode.window.withProgress(
+            { location: vscode.ProgressLocation.Notification, title: 'Astra 업무 평가 (회의록)', cancellable: true },
+            async (progress, token) => {
+                const result = await runTaskEval({
+                    records,
+                    readSource: (sourceFile) => fs.readFileSync(sourceFile, 'utf8'),
+                    generate: async (record, sourceContent) => {
+                        if (token.isCancellationRequested) throw new Error('취소됨');
+                        // Same instruction stack as production: inject the Requirement Graph and Epistemic Guard blocks.
+                        const system = [
+                            '너는 업무 비서다. 제공된 회의 전사를 회의록으로 정리한다.',
+                            buildRequirementGraphBlock(record.query),
+                            buildEpistemicGuardBlock({ chunkCount: 1, taskDetected: true }),
+                        ].filter(Boolean).join('\n\n');
+                        const user = `${record.query}\n\n[회의 전사]\n${sourceContent}`;
+                        return simpleChatCompletion(system, user, {
+                            baseUrl: config.ollamaUrl,
+                            model,
+                            temperature: 0.2,
+                            maxTokens: 1600,
+                            timeoutMs: 180000,
+                        });
+                    },
+                    onProgress: (done, total) => progress.report({ message: `${done}/${total} 레코드 평가 중…` }),
+                });
+
+                const now = new Date();
+                const stamp = now.toISOString().replace(/[:.]/g, '-').slice(0, 19);
+                const md = formatTaskEvalReport(result, {
+                    taskLabel: '회의록',
+                    brainName: brain.name,
+                    dateStr: now.toLocaleString(),
+                    modelName: model,
+                    notes: parseErrors ? `골든셋 파싱 실패 ${parseErrors}줄 (무시됨)` : undefined,
+                });
+                const reportPath = path.join(brain.localBrainPath, TASK_GOLDEN_DIR, `report-${stamp}.md`);
+                fs.mkdirSync(path.dirname(reportPath), { recursive: true });
+                fs.writeFileSync(reportPath, md, 'utf8');
+                logInfo('Task eval complete.', { records: result.scores.length, avgCoverage: result.avgCoverage, reportPath });
+
+                const doc = await vscode.workspace.openTextDocument(vscode.Uri.file(reportPath));
+                await vscode.window.showTextDocument(doc, { preview: false });
+                vscode.window.showInformationMessage(
+                    `업무 평가 완료 · 평균 커버리지 ${(result.avgCoverage * 100).toFixed(1)}% · 전 요소 충족 ${result.perfectCount}/${result.scores.length}건 (골든셋: ${path.basename(sourcePath)})`,
+                );
+            },
+        );
+    } catch (err: any) {
+        logError('Task eval command failed.', { error: err?.message || String(err) });
+        vscode.window.showErrorMessage(`업무 평가 실패: ${err?.message ?? err}`);
+    }
+}
+
+/** Growth report: weekly trend of Reflection records (.astra/growth/reflections.jsonl) plus the top repeated mistakes. */
+async function growthReportCommand(): Promise<void> {
+    try {
+        const brain = getActiveBrainProfile();
+        if (!brain?.localBrainPath || !fs.existsSync(brain.localBrainPath)) {
+            vscode.window.showErrorMessage('활성 두뇌 폴더를 찾을 수 없습니다.');
+            return;
+        }
+        const records = loadReflections(brain.localBrainPath);
+        const md = [
+            formatGrowthReport(records),
+            formatSkillScoresMarkdown(computeSkillScores(records)),
+            formatSuccessPatternsMarkdown(loadSuccessPatterns(brain.localBrainPath)),
+        ].join('\n\n');
+        const reportPath = path.join(brain.localBrainPath, '.astra', 'growth', 'growth-report.md');
+        fs.mkdirSync(path.dirname(reportPath), { recursive: true });
+        fs.writeFileSync(reportPath, md, 'utf8');
+        const doc = await vscode.workspace.openTextDocument(vscode.Uri.file(reportPath));
+        await vscode.window.showTextDocument(doc, { preview: false });
+        if (records.length === 0) {
+            vscode.window.showInformationMessage('아직 Reflection 기록이 없습니다 — 업무(회의록/조사/일정) 요청을 처리하면 자동으로 쌓입니다.');
+        }
+    } catch (err: any) {
+        logError('Growth report command failed.', { error: err?.message || String(err) });
+        vscode.window.showErrorMessage(`성장 리포트 실패: ${err?.message ?? err}`);
+    }
+}
+
+/**
+ * Learning queue refresh (Phase 3 / Track 3-3 + 3-5): aggregates Reflection records through the Need Engine
+ * into learning priorities and merges them into the Learning Queue as *proposed*. Approval (approved) is done
+ * by a human directly in learning-queue.json: Permission Based Learning (Constitution 8-2).
+ */
+async function learningQueueCommand(): Promise<void> {
+    try {
+        const brain = getActiveBrainProfile();
+        if (!brain?.localBrainPath || !fs.existsSync(brain.localBrainPath)) {
+            vscode.window.showErrorMessage('활성 두뇌 폴더를 찾을 수 없습니다.');
+            return;
+        }
+        const records = loadReflections(brain.localBrainPath);
+        const needs = computeNeeds(records);
+        const inventory = knowledgeInventory(records);
+        const debt = computeKnowledgeDebt(records);
+
+        const queue = mergeNeedsIntoQueue(loadQueue(brain.localBrainPath), needs, new Date().toISOString());
+        saveQueue(brain.localBrainPath, queue);
+
+        // Human-readable summary md: Need rationale + Inventory + Debt + queue status.
+        const md = [formatNeedsMarkdown(needs, inventory, debt), formatQueueMarkdown(queue)].join('\n---\n\n');
+        const reportPath = path.join(brain.localBrainPath, '.astra', 'growth', 'learning-needs.md');
+        fs.mkdirSync(path.dirname(reportPath), { recursive: true });
+        fs.writeFileSync(reportPath, md, 'utf8');
+
+        const doc = await vscode.workspace.openTextDocument(vscode.Uri.file(reportPath));
+        await vscode.window.showTextDocument(doc, { preview: false });
+        const proposed = queue.filter((q) => q.status === 'proposed').length;
+        vscode.window.showInformationMessage(
+            records.length === 0
+                ? '아직 Reflection 기록이 없습니다 — 업무 turn 이 쌓이면 학습 우선순위가 산출됩니다.'
+                : `학습 큐 갱신 완료 · 제안 ${proposed}건 (승인은 ${LEARNING_QUEUE_REL_PATH} 에서 status 를 approved 로).`,
+        );
+    } catch (err: any) {
+        logError('Learning queue command failed.', { error: err?.message || String(err) });
+        vscode.window.showErrorMessage(`학습 큐 갱신 실패: ${err?.message ?? err}`);
+    }
+}
+
+/**
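+ * Knowledge decay audit (Phase 4 / Track 4-3): evaluates the mtime of every brain file against a per-domain
+ * half-life and opens a report of stale knowledge. v1 only reports; nothing is moved or deleted automatically (Human Override).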
+ */
+async function decayAuditCommand(): Promise<void> {
+    try {
+        const brain = getActiveBrainProfile();
+        if (!brain?.localBrainPath || !fs.existsSync(brain.localBrainPath)) {
+            vscode.window.showErrorMessage('활성 두뇌 폴더를 찾을 수 없습니다.');
+            return;
+        }
+        const allFiles = findBrainFiles(brain.localBrainPath);
+        const entries: Array<{ relPath: string; lastUpdated: number }> = [];
+        for (const f of allFiles) {
+            try {
+                const abs = path.isAbsolute(f) ? f : path.join(brain.localBrainPath, f);
+                const st = fs.statSync(abs);
+                entries.push({ relPath: path.relative(brain.localBrainPath, abs) || f, lastUpdated: st.mtimeMs });
+            } catch { /* file vanished, etc.: skip */ }
+        }
+        const items = auditKnowledgeDecay(entries);
+        const md = formatDecayReport(items, { brainName: brain.name, dateStr: new Date().toLocaleString() });
+        const reportPath = path.join(brain.localBrainPath, '.astra', 'growth', 'decay-report.md');
+        fs.mkdirSync(path.dirname(reportPath), { recursive: true });
+        fs.writeFileSync(reportPath, md, 'utf8');
+        const doc = await vscode.workspace.openTextDocument(vscode.Uri.file(reportPath));
+        await vscode.window.showTextDocument(doc, { preview: false });
+        const stale = items.filter((i) => i.status === 'stale').length;
+        vscode.window.showInformationMessage(`지식 노후 점검 완료 · ${entries.length}개 파일 중 노후 ${stale}개.`);
+    } catch (err: any) {
+        logError('Decay audit command failed.', { error: err?.message || String(err) });
+        vscode.window.showErrorMessage(`지식 노후 점검 실패: ${err?.message ?? err}`);
+    }
+}
+
+/**
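+ * Learning run (Phase 6 / Track 7-1, Research Agent): turns each *approved* Learning Queue item into
+ * a research package (brief + internal status + estimate-labeled draft + Validation verdict), saves it
+ * under proposals/, and sets its status to in-progress. Nothing is saved into the brain automatically;
+ * it becomes knowledge only after a human reinforces it with external evidence and approves it (Permission Based Learning).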
+ */
+async function researchRunQueueCommand(): Promise<void> {
+    try {
+        const brain = getActiveBrainProfile();
+        if (!brain?.localBrainPath || !fs.existsSync(brain.localBrainPath)) {
+            vscode.window.showErrorMessage('활성 두뇌 폴더를 찾을 수 없습니다.');
+            return;
+        }
+        const config = getConfig();
+        const model = config.defaultModel;
+        if (!model || !config.ollamaUrl) {
+            vscode.window.showErrorMessage('모델/엔진 설정이 없습니다 (defaultModel, ollamaUrl).');
+            return;
+        }
+        const queue = loadQueue(brain.localBrainPath);
+        const approved = queue.filter((q) => q.status === 'approved');
+        if (approved.length === 0) {
+            vscode.window.showInformationMessage(
+                `승인된 학습 항목이 없습니다 — ${LEARNING_QUEUE_REL_PATH} 에서 status 를 approved 로 바꾼 뒤 다시 실행하세요.`,
+            );
+            return;
+        }
+
+        await vscode.window.withProgress(
+            { location: vscode.ProgressLocation.Notification, title: 'Astra 학습 실행 (Research Agent)', cancellable: true },
+            async (progress, token) => {
+                const orchestrator = new RetrievalOrchestrator();
+                const allFiles = findBrainFiles(brain.localBrainPath);
+                getBrainTokenIndex(brain.localBrainPath, allFiles);
+
+                const fetchInternalRefs = async (topic: string): Promise<ExistingKnowledgeRef[]> => {
+                    const ranked = orchestrator.rankBrainForEval(topic, brain, { limit: 5 }).slice(0, 5);
+                    const refs: ExistingKnowledgeRef[] = [];
+                    for (const r of ranked) {
+                        try {
+                            const abs = path.join(brain.localBrainPath, r.relativePath);
+                            const content = fs.readFileSync(abs, 'utf8').slice(0, 2000);
+                            const st = fs.statSync(abs);
+                            refs.push({ title: path.basename(r.relativePath), content, lastUpdated: st.mtimeMs, filePath: r.relativePath });
+                        } catch { /* skip unreadable */ }
+                    }
+                    return refs;
+                };
+
+                let done = 0;
+                const proposalsDir = path.join(brain.localBrainPath, '.astra', 'growth', 'proposals');
+                fs.mkdirSync(proposalsDir, { recursive: true });
+                const proposalPaths: string[] = [];
+                for (const item of approved) {
+                    if (token.isCancellationRequested) break;
+                    progress.report({ message: `${++done}/${approved.length} — ${item.topic}` });
+                    const pkg = await runResearch({
+                        item,
+                        fetchInternalRefs,
+                        callLlm: (system, user, maxTokens) => simpleChatCompletion(system, user, {
+                            baseUrl: config.ollamaUrl, model, temperature: 0.3, maxTokens, timeoutMs: 180000,
+                        }),
+                        nowIso: new Date().toISOString(),
+                    });
+                    const md = formatProposalMarkdown(pkg, { dateStr: new Date().toLocaleString(), modelName: model });
+                    const filePath = path.join(proposalsDir, `${item.id}.md`);
+                    fs.writeFileSync(filePath, md, 'utf8');
+                    proposalPaths.push(filePath);
+                    item.status = 'in-progress';
+                    item.updatedAt = new Date().toISOString();
+                }
+                saveQueue(brain.localBrainPath, queue);
+                logInfo('Research agent run complete.', { processed: proposalPaths.length });
+
+                if (proposalPaths.length > 0) {
+                    const doc = await vscode.workspace.openTextDocument(vscode.Uri.file(proposalPaths[0]));
+                    await vscode.window.showTextDocument(doc, { preview: false });
+                }
+                vscode.window.showInformationMessage(
+                    `학습 제안 ${proposalPaths.length}건 생성 (.astra/growth/proposals/). 외부 근거로 보강 후 두뇌에 저장하고 큐 상태를 done 으로 바꾸세요.`,
+                );
+            },
+        );
+    } catch (err: any) {
+        logError('Research run command failed.', { error: err?.message || String(err) });
+        vscode.window.showErrorMessage(`학습 실행 실패: ${err?.message ?? err}`);
+    }
+}
+
 /** Creates a template when the golden-set file is missing. Never overwrites an existing (broken or empty) file. */
 async function scaffoldGoldenSet(goldenPath: string, existingSource: string | null, parseErrors: number): Promise<boolean> {
    if (existingSource && fs.existsSync(existingSource)) {
@@ -196,6 +196,15 @@ export async function refreshCalendarCache(context: vscode.ExtensionContext): Pr
    try {
        fs.mkdirSync(path.dirname(cachePath), { recursive: true });
        fs.writeFileSync(cachePath, md, 'utf8');
+        // Structured JSON cache used by conflict detection (conflictCheck). Same snapshot time and range as the md.
+        const structured = upcoming.map((e) => ({
+            summary: e.summary,
+            startIso: e.start.toISOString(),
+            endIso: e.end ? e.end.toISOString() : undefined,
+            allDay: e.allDay,
+            location: e.location || undefined,
+        }));
+        fs.writeFileSync(_eventsJsonPath(cachePath), JSON.stringify(structured, null, 2), 'utf8');
    } catch (e: any) {
        return { ok: false, count: 0, error: `캐시 저장 실패: ${e?.message ?? String(e)}`, cachePath };
    }
@@ -215,6 +224,27 @@ export function readCalendarCache(context: vscode.ExtensionContext): string {
    }
 }

+function _eventsJsonPath(mdCachePath: string): string {
+    return mdCachePath.replace(/\.md$/, '.json');
+}
+
+/**
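+ * Reads the structured event cache for conflict detection. A missing or corrupt cache yields an empty array
+ * (conservative, so the conflict check never blocks event creation on a false positive).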
+ */
+export function readCalendarEventsCache(context: vscode.ExtensionContext): Array<{
+    summary: string; startIso: string; endIso?: string; allDay: boolean; location?: string;
+}> {
+    try {
+        const file = _eventsJsonPath(_cachePath(context));
+        if (!fs.existsSync(file)) return [];
+        const arr = JSON.parse(fs.readFileSync(file, 'utf8'));
+        return Array.isArray(arr) ? arr.filter((e: any) => e && typeof e.startIso === 'string') : [];
+    } catch {
+        return [];
+    }
+}
+
 function _renderMarkdown(events: IcsEvent[], daysAhead: number, now: Date): string {
    const tsLabel = (d: Date, allDay: boolean) => {
        const yy = d.getFullYear(), mm = String(d.getMonth() + 1).padStart(2, '0'), dd = String(d.getDate()).padStart(2, '0');
@@ -0,0 +1,81 @@
+/**
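+ * Schedule Conflict Check: detects overlap with existing events before a new one is created.
+ *
+ * Self-Evolving OS master plan, parallel tracks 6-2 + 6-3. This is the execution layer for
+ * the Requirement Graph's required scheduling element "conflict check" and for the
+ * Constitution rule "no external action without approval":
+ *   - *before* the agent creates an event via <create_calendar_event>, it is compared
+ *     against the ICS cache to detect overlap
+ *   - on conflict, creation is *blocked* and user confirmation is requested (it proceeds only with an explicit force="true")
+ *
+ * Pure module: no vscode or network dependency. calendarCache supplies the cache;
+ * agent/actions/calendar.ts wires the blocking.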
+ */
+
+export interface CachedCalEvent {
+    summary: string;
+    /** ISO string (toISOString output or local 'YYYY-MM-DDTHH:MM'). */
+    startIso: string;
+    /** Treated as one hour when absent. */
+    endIso?: string;
+    allDay: boolean;
+    location?: string;
+}
+
+export interface CandidateEvent {
+    startIso: string;
+    endIso?: string;
+    /** Used when endIso is absent. Defaults to 60 minutes. */
+    durationMinutes?: number;
+    allDay?: boolean;
+}
+
+const HOUR_MS = 3600000;
+const DAY_MS = 86400000;
+
+function parseIso(iso: string): number | null {
+    const t = Date.parse(iso);
+    return isNaN(t) ? null : t;
+}
+
+/** Computes the [start, end) range. An all-day event spans 00:00 of that date to 00:00 of the next day (local). */
+function rangeOf(startIso: string, endIso: string | undefined, durationMinutes: number | undefined, allDay: boolean): [number, number] | null {
+    const start = parseIso(startIso);
+    if (start === null) return null;
+    if (allDay) {
+        const d = new Date(start);
+        const dayStart = new Date(d.getFullYear(), d.getMonth(), d.getDate()).getTime();
+        return [dayStart, dayStart + DAY_MS];
+    }
+    let end: number | null = endIso ? parseIso(endIso) : null;
+    if (end === null) end = start + (durationMinutes && durationMinutes > 0 ? durationMinutes * 60000 : HOUR_MS);
+    if (end <= start) end = start + HOUR_MS; // guard against reversed input
+    return [start, end];
+}
+
+/**
+ * Returns the existing events that overlap the candidate. Unparseable input is conservatively treated as
+ * *no conflict* (creation itself will fail on a bad date, so it is not blocked here).
+ */
+export function findScheduleConflicts(existing: CachedCalEvent[], candidate: CandidateEvent): CachedCalEvent[] {
+    const cand = rangeOf(candidate.startIso, candidate.endIso, candidate.durationMinutes, candidate.allDay === true);
+    if (!cand) return [];
+    const [cs, ce] = cand;
+    const out: CachedCalEvent[] = [];
+    for (const ev of existing || []) {
+        const r = rangeOf(ev.startIso, ev.endIso, undefined, ev.allDay);
+        if (!r) continue;
+        const [es, ee] = r;
+        if (cs < ee && es < ce) out.push(ev); // half-open ranges overlap
+    }
+    return out;
+}
+
+/** Conflict report text, shared by the action report and the agent's internal message. */
+export function formatConflictReport(conflicts: CachedCalEvent[]): string {
+    const lines = conflicts.slice(0, 5).map((c) => {
+        const when = c.allDay ? `${c.startIso.slice(0, 10)} (종일)` : c.startIso.replace('T', ' ').slice(0, 16);
+        return `- ${c.summary} · ${when}${c.location ? ` · ${c.location}` : ''}`;
+    });
+    return `기존 일정과 겹칩니다:\n${lines.join('\n')}\n생성을 보류했습니다. 그래도 진행하려면 사용자 확인 후 force="true" 로 다시 시도하세요.`;
+}
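The overlap test used by `findScheduleConflicts` is the standard half-open interval check. The sketch below restates it outside the module so the boundary behavior is easy to see; `Range`, `overlaps`, and the sample timestamps are illustrative assumptions, not part of the commit.

```typescript
// Hypothetical standalone sketch; these names are not exported by conflictCheck.ts.
type Range = [number, number]; // half-open [start, end) in epoch milliseconds

// Two half-open ranges overlap when each one starts before the other ends.
function overlaps(a: Range, b: Range): boolean {
    return a[0] < b[1] && b[0] < a[1];
}

const at = (iso: string): number => Date.parse(iso);

// An existing one-hour meeting.
const existing: Range = [at('2026-06-12T01:00:00Z'), at('2026-06-12T02:00:00Z')];
// A candidate that starts exactly when the existing meeting ends.
const backToBack: Range = [at('2026-06-12T02:00:00Z'), at('2026-06-12T03:00:00Z')];
// A candidate that starts halfway through the existing meeting.
const partial: Range = [at('2026-06-12T01:30:00Z'), at('2026-06-12T02:30:00Z')];

const backToBackConflicts = overlaps(existing, backToBack);
const partialConflicts = overlaps(existing, partial);
```

Because the end bound is exclusive, back-to-back meetings are not reported as conflicts, while any partial overlap is.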
@@ -11,9 +11,17 @@ export {
    writeCalendarConfig,
    refreshCalendarCache,
    readCalendarCache,
+    readCalendarEventsCache,
    RefreshResult,
 } from './calendarCache';

+export {
+    findScheduleConflicts,
+    formatConflictReport,
+    CachedCalEvent,
+    CandidateEvent,
+} from './conflictCheck';
+
 export {
    runOAuthLoopback,
    refreshAccessToken,
@@ -0,0 +1,165 @@
+/**
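+ * Confidence Engine: deterministic 0~100 answer confidence score.
+ *
+ * Self-Evolving OS master plan Phase 2 / Track 1-1. The measurement basis for trust condition T4,
+ * "ask a human when unsure", and the input to the Escalation Engine.
+ *
+ * Design principle (same as termValidator): no LLM call. The score comes only from retrieval grounding
+ * signals (turn context) and answer-text signals (regex), so it runs safely every turn with no added LLM latency.
+ *
+ * The score measures not "how confident the model is" but "how much evidence exists for trusting this
+ * answer without verification". So when the model honestly marks "(확인 필요)" (needs confirmation),
+ * a *lower* score is the correct behavior, because it should prompt user review.
+ *
+ * Bands (design doc 7.5):
+ *   90+    high        : can be trusted as-is
+ *   70~89  medium      : passes for ordinary work
+ *   50~69  low         : review recommended for work deliverables
+ *   <50    very low    : further research / human review needed (Escalation)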
+ */
+
+/** Retrieval (pre-answer) signals, filled by memoryContext on each turn. */
+export interface RetrievalConfidenceSignals {
+    /** Number of selected retrieval chunks (excluding brain-trace). */
+    chunkCount: number;
+    /** Top chunk score (normalized to 0~1). 0 when there are no chunks. */
+    topScore: number;
+    /** Number of chunks whose conflictSeverity is not NONE. */
+    conflictCount: number;
+    /** Whether Intent Clarification detected ambiguity. */
+    ambiguityDetected: boolean;
+}
+
+export interface ConfidenceFactor {
+    /** Description of the factor that contributed to the score (shown in the footer). */
+    label: string;
+    /** Contribution to the score (±). */
+    delta: number;
+}
+
+export type ConfidenceBand = 'high' | 'medium' | 'low' | 'very-low';
+
+export interface ConfidenceResult {
+    score: number;            // 0~100
+    band: ConfidenceBand;
+    bandLabel: string;        // Korean display label: high / medium / low / very low
+    factors: ConfidenceFactor[];
+}
+
+const BAND_LABELS: Record<ConfidenceBand, string> = {
+    'high': '높음',
+    'medium': '보통',
+    'low': '낮음',
+    'very-low': '매우 낮음',
+};
+
+export function toBand(score: number): ConfidenceBand {
+    if (score >= 90) return 'high';
+    if (score >= 70) return 'medium';
+    if (score >= 50) return 'low';
+    return 'very-low';
+}
+
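+/** Signals extracted from the answer text. */
+export interface AnswerConfidenceSignals {
+    /** Number of hedge markers: "(확인 필요)", "추정", "확실하지 않", etc. */
+    hedgeCount: number;
+    /** Whether the source line at the end of the answer cites retrieved sources. */
+    hasCitation: boolean;
+    /** Whether the source line lists only "모델 지식" (model knowledge), i.e. no retrieved source was used. */
+    modelKnowledgeOnly: boolean;
+    /** Requirement coverage: number of missing elements, or null if the check did not run. */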
+    coverageMissing: number | null;
+}
+
+const HEDGE_PATTERN = /\(확인 필요\)|\(담당자 미정\)|\(기한 미정\)|추정(?:치|입니다|됩니다)?|확실하지 않|정확하지 않을 수|모르겠|알 수 없/g;
+
+/** Extracts signals from the answer text (deterministic). The caller supplies coverageMissing. */
+export function extractAnswerSignals(assistantAnswer: string, coverageMissing: number | null): AnswerConfidenceSignals {
+    const text = assistantAnswer || '';
+    const hedges = text.match(HEDGE_PATTERN);
+    const citationLine = /\*?출처:?\*?\s*(.+)/.exec(text);
+    const citationBody = citationLine ? citationLine[1] : '';
+    const modelKnowledgeOnly = /모델 지식/.test(citationBody);
+    return {
+        hedgeCount: hedges ? hedges.length : 0,
+        hasCitation: !!citationLine && !modelKnowledgeOnly,
+        modelKnowledgeOnly,
+        coverageMissing,
+    };
+}
+
+/**
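+ * Computes the confidence score. The weights are v1 heuristics; once the Phase 3 Self Evaluation golden set
+ * accumulates, they are to be calibrated against correlation with human ratings (KPI: Need Accuracy).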
+ */
+export function computeConfidence(
+    retrieval: RetrievalConfidenceSignals,
+    answer: AnswerConfidenceSignals,
+): ConfidenceResult {
+    const factors: ConfidenceFactor[] = [];
+    let score = 55; // neutral starting point, inside the "low" band (50~69) before any factor is applied
+
+    // ─── Grounding (up to +25 / -15) ───
+    if (retrieval.chunkCount >= 3 && retrieval.topScore >= 0.5) {
+        factors.push({ label: `검색 근거 ${retrieval.chunkCount}건(강)`, delta: +25 });
+    } else if (retrieval.chunkCount >= 1) {
+        factors.push({ label: `검색 근거 ${retrieval.chunkCount}건`, delta: +12 });
+    } else {
+        factors.push({ label: '검색 근거 없음 (모델 일반 지식)', delta: -15 });
+    }
+
+    // ─── Source citation (+8 / -5) ───
+    if (answer.hasCitation) {
+        factors.push({ label: '출처 인용 있음', delta: +8 });
+    } else if (answer.modelKnowledgeOnly) {
+        factors.push({ label: '모델 지식만 사용 명시', delta: -5 });
+    }
+
+    // ─── 지식 충돌 (건당 -8, 최대 -16) ───
+    if (retrieval.conflictCount > 0) {
+        const d = -Math.min(16, retrieval.conflictCount * 8);
+        factors.push({ label: `출처 간 충돌 ${retrieval.conflictCount}건`, delta: d });
+    }
+
+    // ─── 요청 모호성 (-10) ───
+    if (retrieval.ambiguityDetected) {
+        factors.push({ label: '요청 모호성 감지', delta: -10 });
+    }
+
+    // ─── Requirement 커버리지 (+10 / 누락당 -6, 최대 -18) ───
+    if (answer.coverageMissing !== null) {
+        if (answer.coverageMissing === 0) {
+            factors.push({ label: '필수 요소 전부 충족', delta: +10 });
+        } else {
+            const d = -Math.min(18, answer.coverageMissing * 6);
+            factors.push({ label: `필수 요소 ${answer.coverageMissing}개 누락 가능`, delta: d });
+        }
+    }
+
+    // ─── 헤지 표현 (개당 -4, 최대 -12) — 솔직한 불확실 표시 = 검토 유도 ───
+    if (answer.hedgeCount > 0) {
+        const d = -Math.min(12, answer.hedgeCount * 4);
+        factors.push({ label: `불확실 표시 ${answer.hedgeCount}곳`, delta: d });
+    }
+
+    for (const f of factors) score += f.delta;
+    score = Math.max(0, Math.min(100, Math.round(score)));
+    const band = toBand(score);
+    return { score, band, bandLabel: BAND_LABELS[band], factors };
+}
+
+/**
+ * 확신도 footer 한 줄. 항상 표시 (사용자가 매 답변의 신뢰 수준을 보도록) —
+ * 끄려면 g1nation.confidenceEngineEnabled=false.
+ */
+export function formatConfidenceFooter(result: ConfidenceResult): string {
+    const icon = result.band === 'high' ? '🟢' : result.band === 'medium' ? '🔵' : result.band === 'low' ? '🟡' : '🔴';
+    const top = result.factors
+        .slice()
+        .sort((a, b) => Math.abs(b.delta) - Math.abs(a.delta))
+        .slice(0, 3)
+        .map((f) => f.label)
+        .join(' · ');
+    return `\n\n> ${icon} **확신도 ${result.score}/100 (${result.bandLabel})** — ${top}`;
+}
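
As a worked example of the scoring arithmetic in `computeConfidence`, the sketch below re-states only the sum-and-clamp step in a self-contained form. `Factor` and `sumConfidence` are hypothetical names used for illustration; the deltas are the ones listed in the code above, and the band mapping (`toBand`) is deliberately left out because its thresholds are defined elsewhere.

```typescript
// Illustrative re-statement of the sum-and-clamp step in computeConfidence.
// `Factor` and `sumConfidence` are hypothetical names, not part of the module.
interface Factor {
    label: string;
    delta: number;
}

const BASE_SCORE = 55; // same neutral starting point as computeConfidence

function sumConfidence(factors: Factor[]): number {
    let score = BASE_SCORE;
    for (const f of factors) score += f.delta;
    // Round, then clamp to 0..100, as the original does.
    return Math.max(0, Math.min(100, Math.round(score)));
}

// Best case: strong grounding (+25), citation (+8), all required elements covered (+10).
const bestCase = sumConfidence([
    { label: 'strong grounding', delta: 25 },
    { label: 'citation present', delta: 8 },
    { label: 'full coverage', delta: 10 },
]);

// Arithmetic lower bound: every penalty at its cap. The raw sum is negative, so the clamp applies.
const worstCase = sumConfidence([
    { label: 'no grounding', delta: -15 },
    { label: 'model knowledge only', delta: -5 },
    { label: 'conflicts (cap)', delta: -16 },
    { label: 'ambiguity', delta: -10 },
    { label: 'missing elements (cap)', delta: -18 },
    { label: 'hedges (cap)', delta: -12 },
]);

console.log(bestCase, worstCase); // 98 0
```

Under these weights the score tops out at 98, so any threshold above that value could never be reached by the v1 heuristic.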
@@ -0,0 +1,174 @@
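+/**
+ * Critic Agent + Debate Loop (v1): LLM review of a submitted work deliverable.
+ *
+ * Self-Evolving OS master plan Phase 1 / Track 2-3. This is the LLM layer of
+ * trust condition T3: Requirement Coverage (deterministic, regex) only checks
+ * whether an element is *mentioned*, while the Critic checks whether the
+ * content is *substantive*, whether decided and open items are separated
+ * correctly, and whether there are unsupported assertions.
+ *
+ * The original Debate Loop is write → critique → rewrite → re-review, but
+ * because of the latency cost of local Gemma, v1 is a *conditional 1-pass
+ * review*: the Critic LLM is called once, and only when the deterministic
+ * checks (missing coverage or confidence < 70) signal a problem. The result is
+ * shown as a supplement card below the answer. Full multi-round debate is left
+ * to a later increment; only the config knob (maxRounds) is prepared.
+ *
+ * Every LLM dependency is injected (critique caller), so the module itself is
+ * pure and testable.
+ */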
+
+import type { TaskRequirement } from './requirementGraph';
+
+export interface CriticIssue {
+    severity: 'major' | 'minor';
+    description: string;
+}
+
+export interface CritiqueResult {
+    /** true = 검수 통과 (보완 불필요). */
+    pass: boolean;
+    issues: CriticIssue[];
+    /** 누락 요소를 보완하는 추가 섹션 제안 (Critic 이 생성 가능했을 때만). */
+    supplement: string;
+    /** 디버그용 원문 (파싱 실패 분석). */
+    raw?: string;
+}
+
+/** 주입형 LLM caller — agent.ts 의 callNonStreaming 또는 평가 하니스의 단순 호출. */
+export type CritiqueLlmCall = (system: string, user: string, maxTokens: number) => Promise<string>;
+
+export interface CriticOptions {
+    /** 검수 대상 초안 최대 길이 (chars) — 초과분 잘라서 전달. 기본 12000. */
+    maxDraftChars: number;
+    /** Critic 응답 max tokens. 기본 700. */
+    maxTokens: number;
+}
+
+export const DEFAULT_CRITIC_OPTIONS: CriticOptions = {
+    maxDraftChars: 12000,
+    maxTokens: 700,
+};
+
+export function buildCritiquePrompt(
+    userPrompt: string,
+    draft: string,
+    requirement: TaskRequirement | null,
+    missingLabels: string[],
+    opts: CriticOptions = DEFAULT_CRITIC_OPTIONS,
+): { system: string; user: string } {
+    const reqSection = requirement
+        ? [
+            `업무 유형: ${requirement.label}`,
+            '필수 요소:',
+            ...requirement.elements.map((e) => `- ${e.label}: ${e.hint}`),
+        ].join('\n')
+        : '업무 유형: (미분류)';
+    const missingSection = missingLabels.length > 0
+        ? `결정론적 검사가 누락 가능성을 표시한 요소: ${missingLabels.join(', ')}`
+        : '결정론적 검사 통과 (참고용 재확인)';
+
+    const system = [
+        '너는 업무 산출물 검수자(Critic)다. 동료가 작성한 초안을 비판적으로 검토한다.',
+        '검수 기준:',
+        '1. 필수 요소가 *내용으로* 충실한가 (단어만 등장 ≠ 충족).',
+        '2. 결정사항과 미결(논의만 된 것)이 구분돼 있는가.',
+        '3. 근거 없는 단정·지어낸 수치/이름/날짜가 없는가. 원문에 없는 내용 발견 시 major.',
+        '4. 정보가 없는 항목은 "(확인 필요)" 로 솔직히 표시했는가.',
+        '',
+        '반드시 아래 JSON *만* 출력 (다른 텍스트 금지):',
+        '{"pass": true|false, "issues": [{"severity": "major"|"minor", "description": "..."}], "supplement": "누락 보완 텍스트 (보완 불가능하면 빈 문자열)"}',
+        'supplement 는 초안에 *실제로 추가할 수 있는* 마크다운 섹션만. 원문에 없는 내용을 지어내 보완하는 것은 금지 — 그 경우 "(확인 필요)" 항목으로 작성.',
+    ].join('\n');
+
+    const draftCapped = draft.length > opts.maxDraftChars ? draft.slice(0, opts.maxDraftChars) + '\n…(잘림)' : draft;
+    const user = [
+        `[원래 요청]\n${userPrompt}`,
+        `[검수 기준 컨텍스트]\n${reqSection}\n${missingSection}`,
+        `[검수 대상 초안]\n${draftCapped}`,
+    ].join('\n\n');
+
+    return { system, user };
+}
+
+/** Critic LLM 응답에서 JSON 추출 — 코드펜스/잡설 섞여도 첫 균형 {} 블록을 파싱. */
+export function parseCritique(raw: string): CritiqueResult | null {
+    if (!raw || !raw.trim()) return null;
+    const start = raw.indexOf('{');
+    if (start === -1) return null;
+    // 균형 괄호 스캔 — 중첩 객체(issues 배열 내부) 안전.
+    let depth = 0;
+    let end = -1;
+    let inString = false;
+    let escaped = false;
+    for (let i = start; i < raw.length; i++) {
+        const ch = raw[i];
+        if (escaped) { escaped = false; continue; }
+        if (ch === '\\') { escaped = true; continue; }
+        if (ch === '"') { inString = !inString; continue; }
+        if (inString) continue;
+        if (ch === '{') depth++;
+        else if (ch === '}') {
+            depth--;
+            if (depth === 0) { end = i; break; }
+        }
+    }
+    if (end === -1) return null;
+    try {
+        const obj = JSON.parse(raw.slice(start, end + 1));
+        const issues: CriticIssue[] = Array.isArray(obj.issues)
+            ? obj.issues
+                .filter((i: any) => i && typeof i.description === 'string')
+                .map((i: any) => ({
+                    severity: i.severity === 'major' ? 'major' : 'minor',
+                    description: String(i.description).slice(0, 500),
+                }))
+            : [];
+        return {
+            pass: obj.pass === true && issues.length === 0,
+            issues,
+            supplement: typeof obj.supplement === 'string' ? obj.supplement.slice(0, 4000) : '',
+            raw: raw.slice(0, 200),
+        };
+    } catch {
+        return null;
+    }
+}
+
+/**
+ * Critic 검수 1회 실행. LLM 실패/파싱 실패 시 null — 호출자(hook)는 silent skip
+ * (검수 실패가 main turn 을 막지 않도록).
+ */
+export async function runCriticReview(params: {
+    userPrompt: string;
+    draft: string;
+    requirement: TaskRequirement | null;
+    missingLabels: string[];
+    callLlm: CritiqueLlmCall;
+    options?: Partial<CriticOptions>;
+}): Promise<CritiqueResult | null> {
+    const opts: CriticOptions = { ...DEFAULT_CRITIC_OPTIONS, ...(params.options || {}) };
+    const { system, user } = buildCritiquePrompt(params.userPrompt, params.draft, params.requirement, params.missingLabels, opts);
+    let raw: string;
+    try {
+        raw = await params.callLlm(system, user, opts.maxTokens);
+    } catch {
+        return null;
+    }
+    return parseCritique(raw);
+}
+
+/** 검수 결과 footer — pass 면 빈 문자열 (노이즈 방지). */
+export function formatCriticFooter(critique: CritiqueResult): string {
+    if (critique.pass) return '';
+    const lines: string[] = [];
+    lines.push('\n\n> 🔁 **검수 (Critic)** — 초안에서 발견된 문제:');
+    for (const issue of critique.issues.slice(0, 6)) {
+        const tag = issue.severity === 'major' ? '🔴' : '🟡';
+        lines.push(`> - ${tag} ${issue.description}`);
+    }
+    if (critique.supplement && critique.supplement.trim()) {
+        lines.push('>');
+        lines.push('> **보완 제안:**');
+        for (const l of critique.supplement.trim().split('\n')) {
+            lines.push(`> ${l}`);
+        }
+    }
+    return lines.join('\n');
+}
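
The string-aware brace scan in `parseCritique` is what lets it survive chatter around the JSON and braces inside string values. The sketch below isolates just that extraction step; `extractFirstJsonObject` is a hypothetical helper name, and the field normalization that `parseCritique` performs afterwards is not reproduced here.

```typescript
// Self-contained sketch of the string-aware brace scan used by parseCritique.
// `extractFirstJsonObject` is a hypothetical helper name for illustration.
function extractFirstJsonObject(raw: string): string | null {
    const start = raw.indexOf('{');
    if (start === -1) return null;
    let depth = 0;
    let inString = false;
    let escaped = false;
    for (let i = start; i < raw.length; i++) {
        const ch = raw[i];
        if (escaped) { escaped = false; continue; }    // the character after a backslash is skipped
        if (ch === '\\') { escaped = true; continue; }
        if (ch === '"') { inString = !inString; continue; }
        if (inString) continue;                         // braces inside strings do not count
        if (ch === '{') depth++;
        else if (ch === '}') {
            depth--;
            if (depth === 0) return raw.slice(start, i + 1);
        }
    }
    return null; // unbalanced: the reply was probably truncated
}

// A reply wrapped in chatter, with a "}" inside a string value.
const reply = [
    'Sure, here is my review:',
    '{"pass": false, "issues": [{"severity": "major", "description": "stray } in the text"}], "supplement": ""}',
    'Let me know if you need more.',
].join('\n');

const extracted = extractFirstJsonObject(reply);
const parsed = extracted === null ? null : JSON.parse(extracted);
console.log(parsed?.issues.length); // 1
```

A naive "first `{` to first `}`" slice would cut this reply inside the nested issue object, which is why the scan tracks both depth and string state.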
@@ -0,0 +1,45 @@
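+/**
+ * Epistemic Guard: system prompt block that enforces a three-way
+ * unknown / estimated / certain classification.
+ *
+ * Self-Evolving OS master plan Phase 2 / Track 1-3 (Anti-Hallucination Layer).
+ * Covers trust condition T1, "say you don't know when you don't know".
+ *
+ * Division of labor with CoVe (coveBlock):
+ *   - CoVe: *when retrieved sources exist*, verifies the claim-to-source mapping (grounding check)
+ *   - Epistemic Guard: *regardless* of whether sources exist, forces every claim to carry its
+ *     epistemic grade; on turns with *no* retrieval evidence in particular, it forbids flat
+ *     assertions and tells the model to prefer asking back
+ *
+ * The main target of this block is the case CoVe cannot cover: retrieval returns 0 results
+ * and the model makes up something plausible. The weaker the retrieval evidence, the
+ * stronger the instruction.
+ */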
+
+export interface EpistemicGuardSignals {
+    /** 선택된 검색 청크 수 (brain-trace 제외). */
+    chunkCount: number;
+    /** 업무 유형 감지됨 (Requirement Graph) — 업무 산출물은 더 엄격하게. */
+    taskDetected: boolean;
+}
+
+export function buildEpistemicGuardBlock(signals: EpistemicGuardSignals): string {
+    const lines: string[] = [];
+    lines.push('[EPISTEMIC GUARD]');
+    lines.push('모든 사실성 주장은 다음 3등급 중 하나로 인식하고, 등급이 낮으면 표시할 것:');
+    lines.push('');
+    lines.push('- **확실** — 검색 출처 또는 명백한 사실이 직접 지지. 표시 불필요.');
+    lines.push('- **추정** — 근거가 간접적이거나 일반화. 문장에 "~로 추정", "일반적으로" 명시.');
+    lines.push('- **모름 / 확인 필요** — 근거 없음. *지어내지 말고* "(확인 필요)" 표시 또는 솔직히 모른다고 말할 것.');
+    lines.push('');
+    lines.push('금지: 근거 없는 수치·날짜·고유명사·인용을 사실처럼 제시하는 것. 모름을 인정하는 답변이 그럴듯한 오답보다 항상 낫다.');
+
+    if (signals.chunkCount === 0) {
+        lines.push('');
+        lines.push('⚠️ 이번 턴은 *검색 근거가 없음* — 모델 일반 지식만으로 답하는 상태다.');
+        lines.push('- 구체적 수치·최신 정보·사용자 고유 정보(일정, 과거 회의 등)는 단정하지 말 것.');
+        if (signals.taskDetected) {
+            lines.push('- 업무 산출물 요청인데 근거가 없으므로, 필요한 원자료(회의 메모, 조사 범위 등)를 먼저 *질문*하는 것을 우선 고려할 것.');
+        }
+    }
+
+    lines.push('[/EPISTEMIC GUARD]');
+    return lines.join('\n');
+}
@@ -0,0 +1,74 @@
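+/**
+ * Escalation Engine: decides whether human intervention is needed.
+ *
+ * Self-Evolving OS master plan Phase 2 / Track 1-2. This is the action half of
+ * trust condition T4: the Confidence Engine measures "how certain is this",
+ * and this module decides "so, should a human be asked".
+ *
+ * Conditions from design doc chapter 13: low confidence / high impact /
+ * insufficient information / rule conflict → human review.
+ *
+ * v1 uses deterministic rules (no LLM call). The output is a footer below the
+ * answer that explicitly asks the user to review the flagged part. Blocking
+ * external actions (approval gate) is split out to Track 6-3 / approvalQueue.
+ */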
+
+import type { ConfidenceResult } from './confidenceEngine';
+import type { CoverageResult } from './requirementGraph';
+
+export interface EscalationDecision {
+    escalate: boolean;
+    /** 검토 요청 이유 (사용자에게 그대로 표시). */
+    reasons: string[];
+}
+
+export interface EscalationInputs {
+    confidence: ConfidenceResult;
+    /** Requirement 커버리지 결과 (업무 미감지 시 ran=false). */
+    coverage: CoverageResult;
+    /** conflictSeverity != NONE 청크 수. */
+    conflictCount: number;
+}
+
+/** 산출물 신뢰가 특히 중요한 업무 — '보통' 미만이면 검토 요청. */
+const HIGH_IMPACT_TASKS = new Set(['meeting-minutes', 'market-research', 'schedule']);
+
+/** 근거 표시가 필수인 조사 업무에서 '출처' 누락은 단독으로도 에스컬레이션 사유. */
+const SOURCE_REQUIRED_TASKS = new Set(['market-research', 'work-research']);
+
+export function decideEscalation(inputs: EscalationInputs): EscalationDecision {
+    const { confidence, coverage, conflictCount } = inputs;
+    const reasons: string[] = [];
+
+    // 규칙 1 — 확신도 매우 낮음(<50): 업무 유형 무관 무조건 검토.
+    if (confidence.band === 'very-low') {
+        reasons.push(`확신도 매우 낮음 (${confidence.score}/100) — 추가 조사 또는 정보 제공 필요`);
+    }
+
+    // 규칙 2 — 고영향 업무 + 확신도 '보통' 미만(<70).
+    if (
+        confidence.score < 70 &&
+        confidence.band !== 'very-low' && // 규칙 1 과 중복 방지
+        coverage.ran !== false && coverage.taskId && HIGH_IMPACT_TASKS.has(coverage.taskId)
+    ) {
+        reasons.push(`${coverage.taskLabel} 업무인데 확신도 ${confidence.score}/100 — 사용 전 검토 권장`);
+    }
+
+    // 규칙 3 — 조사 업무에서 '출처' 요소 누락: 환각 수치 위험.
+    if (coverage.ran && coverage.taskId && SOURCE_REQUIRED_TASKS.has(coverage.taskId) && coverage.missing.includes('출처')) {
+        reasons.push('조사 결과에 출처 표기가 없음 — 핵심 수치·주장 검증 필요');
+    }
+
+    // 규칙 4 — 출처 간 충돌 + 확신도 90 미만: 어느 쪽을 믿을지 사용자 결정.
+    if (conflictCount > 0 && confidence.score < 90) {
+        reasons.push(`출처 간 충돌 ${conflictCount}건 — 어느 정보를 기준으로 할지 확인 필요`);
+    }
+
+    return { escalate: reasons.length > 0, reasons };
+}
+
+/** 에스컬레이션 footer — 검토 요청 사유 목록. 미해당 시 빈 문자열. */
+export function formatEscalationFooter(decision: EscalationDecision): string {
+    if (!decision.escalate) return '';
+    const lines = decision.reasons.map((r) => `> - ${r}`).join('\n');
+    return `\n\n> 🙋 **검토 요청** — 아래 사유로 사람 확인이 필요합니다:\n${lines}`;
+}
@@ -0,0 +1,73 @@
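+/**
+ * Gap Detector: Gap = Requirement − Knowledge (design doc 7.4).
+ *
+ * Self-Evolving OS master plan Phase 3 / Track 3-2. On every task turn it compares
+ * "what is needed" (the required elements of the Requirement Graph) with "what we have"
+ * (retrieval grounding + deliverable coverage) and derives the missing knowledge,
+ * its impact, and its urgency.
+ *
+ * v1 signals (deterministic, no LLM):
+ *   - Element gap: required elements reported missing by the coverage check
+ *   - Grounding gap: the task was performed with no retrieval evidence (chunkCount=0)
+ *     or weak evidence (low topScore), i.e. the model fell back on general
+ *     knowledge because the knowledge was not there
+ *   - Impact: task-type weighting (a miss on a high-impact task is more serious)
+ *   - Urgency: repetition of the same gap (combined with Reflection's recurrentMisses)
+ *
+ * The output is recorded in Reflection and becomes the input of the Need Engine.
+ */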
+
+import type { CoverageResult } from './requirementGraph';
+import type { RetrievalConfidenceSignals } from './confidenceEngine';
+
+export type GapSeverity = 'none' | 'low' | 'medium' | 'high';
+
+export interface GapReport {
+    /** 산출물에서 누락된 필수 요소 (요소 갭). */
+    missingElements: string[];
+    /** 검색 근거 없이/약하게 수행 — 지식 갭 신호. */
+    weakGrounding: boolean;
+    severity: GapSeverity;
+    /** 사람이 읽는 갭 설명 (Need Engine·리포트용). */
+    summary: string;
+}
+
+/** High-impact tasks: when grounding is also weak, gap severity is raised one level. Same task set as escalationEngine. */
+const HIGH_IMPACT_TASKS = new Set(['meeting-minutes', 'market-research', 'schedule']);
+
+const SEVERITY_ORDER: GapSeverity[] = ['none', 'low', 'medium', 'high'];
+
+function bump(s: GapSeverity): GapSeverity {
+    const i = SEVERITY_ORDER.indexOf(s);
+    return SEVERITY_ORDER[Math.min(i + 1, SEVERITY_ORDER.length - 1)];
+}
+
+export function detectGaps(inputs: {
+    coverage: CoverageResult;
+    signals: RetrievalConfidenceSignals;
+    taskId: string | null;
+}): GapReport {
+    const { coverage, signals, taskId } = inputs;
+    const missingElements = coverage.ran ? coverage.missing.slice() : [];
+    const weakGrounding = signals.chunkCount === 0 || (signals.chunkCount > 0 && signals.topScore < 0.3);
+
+    let severity: GapSeverity = 'none';
+    if (missingElements.length >= 3) severity = 'high';
+    else if (missingElements.length > 0) severity = 'medium';
+    else if (weakGrounding) severity = 'low';
+    if (severity !== 'none' && taskId && HIGH_IMPACT_TASKS.has(taskId) && weakGrounding) {
+        severity = bump(severity);
+    }
+
+    const parts: string[] = [];
+    if (missingElements.length > 0) parts.push(`필수 요소 ${missingElements.length}개 누락(${missingElements.join(', ')})`);
+    if (weakGrounding) {
+        parts.push(signals.chunkCount === 0
+            ? '검색 근거 0건 — 모델 일반 지식으로 수행'
+            : `검색 근거 약함 (top score ${signals.topScore.toFixed(2)})`);
+    }
+    return {
+        missingElements,
+        weakGrounding,
+        severity,
+        summary: parts.length > 0 ? parts.join(' · ') : '갭 없음',
+    };
+}
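
The severity rule in `detectGaps` is easiest to read as a small decision function. The sketch below reduces it to three plain inputs; `gapSeverity`, `Severity`, and `ORDER` are hypothetical names for illustration, while the thresholds and the bump condition are taken from the code above.

```typescript
// Sketch of the severity rule in detectGaps, reduced to plain inputs.
// `gapSeverity` is a hypothetical helper; thresholds mirror the code above.
type Severity = 'none' | 'low' | 'medium' | 'high';
const ORDER: Severity[] = ['none', 'low', 'medium', 'high'];

function gapSeverity(missingCount: number, weakGrounding: boolean, highImpactTask: boolean): Severity {
    let severity: Severity = 'none';
    if (missingCount >= 3) severity = 'high';
    else if (missingCount > 0) severity = 'medium';
    else if (weakGrounding) severity = 'low';
    // One-level bump only when a high-impact task also ran on weak grounding.
    if (severity !== 'none' && highImpactTask && weakGrounding) {
        severity = ORDER[Math.min(ORDER.indexOf(severity) + 1, ORDER.length - 1)];
    }
    return severity;
}

console.log(gapSeverity(0, true, false)); // low
console.log(gapSeverity(0, true, true));  // medium
console.log(gapSeverity(1, false, true)); // medium
console.log(gapSeverity(3, true, true));  // high
```

Note the third case: a high-impact task with a missing element but solid grounding is not bumped, because the bump requires weak grounding as well.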
@@ -0,0 +1,110 @@
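+/**
+ * Knowledge Decay: audit of knowledge aging (design doc chapter 10, "the ability to forget like a human").
+ *
+ * Self-Evolving OS master plan Phase 4 / Track 4-3. Computes a freshness factor
+ * for each piece of knowledge from a per-domain half-life and surfaces aged
+ * knowledge in a report.
+ *
+ * v1 is a *non-invasive audit*: it does not touch retrieval ranking, so that the
+ * retrieval path tuned with the RAG eval harness is not changed without
+ * measurement (to feed decay into ranking, first prove the effect with a golden-set
+ * A/B, then ship it as a separate increment). It complements the Provenance
+ * display in citationTrace (180+ day warning).
+ *
+ * Domain classification is v1 keyword matching on path/file name: the design doc
+ * examples (AI 30 days / SEO 90 days / trends 180 days) plus rules for the
+ * user's work domains.
+ */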
+
+export interface DecayRule {
+    label: string;
+    /** 경로(상대) 또는 파일명에 매치되는 패턴. */
+    match: RegExp;
+    halfLifeDays: number;
+}
+
+/** 위에서 아래로 첫 매치 적용. 마지막은 catch-all. */
+export const DEFAULT_DECAY_RULES: DecayRule[] = [
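+    // 'ai' and 'rag' are fenced off from adjacent letters so that paths such as "main.md" or "storage/" do not fall into the 30-day bucket.
+    { label: 'AI/기술', match: /(?<![a-z])(?:ai|rag)(?![a-z])|llm|mcp|agent|gpt|claude|gemma|모델|에이전트/iu, halfLifeDays: 30 },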
+    { label: 'SEO/마케팅', match: /seo|마케팅|상위노출|키워드/iu, halfLifeDays: 90 },
+    { label: '시장/트렌드', match: /시장|트렌드|동향|경쟁사|market|trend/iu, halfLifeDays: 180 },
+    { label: '회의/프로젝트', match: /회의|meeting|프로젝트|일정/iu, halfLifeDays: 180 },
+    { label: '일반', match: /.*/, halfLifeDays: 365 },
+];
+
+export type DecayStatus = 'active' | 'aging' | 'stale';
+
+export interface DecayItem {
+    relPath: string;
+    category: string;
+    ageDays: number;
+    halfLifeDays: number;
+    /** 0.5^(age/halfLife) — 1.0 신선, 0.5 반감, ↓. */
+    factor: number;
+    status: DecayStatus;
+}
+
+export function classifyDecayRule(relPath: string, rules: DecayRule[] = DEFAULT_DECAY_RULES): DecayRule {
+    for (const rule of rules) if (rule.match.test(relPath)) return rule;
+    return rules[rules.length - 1];
+}
+
+export function decayFactor(lastUpdatedMs: number, halfLifeDays: number, nowMs: number): number {
+    const ageDays = Math.max(0, (nowMs - lastUpdatedMs) / 86400000);
+    return Math.pow(0.5, ageDays / halfLifeDays);
+}
+
+/**
+ * 파일 목록 → 노후 감사. factor ≥0.5 active(반감기 내), ≥0.25 aging(반감 1~2회),
+ * 그 밑 stale(반감 2회+ — 우선 검토 대상).
+ */
+export function auditKnowledgeDecay(
+    files: Array<{ relPath: string; lastUpdated: number }>,
+    options: { rules?: DecayRule[]; nowMs?: number } = {},
+): DecayItem[] {
+    const rules = options.rules ?? DEFAULT_DECAY_RULES;
+    const now = options.nowMs ?? Date.now();
+    const items: DecayItem[] = files.map((f) => {
+        const rule = classifyDecayRule(f.relPath, rules);
+        const factor = decayFactor(f.lastUpdated, rule.halfLifeDays, now);
+        const ageDays = Math.max(0, (now - f.lastUpdated) / 86400000);
+        const status: DecayStatus = factor >= 0.5 ? 'active' : factor >= 0.25 ? 'aging' : 'stale';
+        return {
+            relPath: f.relPath,
+            category: rule.label,
+            ageDays: Math.round(ageDays),
+            halfLifeDays: rule.halfLifeDays,
+            factor,
+            status,
+        };
+    });
+    // stale 우선(낮은 factor 순) — 보고서 상단이 가장 급한 검토 대상.
+    return items.sort((a, b) => a.factor - b.factor);
+}
+
+export function formatDecayReport(items: DecayItem[], meta: { brainName: string; dateStr: string }): string {
+    const lines: string[] = [];
+    lines.push('# 지식 노후 점검 (Knowledge Decay)');
+    lines.push('');
+    lines.push(`- 두뇌: ${meta.brainName} · 일시: ${meta.dateStr}`);
+    lines.push('- 분야별 반감기: AI/기술 30일 · SEO 90일 · 시장/트렌드·회의 180일 · 일반 365일');
+    lines.push('');
+    const counts = { active: 0, aging: 0, stale: 0 } as Record<DecayStatus, number>;
+    for (const it of items) counts[it.status]++;
+    lines.push(`## 요약 — 신선 ${counts.active} · 노화 중 ${counts.aging} · **노후 ${counts.stale}**`);
+    lines.push('');
+    const stale = items.filter((i) => i.status === 'stale').slice(0, 50);
+    if (stale.length === 0) {
+        lines.push('노후(stale) 지식 없음.');
+    } else {
+        lines.push('## 노후 지식 — 갱신/보관/폐기 검토 대상 (factor 낮은 순, 최대 50)');
+        lines.push('');
+        lines.push('| 파일 | 분야 | 경과일 | 반감기 | factor |');
+        lines.push('|---|---|---|---|---|');
+        for (const it of stale) {
+            lines.push(`| ${it.relPath} | ${it.category} | ${it.ageDays} | ${it.halfLifeDays} | ${it.factor.toFixed(2)} |`);
+        }
+        lines.push('');
+        lines.push('> 처리 권고: 여전히 유효하면 파일을 한 번 갱신(저장)해 신선도를 리셋, 낡았으면 보관 폴더로 이동 또는 삭제.');
+        lines.push('> v1 은 보고만 한다 — 자동 이동/삭제 없음 (Human Override 원칙).');
+    }
+    lines.push('');
+    return lines.join('\n');
+}
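
To make the half-life thresholds concrete, the sketch below repeats `decayFactor` from the module so that it runs on its own and evaluates it at one and two half-lives. `statusOf`, `daysAgo`, and the fixed `now` value are hypothetical additions for the example; the status thresholds are the ones used in `auditKnowledgeDecay`.

```typescript
// decayFactor as defined above, repeated so the example is self-contained.
const DAY_MS = 86400000;

function decayFactor(lastUpdatedMs: number, halfLifeDays: number, nowMs: number): number {
    const ageDays = Math.max(0, (nowMs - lastUpdatedMs) / DAY_MS);
    return Math.pow(0.5, ageDays / halfLifeDays);
}

// Same thresholds as auditKnowledgeDecay; `statusOf` is a hypothetical helper name.
function statusOf(factor: number): 'active' | 'aging' | 'stale' {
    return factor >= 0.5 ? 'active' : factor >= 0.25 ? 'aging' : 'stale';
}

const now = Date.UTC(2026, 5, 11); // arbitrary fixed "now" for the example
const daysAgo = (d: number): number => now - d * DAY_MS;

// A note under the 30-day AI/tech half-life:
const f30 = decayFactor(daysAgo(30), 30, now); // exactly one half-life
const f60 = decayFactor(daysAgo(60), 30, now); // exactly two half-lives
const f61 = decayFactor(daysAgo(61), 30, now); // just past two half-lives

console.log(f30, statusOf(f30)); // 0.5 active
console.log(f60, statusOf(f60)); // 0.25 aging
console.log(statusOf(f61));      // stale
```

Because both comparisons are inclusive, a file sitting exactly on a half-life boundary stays in the fresher bucket; it drops a level one day later.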
@@ -0,0 +1,168 @@
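+/**
+ * Knowledge Validation + Belief Revision: validation before knowledge is stored (design doc chapter 10).
+ *
+ * Self-Evolving OS master plan Phase 4 / Track 4-1 + 4-2. Compares a new knowledge
+ * candidate with existing knowledge, returns an accept/review/reject verdict, and
+ * on conflict produces an Add/Update/Retire recommendation.
+ *
+ * Constitution compliance: this module *only judges and recommends*. Actual storage
+ * and retirement go through the approval flow (Learning Queue / user)
+ * (Permission Based Learning).
+ *
+ * v1 is deterministic (no LLM):
+ *   - Duplicate: token Jaccard similarity ≥ duplicateThreshold (default 0.85) → reject
+ *   - Conflict/related: conflictThreshold (default 0.25) ≤ similarity < duplicateThreshold
+ *     → review + Belief Revision recommendation
+ *     (candidate is newer → update / existing is newer or unknown → add alongside, human decides)
+ *   - No source → never auto-accepted (review at most), per the Provenance principle
+ *   - Old collection date → review (keeps stale knowledge from flowing in)
+ *
+ * Used by: the storage gate for knowledge collected by the Research Agent (P6). For now it is
+ * prepared as a library + tests and connects as-is once the Research Agent is wired up.
+ */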
+
+export interface KnowledgeCandidate {
+    title: string;
+    content: string;
+    /** 출처 (URL/문서명). 없으면 자동 수용 불가. */
+    source?: string;
+    /** 수집 시각 ISO. */
+    collectedAt?: string;
+}
+
+export interface ExistingKnowledgeRef {
+    title: string;
+    content: string;
+    /** epoch ms. */
+    lastUpdated?: number;
+    filePath?: string;
+}
+
+export type ValidationVerdict = 'accept' | 'review' | 'reject';
+export type BeliefRevisionAction = 'add' | 'update' | 'retire-old';
+
+export interface ValidationResult {
+    verdict: ValidationVerdict;
+    checks: {
+        hasSource: boolean;
+        freshness: 'fresh' | 'stale' | 'unknown';
+        /** 중복 판정된 기존 지식 title. */
+        duplicateOf: string | null;
+        /** 충돌/관련 판정된 기존 지식 title. */
+        conflictsWith: string | null;
+        similarity: number;
+    };
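+    /**
+     * Recommendation shown to the reviewer on conflict. Stays 'add' when no conflict was
+     * found, including the duplicate case (verdict 'reject'), so read it together with
+     * `verdict`. v1 only ever emits 'add' or 'update'.
+     */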
+    beliefRevision: BeliefRevisionAction;
+    reasons: string[];
+}
+
+export interface ValidationOptions {
+    /** 이 일수보다 오래 전 수집된 후보는 stale. 기본 365. */
+    staleAfterDays: number;
+    /** 중복 임계 Jaccard. 기본 0.85. */
+    duplicateThreshold: number;
+    /**
+     * 관련/충돌 임계 Jaccard. 기본 0.25 — 한국어는 조사 변형(계산은/계산이) 때문에
+     * 같은 주제 문서도 토큰 Jaccard 가 낮게 나온다. 영어 위주 지식이면 0.35 권장.
+     */
+    conflictThreshold: number;
+    /** 테스트 주입용 현재 시각. */
+    nowMs?: number;
+}
+
+export const DEFAULT_VALIDATION_OPTIONS: ValidationOptions = {
+    staleAfterDays: 365,
+    duplicateThreshold: 0.85,
+    conflictThreshold: 0.25,
+};
+
+/** 공백/문장부호 기준 토큰화 — 한글·영문 공용 v1. */
+function tokenize(text: string): Set<string> {
+    return new Set(
+        (text || '')
+            .toLowerCase()
+            .replace(/[^\w가-힣\s]/gu, ' ')
+            .split(/\s+/)
+            .filter((t) => t.length >= 2),
+    );
+}
+
+export function jaccardSimilarity(a: string, b: string): number {
+    const ta = tokenize(a);
+    const tb = tokenize(b);
+    if (ta.size === 0 || tb.size === 0) return 0;
+    let inter = 0;
+    for (const t of ta) if (tb.has(t)) inter++;
+    return inter / (ta.size + tb.size - inter);
+}
+
+export function validateKnowledgeCandidate(
+    candidate: KnowledgeCandidate,
+    existing: ExistingKnowledgeRef[],
+    options: Partial<ValidationOptions> = {},
+): ValidationResult {
+    const opts: ValidationOptions = { ...DEFAULT_VALIDATION_OPTIONS, ...options };
+    const now = opts.nowMs ?? Date.now();
+    const reasons: string[] = [];
+
+    // ─── 출처 (Provenance) ───
+    const hasSource = !!(candidate.source && candidate.source.trim());
+    if (!hasSource) reasons.push('출처 없음 — 자동 수용 불가');
+
+    // ─── 최신성 ───
+    let freshness: ValidationResult['checks']['freshness'] = 'unknown';
+    if (candidate.collectedAt) {
+        const t = Date.parse(candidate.collectedAt);
+        if (!isNaN(t)) {
+            const ageDays = (now - t) / 86400000;
+            freshness = ageDays > opts.staleAfterDays ? 'stale' : 'fresh';
+            if (freshness === 'stale') reasons.push(`수집일이 ${opts.staleAfterDays}일 이상 경과`);
+        }
+    }
+
+    // ─── 중복·충돌 — 가장 유사한 기존 지식 1건 기준 ───
+    let bestSim = 0;
+    let bestRef: ExistingKnowledgeRef | null = null;
+    for (const ref of existing) {
+        const sim = jaccardSimilarity(candidate.content, ref.content);
+        if (sim > bestSim) { bestSim = sim; bestRef = ref; }
+    }
+
+    let duplicateOf: string | null = null;
+    let conflictsWith: string | null = null;
+    let beliefRevision: BeliefRevisionAction = 'add';
+
+    if (bestRef && bestSim >= opts.duplicateThreshold) {
+        duplicateOf = bestRef.title;
+        reasons.push(`기존 지식과 중복 (유사도 ${bestSim.toFixed(2)}: ${bestRef.title})`);
+    } else if (bestRef && bestSim >= opts.conflictThreshold) {
+        conflictsWith = bestRef.title;
+        // Belief Revision (Track 4-2) — 어느 쪽을 믿을 것인가.
+        const candTime = candidate.collectedAt ? Date.parse(candidate.collectedAt) : NaN;
+        const existTime = bestRef.lastUpdated ?? NaN;
+        if (!isNaN(candTime) && !isNaN(existTime) && candTime > existTime) {
+            beliefRevision = 'update';
+            reasons.push(`기존 지식과 관련/충돌 — 후보가 더 최신 → 갱신(update) 권고, 기존은 폐기(retire) 검토 (${bestRef.title})`);
+        } else {
+            beliefRevision = 'add';
+            reasons.push(`기존 지식과 관련/충돌 — 신선도 우위 불명 → 병존(add) 후 사람 판단 (${bestRef.title})`);
+        }
+    }
+
+    // ─── 종합 판정 ───
+    let verdict: ValidationVerdict;
+    if (duplicateOf) {
+        verdict = 'reject';
+    } else if (!hasSource || freshness === 'stale' || conflictsWith) {
+        verdict = 'review';
+    } else {
+        verdict = 'accept';
+        reasons.push('출처 있음 · 중복/충돌 없음');
+    }
+
+    return {
+        verdict,
+        checks: { hasSource, freshness, duplicateOf, conflictsWith, similarity: bestSim },
+        beliefRevision,
+        reasons,
+    };
+}
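
The comment on `conflictThreshold` says Korean particle variation pushes token-level Jaccard down. The sketch below shows the effect with the module's own `tokenize` and `jaccardSimilarity`, repeated here so the example is self-contained; the sample sentences are made up for illustration.

```typescript
// tokenize / jaccardSimilarity as defined above, repeated so the example runs on its own.
function tokenize(text: string): Set<string> {
    return new Set(
        (text || '')
            .toLowerCase()
            .replace(/[^\w가-힣\s]/gu, ' ')
            .split(/\s+/)
            .filter((t) => t.length >= 2),
    );
}

function jaccardSimilarity(a: string, b: string): number {
    const ta = tokenize(a);
    const tb = tokenize(b);
    if (ta.size === 0 || tb.size === 0) return 0;
    let inter = 0;
    for (const t of ta) if (tb.has(t)) inter++;
    return inter / (ta.size + tb.size - inter);
}

// Two sentences that differ only in one particle (계산은 vs 계산이).
// Each has 3 tokens, 2 are shared, so the union has 4.
const simKo = jaccardSimilarity('확신도 계산은 결정론적이다', '확신도 계산이 결정론적이다');
console.log(simKo); // 0.5

// Identical text scores 1 and would be rejected as a duplicate under the default 0.85.
const simSame = jaccardSimilarity('Need Score weights sum to one', 'Need Score weights sum to one');
console.log(simSame); // 1
```

A single particle change already drops near-identical sentences to 0.5, which is the motivation the source gives for the lower 0.25 default on Korean-heavy knowledge.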
@@ -0,0 +1,118 @@
+/**
+ * Learning Queue: approval-based learning backlog (design doc chapter 9).
+ *
+ * Self-Evolving OS master plan Phase 3 / Track 3-5. Persists the Need Engine's
+ * priorities as a queue that a human can approve.
+ *
+ * Constitution compliance (Track 8-2, Permission Based Learning):
+ *   - The system only ever adds items as *proposed*; approval is human-only.
+ *   - When the user changes status to approved in the file, the item becomes eligible
+ *     for learning execution (execution is the Research Agent, a Phase 6 follow-up increment).
+ *   - mergeNeedsIntoQueue updates proposed items only and never touches a
+ *     human-set status (approved/in-progress/done/rejected).
+ *
+ * Storage: <brain>/.astra/growth/learning-queue.json (a single pretty-printed JSON file so
+ * a human can edit it directly; the same "the file is the UI" approach as the md-first ASTRA philosophy).
+ */
+
+import * as fs from 'fs';
+import * as path from 'path';
+import type { NeedItem } from './needEngine';
+
+export const LEARNING_QUEUE_REL_PATH = path.join('.astra', 'growth', 'learning-queue.json');
+
+export type QueueStatus = 'proposed' | 'approved' | 'in-progress' | 'done' | 'rejected';
+
+export interface QueueItem {
+    /** 안정 키 — 업무유형 기반 (v1). 같은 키는 한 항목으로 유지·갱신. */
+    id: string;
+    topic: string;
+    /** Need Score 0~100 — 갱신 시 최신값으로 교체 (proposed 한정). */
+    priority: number;
+    reason: string;
+    status: QueueStatus;
+    createdAt: string;
+    updatedAt: string;
+}
+
+const VALID_STATUSES: QueueStatus[] = ['proposed', 'approved', 'in-progress', 'done', 'rejected'];
+
+export function loadQueue(brainPath: string): QueueItem[] {
+    try {
+        const file = path.join(brainPath, LEARNING_QUEUE_REL_PATH);
+        if (!fs.existsSync(file)) return [];
+        const arr = JSON.parse(fs.readFileSync(file, 'utf8'));
+        if (!Array.isArray(arr)) return [];
+        return arr.filter((it: any) =>
+            it && typeof it.id === 'string' && VALID_STATUSES.includes(it.status),
+        ) as QueueItem[];
+    } catch {
+        return [];
+    }
+}
+
+export function saveQueue(brainPath: string, queue: QueueItem[]): boolean {
+    try {
+        const file = path.join(brainPath, LEARNING_QUEUE_REL_PATH);
+        fs.mkdirSync(path.dirname(file), { recursive: true });
+        // 우선순위 높은 순 정렬 저장 — 파일을 열면 위가 가장 급한 학습.
+        const sorted = queue.slice().sort((a, b) => b.priority - a.priority);
+        fs.writeFileSync(file, JSON.stringify(sorted, null, 2) + '\n', 'utf8');
+        return true;
+    } catch {
+        return false;
+    }
+}
+
+/**
+ * Need 결과를 큐에 병합.
+ *   - 새 주제 → proposed 로 추가
+ *   - 기존 proposed → priority/reason 최신화
+ *   - 사람이 정한 상태(approved 등) → 변경하지 않음 (Human Override)
+ *   - done/rejected 항목은 Need 가 다시 높아져도 재제안하지 않음 (v1 — 의도적 보수성;
+ *     재제안이 필요하면 사용자가 항목을 지우면 된다)
+ */
+export function mergeNeedsIntoQueue(queue: QueueItem[], needs: NeedItem[], nowIso: string): QueueItem[] {
+    const byId = new Map(queue.map((q) => [q.id, q] as const));
+    for (const need of needs) {
+        const id = `need-${need.taskId}`;
+        const existing = byId.get(id);
+        if (!existing) {
+            byId.set(id, {
+                id,
+                topic: `${need.taskLabel} 역량 보강${need.topMisses.length ? ` (자주 누락: ${need.topMisses.join(', ')})` : ''}`,
+                priority: need.score,
+                reason: need.reason,
+                status: 'proposed',
+                createdAt: nowIso,
+                updatedAt: nowIso,
+            });
+        } else if (existing.status === 'proposed') {
+            existing.priority = need.score;
+            existing.reason = need.reason;
+            existing.updatedAt = nowIso;
+        }
+        // approved/in-progress/done/rejected — 사람이 정한 상태, 불변.
+    }
+    return Array.from(byId.values());
+}
+
+export function formatQueueMarkdown(queue: QueueItem[]): string {
+    const lines: string[] = [];
+    lines.push('# Learning Queue');
+    lines.push('');
+    lines.push('상태 변경은 learning-queue.json 에서 직접: proposed → **approved** (학습 승인) / rejected.');
+    lines.push('approved 항목은 Research Agent(후속 증분)가 처리합니다. 시스템은 proposed 만 추가/갱신합니다.');
+    lines.push('');
+    if (queue.length === 0) {
+        lines.push('큐가 비어 있습니다.');
+        return lines.join('\n');
+    }
+    lines.push('| 우선순위 | 주제 | 상태 | 근거 |');
+    lines.push('|---|---|---|---|');
+    for (const q of queue.slice().sort((a, b) => b.priority - a.priority)) {
+        lines.push(`| ${q.priority} | ${q.topic} | ${q.status} | ${q.reason} |`);
+    }
+    lines.push('');
+    return lines.join('\n');
+}
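
The merge rule in `mergeNeedsIntoQueue` can be illustrated with a reduced model. In the sketch below, `Item`, `Need`, and `merge` are simplified, hypothetical stand-ins for `QueueItem`, `NeedItem`, and the real function (topic, reason, and timestamps are omitted). One deliberate difference: the sketch copies queue items before updating them, whereas the original updates existing `proposed` items in place.

```typescript
// Reduced sketch of the merge rule in mergeNeedsIntoQueue.
// `Item`, `Need`, and `merge` are hypothetical stand-ins, not the module's types.
type Status = 'proposed' | 'approved' | 'in-progress' | 'done' | 'rejected';
interface Item { id: string; priority: number; status: Status; }
interface Need { taskId: string; score: number; }

function merge(queue: Item[], needs: Need[]): Item[] {
    const byId = new Map<string, Item>();
    for (const q of queue) byId.set(q.id, { ...q }); // copy, so the input is not mutated
    for (const need of needs) {
        const id = `need-${need.taskId}`;
        const existing = byId.get(id);
        if (!existing) {
            byId.set(id, { id, priority: need.score, status: 'proposed' });
        } else if (existing.status === 'proposed') {
            existing.priority = need.score; // only system-owned items are refreshed
        }
        // Any human-set status is left exactly as it was.
    }
    return Array.from(byId.values());
}

const merged = merge(
    [
        { id: 'need-meeting-minutes', priority: 40, status: 'proposed' },
        { id: 'need-market-research', priority: 55, status: 'approved' },
    ],
    [
        { taskId: 'meeting-minutes', score: 62 },
        { taskId: 'market-research', score: 80 },
        { taskId: 'schedule', score: 30 },
    ],
);

console.log(merged.map((q) => `${q.id}:${q.priority}:${q.status}`).join(' | '));
// need-meeting-minutes:62:proposed | need-market-research:55:approved | need-schedule:30:proposed
```

The approved item keeps priority 55 even though its Need Score rose to 80, which is the Human Override behavior described in the doc comment.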
@@ -0,0 +1,52 @@
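+/**
+ * Simple non-streaming LLM call with dual endpoints: Ollama / LM Studio (OpenAI-compatible).
+ *
+ * Extracts the call pattern of postHocSelfCheck into a reusable helper. It is for places
+ * outside AgentExecutor that need a single LLM call, such as the eval harness and the
+ * Critic (extension commands and the like).
+ * Inside an agent turn, use callNonStreaming from agent.ts instead (it includes cloud routing).
+ */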
+
+export interface SimpleChatOptions {
+    baseUrl: string;
+    model: string;
+    temperature?: number;
+    maxTokens?: number;
+    timeoutMs?: number;
+}
+
+export async function simpleChatCompletion(
+    system: string,
+    user: string,
+    options: SimpleChatOptions,
+): Promise<string> {
+    const isOllama = options.baseUrl.includes(':11434') || options.baseUrl.includes('ollama');
+    const endpoint = isOllama ? `${options.baseUrl}/api/chat` : `${options.baseUrl}/v1/chat/completions`;
+    const controller = new AbortController();
+    const timer = setTimeout(() => controller.abort(), options.timeoutMs ?? 120000);
+    try {
+        const messages = [
+            { role: 'system', content: system },
+            { role: 'user', content: user },
+        ];
+        const body = isOllama
+            ? { model: options.model, stream: false, messages, options: { temperature: options.temperature ?? 0.2, num_predict: options.maxTokens ?? 1200 } }
+            : { model: options.model, stream: false, temperature: options.temperature ?? 0.2, max_tokens: options.maxTokens ?? 1200, messages };
+        const res = await fetch(endpoint, {
+            method: 'POST',
+            headers: { 'Content-Type': 'application/json' },
+            body: JSON.stringify(body),
+            signal: controller.signal,
+        });
+        if (!res.ok) throw new Error(`HTTP ${res.status}`);
+        const data: any = await res.json();
+        return String(
+            data?.message?.content ??
+            data?.choices?.[0]?.message?.content ??
+            data?.choices?.[0]?.text ??
+            data?.response ??
+            '',
+        );
+    } finally {
+        clearTimeout(timer);
+    }
+}
@@ -0,0 +1,220 @@
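+/**
+ * Need Engine: computes learning need (design doc 7.6) + Knowledge Inventory v1 (7.3).
+ *
+ * Self-Evolving OS master plan Phase 3 / Track 3-3 + 3-1. Aggregates Reflection records
+ * into a score for "what should be learned first"; this is the brain of the growth loop.
+ *
+ * Need Score (design doc formula, 0~100):
+ *   information lack × 30% + failure rate × 25% + task frequency × 20% + confidence lack × 15% + user feedback × 10%
+ *
+ * v1 signal mapping (all derived deterministically from Reflection):
+ *   - Information lack: weakGrounding ratio (share of turns performed without retrieval evidence)
+ *   - Failure rate: share of turns with at least one missing required element
+ *   - Task frequency: turns of this task / all task turns
+ *   - Confidence lack: (100 − average confidence) / 100
+ *   - User feedback: not collected in v1 → 0 (the field is kept and gets wired up in a later increment)
+ *
+ * The output is the input of the Learning Queue. Learning runs only after approval (Permission Based Learning).
+ */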
+
+import type { ReflectionRecord } from './reflectionStore';
+
+export interface NeedItem {
+    /** Task type ID (the learning-topic unit in v1; later: finer element/topic granularity). */
+    taskId: string;
+    taskLabel: string;
+    /** 0~100. */
+    score: number;
+    /** Raw 0~1 signal behind each weight (human-readable rationale for the score). */
+    breakdown: {
+        infoLack: number;    // 0~1
+        failRate: number;    // 0~1
+        frequency: number;   // 0~1
+        confidenceLack: number; // 0~1
+        feedback: number;    // 0~1 (v1 = 0)
+    };
+    /** Number of records aggregated. */
+    sampleCount: number;
+    /** Top 3 most frequently missing elements, used to narrow down the learning topic. */
+    topMisses: string[];
+    reason: string;
+}
+
+export const NEED_WEIGHTS = {
+    infoLack: 0.30,
+    failRate: 0.25,
+    frequency: 0.20,
+    confidenceLack: 0.15,
+    feedback: 0.10,
+} as const;
+
+export function computeNeeds(records: ReflectionRecord[]): NeedItem[] {
+    const taskRecords = records.filter((r) => r.taskId);
+    if (taskRecords.length === 0) return [];
+
+    const byTask = new Map<string, ReflectionRecord[]>();
+    for (const r of taskRecords) {
+        const arr = byTask.get(r.taskId!) || [];
+        arr.push(r);
+        byTask.set(r.taskId!, arr);
+    }
+
+    const needs: NeedItem[] = [];
+    for (const [taskId, rs] of byTask) {
+        const infoLack = rs.filter((r) => r.weakGrounding === true).length / rs.length;
+        const failRate = rs.filter((r) => (r.missing || []).length > 0).length / rs.length;
+        const frequency = rs.length / taskRecords.length;
+        const avgConf = rs.reduce((s, r) => s + (r.confidenceScore || 0), 0) / rs.length;
+        const confidenceLack = Math.max(0, Math.min(1, (100 - avgConf) / 100));
+        const feedback = 0; // not collected in v1
+
+        const score = Math.round(100 * (
+            infoLack * NEED_WEIGHTS.infoLack +
+            failRate * NEED_WEIGHTS.failRate +
+            frequency * NEED_WEIGHTS.frequency +
+            confidenceLack * NEED_WEIGHTS.confidenceLack +
+            feedback * NEED_WEIGHTS.feedback
+        ));
+
+        // Top 3 most frequently missing elements.
+        const missCounts = new Map<string, number>();
+        for (const r of rs) for (const m of r.missing || []) missCounts.set(m, (missCounts.get(m) || 0) + 1);
+        const topMisses = Array.from(missCounts.entries()).sort((a, b) => b[1] - a[1]).slice(0, 3).map(([m]) => m);
+
+        const reasonParts: string[] = [];
+        if (infoLack > 0.3) reasonParts.push(`근거 없는 수행 ${(infoLack * 100).toFixed(0)}%`);
+        if (failRate > 0.3) reasonParts.push(`요소 누락률 ${(failRate * 100).toFixed(0)}%`);
+        if (confidenceLack > 0.3) reasonParts.push(`평균 확신도 ${avgConf.toFixed(0)}`);
+        if (topMisses.length > 0) reasonParts.push(`자주 누락: ${topMisses.join(', ')}`);
+
+        needs.push({
+            taskId,
+            taskLabel: rs[0].taskLabel || taskId,
+            score,
+            breakdown: { infoLack, failRate, frequency, confidenceLack, feedback },
+            sampleCount: rs.length,
+            topMisses,
+            reason: reasonParts.join(' · ') || '특이 신호 없음 (빈도 기반)',
+        });
+    }
+    return needs.sort((a, b) => b.score - a.score);
+}
+
+/**
+ * Knowledge Inventory v1 (Track 3-1): knowledge holdings per task type.
+ * Classifies into three grades, held/insufficient/none (design doc 7.3), from grounding signals.
+ */
+export interface InventoryItem {
+    taskId: string;
+    taskLabel: string;
+    /** sufficient = held, partial = insufficient, missing = none. */
+    status: 'sufficient' | 'partial' | 'missing';
+    avgChunkCount: number;
+    avgTopScore: number;
+    sampleCount: number;
+}
+
+export function knowledgeInventory(records: ReflectionRecord[]): InventoryItem[] {
+    const withRetrieval = records.filter((r) => r.taskId && r.retrieval);
+    const byTask = new Map<string, ReflectionRecord[]>();
+    for (const r of withRetrieval) {
+        const arr = byTask.get(r.taskId!) || [];
+        arr.push(r);
+        byTask.set(r.taskId!, arr);
+    }
+    const items: InventoryItem[] = [];
+    for (const [taskId, rs] of byTask) {
+        const avgChunkCount = rs.reduce((s, r) => s + (r.retrieval!.chunkCount || 0), 0) / rs.length;
+        const avgTopScore = rs.reduce((s, r) => s + (r.retrieval!.topScore || 0), 0) / rs.length;
+        const status: InventoryItem['status'] =
+            avgChunkCount >= 3 && avgTopScore >= 0.5 ? 'sufficient'
+            : avgChunkCount >= 1 ? 'partial'
+            : 'missing';
+        items.push({ taskId, taskLabel: rs[0].taskLabel || taskId, status, avgChunkCount, avgTopScore, sampleCount: rs.length });
+    }
+    return items.sort((a, b) => a.avgTopScore - b.avgTopScore);
+}
+
+/**
+ * Knowledge Debt (Track 4-4): counts the work that missing knowledge actually held back (design-doc
+ * example: "GA4: Blocked Tasks 17, Impact 9"). The v1 unit is the task type: turns performed with no or
+ * weak grounding = blocked, and the mean gap severity of those turns = impact (0~10).
+ */
+export interface DebtItem {
+    taskId: string;
+    taskLabel: string;
+    /** Number of task turns performed while knowledge was lacking. */
+    blockedTurns: number;
+    /** Mean gap severity, 0~10. */
+    impact: number;
+    /** blocked × impact, the sort key. */
+    debtScore: number;
+}
+
+const SEVERITY_SCORE: Record<string, number> = { none: 0, low: 3, medium: 6, high: 10 };
+
+export function computeKnowledgeDebt(records: ReflectionRecord[]): DebtItem[] {
+    const blocked = records.filter((r) => r.taskId && r.weakGrounding === true);
+    const byTask = new Map<string, ReflectionRecord[]>();
+    for (const r of blocked) {
+        const arr = byTask.get(r.taskId!) || [];
+        arr.push(r);
+        byTask.set(r.taskId!, arr);
+    }
+    const items: DebtItem[] = [];
+    for (const [taskId, rs] of byTask) {
+        const impact = rs.reduce((s, r) => s + (SEVERITY_SCORE[r.gapSeverity || 'low'] ?? 3), 0) / rs.length;
+        items.push({
+            taskId,
+            taskLabel: rs[0].taskLabel || taskId,
+            blockedTurns: rs.length,
+            impact: Math.round(impact * 10) / 10,
+            debtScore: Math.round(rs.length * impact),
+        });
+    }
+    return items.sort((a, b) => b.debtScore - a.debtScore);
+}
+
+export function formatNeedsMarkdown(needs: NeedItem[], inventory: InventoryItem[], debt: DebtItem[] = []): string {
+    const lines: string[] = [];
+    lines.push('# 학습 필요성 (Need Engine)');
+    lines.push('');
+    lines.push('공식: 정보부족 30% + 실패율 25% + 빈도 20% + 확신부족 15% + 피드백 10%');
+    lines.push('');
+    if (needs.length === 0) {
+        lines.push('Reflection 기록 없음 — 업무 turn 이 쌓이면 학습 우선순위가 산출됩니다.');
+    } else {
+        lines.push('| 우선순위 | 업무 | Need Score | 표본 | 근거 |');
+        lines.push('|---|---|---|---|---|');
+        needs.forEach((n, i) => {
+            lines.push(`| ${i + 1} | ${n.taskLabel} | **${n.score}** | ${n.sampleCount} | ${n.reason} |`);
+        });
+    }
+    lines.push('');
+    lines.push('## Knowledge Inventory (지식 보유 상태)');
+    lines.push('');
+    if (inventory.length === 0) {
+        lines.push('- 데이터 없음');
+    } else {
+        const statusLabel = { sufficient: '보유', partial: '부족', missing: '없음' } as const;
+        lines.push('| 업무 | 상태 | 평균 근거 수 | 평균 top score |');
+        lines.push('|---|---|---|---|');
+        for (const it of inventory) {
+            lines.push(`| ${it.taskLabel} | ${statusLabel[it.status]} | ${it.avgChunkCount.toFixed(1)} | ${it.avgTopScore.toFixed(2)} |`);
+        }
+    }
+    lines.push('');
+    lines.push('## Knowledge Debt (지식 부채)');
+    lines.push('');
+    if (debt.length === 0) {
+        lines.push('- 부채 없음 — 지식 부족 상태로 수행된 업무가 없습니다.');
+    } else {
+        lines.push('| 업무 | Blocked Turns | Impact (0~10) | Debt Score |');
+        lines.push('|---|---|---|---|');
+        for (const d of debt) {
+            lines.push(`| ${d.taskLabel} | ${d.blockedTurns} | ${d.impact} | **${d.debtScore}** |`);
+        }
+    }
+    lines.push('');
+    return lines.join('\n');
+}
@@ -0,0 +1,68 @@
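+/**
+ * Organizational Memory: system-prompt block for organizational rules, processes, and preferences.
+ *
+ * Self-Evolving OS master plan Phase 5 / Track 5-2 (design doc ch. 11, Organizational
+ * Memory). Injects organizational culture and working style, along the lines of
+ * "this company puts speed first and avoids perfectionism", into every task turn.
+ *
+ * Same "the file is the UI" pattern as the Terminology Dictionary: when the user edits
+ * <brain>/.astra/organization.md directly, the change applies from the next turn.
+ * No-op if the file does not exist. (User Memory is handled by the existing LongTermMemory; this
+ * block is for invariant organizational rules that must *always* be injected, regardless of retrieval score.)
+ *
+ * Recommended file structure (free-form markdown):
+ *   ## Working style  / ## Report format  / ## Decision principles  / ## Prohibited
+ */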
+
+import * as fs from 'fs';
+import * as path from 'path';
+
+export const ORG_MEMORY_REL_PATH = path.join('.astra', 'organization.md');
+
+export interface OrgMemoryBlockOptions {
+    /** Max body length (chars), to keep the system prompt from bloating. Default 3000. */
+    maxBodyLength: number;
+}
+
+export const DEFAULT_ORG_MEMORY_OPTIONS: OrgMemoryBlockOptions = {
+    maxBodyLength: 3000,
+};
+
+/**
+ * Builds the block by reading organization.md under brainPath. Missing file or read failure → ''.
+ * Reads the file every turn with no mtime cache (the file is small, and picking up edits immediately matters more).
+ */
+export function buildOrgMemoryBlock(brainPath: string, options: Partial<OrgMemoryBlockOptions> = {}): string {
+    const opts: OrgMemoryBlockOptions = { ...DEFAULT_ORG_MEMORY_OPTIONS, ...options };
+    let raw = '';
+    try {
+        const file = path.join(brainPath, ORG_MEMORY_REL_PATH);
+        if (!fs.existsSync(file)) return '';
+        raw = fs.readFileSync(file, 'utf8').trim();
+    } catch {
+        return '';
+    }
+    if (!raw) return '';
+
+    let body = raw;
+    let truncated = false;
+    if (body.length > opts.maxBodyLength) {
+        body = body.slice(0, opts.maxBodyLength);
+        truncated = true;
+    }
+
+    const lines: string[] = [];
+    lines.push('[ORGANIZATIONAL MEMORY]');
+    lines.push('아래는 이 조직의 업무 방식·규칙·선호다. 업무 산출물(회의록/조사/일정)은 이 방식을 *항상* 따를 것.');
+    lines.push('사용자 명시 지시와 충돌하면 사용자 지시 우선 (Human Override).');
+    lines.push('');
+    lines.push('---');
+    lines.push(body);
+    if (truncated) {
+        lines.push('');
+        lines.push(`_…(${raw.length - opts.maxBodyLength}자 잘림 — 핵심 규칙을 앞쪽에 배치해 주세요)_`);
+    }
+    lines.push('---');
+    lines.push('[/ORGANIZATIONAL MEMORY]');
+    return lines.join('\n');
+}
@@ -0,0 +1,162 @@
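+/**
+ * Reflection Store: per-turn task retrospective records + Failure Pattern aggregation.
+ *
+ * Self-Evolving OS master plan Phase 1 / Track 2-4 (Reflection Engine v1) +
+ * Phase 3 / Track 3-6 (Failure Pattern DB v1 seed). The data foundation for trust
+ * condition T5, "does not repeat the same mistake".
+ *
+ * v1 records deterministic signals only (LLM retrospective questions come in a later increment):
+ *   end of task turn → {task type, confidence, missing elements, escalated or not, Critic issue count}
+ *   is appended to <brain>/.astra/growth/reflections.jsonl.
+ *
+ * As this file accumulates:
+ *   - summarizeFailurePatterns() → counts of repeated mistakes such as "meeting minutes · due date missing N times"
+ *   - formatGrowthReport() → confidence / miss-rate trend per period = *the source of the growth chart*
+ */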
+
+import * as fs from 'fs';
+import * as path from 'path';
+
+export const REFLECTIONS_REL_PATH = path.join('.astra', 'growth', 'reflections.jsonl');
+
+export interface ReflectionRecord {
+    /** ISO timestamp. */
+    ts: string;
+    taskId: string | null;
+    taskLabel: string | null;
+    confidenceScore: number;
+    confidenceBand: string;
+    /** Labels of elements the coverage check found missing. */
+    missing: string[];
+    escalated: boolean;
+    /** Number of issues found if the Critic review ran; null if it did not. */
+    criticIssues: number | null;
+    /** Request preview (for debugging and retrospectives, 120 chars). */
+    promptPreview: string;
+
+    // ── Decision Journal v1 (Track 3-7): fields for tracing back "why this confidence/judgment" ──
+    /** Labels of factors contributing to the confidence score (confidenceEngine factors). */
+    factors?: string[];
+    /** Titles of the top sources used in the answer (per citation/selfCheckSources). */
+    usedSources?: string[];
+
+    // ── Gap Detector v1 (Track 3-2): input signals for the Need Engine ──
+    /** Retrieval grounding: chunk count and top score. */
+    retrieval?: { chunkCount: number; topScore: number };
+    /** Task turn performed with no or weak retrieval grounding (knowledge-gap signal). */
+    weakGrounding?: boolean;
+    /** Gap severity (none/low/medium/high). */
+    gapSeverity?: string;
+}
+
+/** Appends one reflection. Never throws on failure (a reflection must not block the turn). */
+export function appendReflection(brainPath: string, record: ReflectionRecord): boolean {
+    try {
+        if (!brainPath) return false;
+        const file = path.join(brainPath, REFLECTIONS_REL_PATH);
+        fs.mkdirSync(path.dirname(file), { recursive: true });
+        fs.appendFileSync(file, JSON.stringify(record) + '\n', 'utf8');
+        return true;
+    } catch {
+        return false;
+    }
+}
+
+/** Loads reflections, skipping broken lines. limit keeps the *most recent* N records. */
+export function loadReflections(brainPath: string, limit?: number): ReflectionRecord[] {
+    try {
+        const file = path.join(brainPath, REFLECTIONS_REL_PATH);
+        if (!fs.existsSync(file)) return [];
+        const lines = fs.readFileSync(file, 'utf8').split('\n').filter((l) => l.trim());
+        const records: ReflectionRecord[] = [];
+        for (const line of lines) {
+            try {
+                const obj = JSON.parse(line);
+                if (obj && typeof obj.ts === 'string') records.push(obj as ReflectionRecord);
+            } catch { /* skip broken line */ }
+        }
+        return limit && limit > 0 ? records.slice(-limit) : records;
+    } catch {
+        return [];
+    }
+}
+
+export interface FailurePattern {
+    taskId: string;
+    taskLabel: string;
+    element: string;
+    count: number;
+}
+
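+/**
+ * Failure Pattern aggregation: repeat count per (task type × missing element), most frequent first.
+ * Surfaces repeated mistakes as numbers, e.g. "market size missing 27 times" (design doc ch. 12).
+ */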
+export function summarizeFailurePatterns(records: ReflectionRecord[]): FailurePattern[] {
+    const counts = new Map<string, FailurePattern>();
+    for (const r of records) {
+        if (!r.taskId) continue;
+        for (const el of r.missing || []) {
+            const key = `${r.taskId}::${el}`;
+            const cur = counts.get(key);
+            if (cur) cur.count++;
+            else counts.set(key, { taskId: r.taskId, taskLabel: r.taskLabel || r.taskId, element: el, count: 1 });
+        }
+    }
+    return Array.from(counts.values()).sort((a, b) => b.count - a.count);
+}
+
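+/**
+ * Repeated-mistake warning: when the same (task × element) miss occurs at least `threshold` times,
+ * returns that element as a target to emphasize in the system prompt. The Requirement Graph block
+ * takes these and marks them as "elements missed especially often" (the first closing of the T5 loop).
+ */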
+export function recurrentMisses(records: ReflectionRecord[], taskId: string, threshold = 3): string[] {
+    return summarizeFailurePatterns(records)
+        .filter((p) => p.taskId === taskId && p.count >= threshold)
+        .map((p) => p.element);
+}
+
+/** Weekly growth report: average confidence and missing-element rate over time. */
+export function formatGrowthReport(records: ReflectionRecord[]): string {
+    if (records.length === 0) return '# 성장 리포트\n\n기록 없음 — 업무 turn 이 쌓이면 추이가 표시됩니다.\n';
+
+    // Weekly buckets keyed by the UTC date of each week's Sunday (calendar weeks, not ISO weeks).
+    const byWeek = new Map<string, ReflectionRecord[]>();
+    for (const r of records) {
+        const d = new Date(r.ts);
+        if (isNaN(d.getTime())) continue;
+        const weekStart = new Date(d);
+        weekStart.setUTCDate(d.getUTCDate() - d.getUTCDay()); // back to Sunday; UTC to match toISOString() below
+        const key = weekStart.toISOString().slice(0, 10);
+        const arr = byWeek.get(key) || [];
+        arr.push(r);
+        byWeek.set(key, arr);
+    }
+
+    const lines: string[] = [];
+    lines.push('# ASTRA 성장 리포트 (Reflection 기반)');
+    lines.push('');
+    lines.push(`총 업무 turn: ${records.length}`);
+    lines.push('');
+    lines.push('| 주 (시작일) | 업무 수 | 평균 확신도 | 요소 누락률 | 에스컬레이션 |');
+    lines.push('|---|---|---|---|---|');
+    const weeks = Array.from(byWeek.keys()).sort();
+    for (const w of weeks) {
+        const rs = byWeek.get(w)!;
+        const avgConf = rs.reduce((s, r) => s + (r.confidenceScore || 0), 0) / rs.length;
+        const missRate = rs.filter((r) => (r.missing || []).length > 0).length / rs.length;
+        const escCount = rs.filter((r) => r.escalated).length;
+        lines.push(`| ${w} | ${rs.length} | ${avgConf.toFixed(0)} | ${(missRate * 100).toFixed(0)}% | ${escCount} |`);
+    }
+    lines.push('');
+    lines.push('## 반복 실수 Top (Failure Patterns)');
+    const patterns = summarizeFailurePatterns(records).slice(0, 10);
+    if (patterns.length === 0) {
+        lines.push('- 없음');
+    } else {
+        for (const p of patterns) lines.push(`- ${p.taskLabel} · **${p.element}** 누락 ${p.count}회`);
+    }
+    lines.push('');
+    return lines.join('\n');
+}
@@ -0,0 +1,273 @@
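+/**
+ * Requirement Graph: required-element definitions per task type + detection + coverage check.
+ *
+ * Self-Evolving Digital Employee OS master plan (docs/SELF_EVOLVING_OS_MASTER_PLAN.md)
+ * Phase 1 / Track 2-1. Covers trust condition T3, "quality is consistent: no required element is missing".
+ *
+ * Works in two stages:
+ *   1. *Instructional*: when a task type (meeting minutes / market research / work research / schedule) is
+ *      detected in the user request, injects a [TASK REQUIREMENTS] block into the system prompt so the model
+ *      writes every required element. Elements that cannot be filled for lack of information must be marked
+ *      "(확인 필요)" ("needs confirmation"); silent omission is forbidden (ties into Anti-Hallucination T1).
+ *   2. *Deterministic*: after the answer completes, a post-answer hook scans required-element coverage with
+ *      regexes and shows possibly missing elements in a footer (same pattern as termValidator, no LLM call).
+ *
+ * The Gap Detector (Phase 3) takes this module's Requirement definitions as input:
+ * Gap = Requirement − Knowledge.
+ */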
+
+export interface RequirementElement {
+    /** Stable identifier (the Failure Pattern DB is expected to use it as the miss-count key). */
+    id: string;
+    /** Human-readable element name, shown in the block and footer. */
+    label: string;
+    /** Writing hint given to the model. */
+    hint: string;
+    /** Regex sources for the coverage check (OR-joined, i+u flags). */
+    detectPatterns: string[];
+}
+
+export interface TaskRequirement {
+    /** Task type ID (e.g. 'meeting-minutes'). */
+    id: string;
+    /** Human-readable task name (e.g. '회의록'). */
+    label: string;
+    /** Regex sources that detect the task type in the user request (OR-joined). */
+    detectKeywords: string[];
+    /**
+     * Whether to check answer coverage. False for tasks such as scheduling where a short confirmation
+     * reply is normal, to avoid footer noise (false positives). The block is always injected.
+     */
+    coverageCheck: boolean;
+    elements: RequirementElement[];
+}
+
+export interface CoverageResult {
+    ran: boolean;
+    taskId?: string;
+    taskLabel?: string;
+    covered: string[];   // element labels
+    missing: string[];   // element labels
+}
+
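+/**
+ * The four default task definitions. Array order = detection priority (specific types first, the generic
+ * 'work-research' last, so that its broad "조사"-style keywords do not hijack market research).
+ */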
+export const DEFAULT_TASK_REQUIREMENTS: TaskRequirement[] = [
+    {
+        id: 'meeting-minutes',
+        label: '회의록',
+        detectKeywords: ['회의록', '회의 ?(내용|결과)? ?정리', '미팅 ?(노트|정리)', 'meeting (minutes|notes)'],
+        coverageCheck: true,
+        elements: [
+            {
+                id: 'attendees', label: '참석자',
+                hint: '회의 참석 인원 전원. 불명확하면 "(확인 필요)".',
+                detectPatterns: ['참석자', '참석 ?인원', 'attendees?'],
+            },
+            {
+                id: 'decisions', label: '결정사항',
+                hint: '회의에서 합의·확정된 사항. 논의만 되고 미결인 항목과 구분.',
+                detectPatterns: ['결정 ?사항', '결정된', '합의', '확정', 'decisions?'],
+            },
+            {
+                id: 'action-items', label: '액션 아이템',
+                hint: '후속 실행 항목. 각 항목에 담당자·기한 연결.',
+                detectPatterns: ['액션 ?아이템', 'action ?items?', '할 ?일', '후속 ?(조치|작업)', 'to-?do'],
+            },
+            {
+                id: 'owners', label: '담당자',
+                hint: '액션 아이템별 책임자. 미정이면 "(담당자 미정)" 명시.',
+                detectPatterns: ['담당자?', '책임자', 'owner'],
+            },
+            {
+                id: 'due-dates', label: '기한',
+                hint: '액션 아이템별 마감일. 미정이면 "(기한 미정)" 명시.',
+                detectPatterns: ['기한', '마감', '까지', 'due', '\\d{1,2}\\s*월\\s*\\d{1,2}\\s*일'],
+            },
+        ],
+    },
+    {
+        id: 'market-research',
+        label: '시장조사',
+        detectKeywords: ['시장 ?조사', '시장 ?분석', '시장 ?(규모|동향|현황)', 'market (research|analysis)'],
+        coverageCheck: true,
+        elements: [
+            {
+                id: 'market-size', label: '시장 규모',
+                hint: '금액/수량 기준 규모. 수치 출처 필수, 없으면 "(확인 필요)".',
+                detectPatterns: ['시장 ?규모', 'market ?size', '\\d+\\s*(억|조|만\\s*달러|billion|million)'],
+            },
+            {
+                id: 'growth', label: '성장률',
+                hint: '연 성장률(CAGR 등) 또는 성장 추세.',
+                detectPatterns: ['성장률', '성장세', 'CAGR', 'growth', '연평균'],
+            },
+            {
+                id: 'competitors', label: '경쟁사',
+                hint: '주요 플레이어와 각자의 포지션.',
+                detectPatterns: ['경쟁사', '경쟁 ?업체', '주요 ?(업체|기업|플레이어)', 'competitors?'],
+            },
+            {
+                id: 'pricing', label: '가격',
+                hint: '가격대·요금 구조.',
+                detectPatterns: ['가격', '요금', '단가', 'pricing', '원대', '달러'],
+            },
+            {
+                id: 'customer-needs', label: '고객 니즈',
+                hint: '고객 요구·페인 포인트.',
+                detectPatterns: ['니즈', '고객 ?(요구|수요)', '페인 ?포인트', 'needs', 'pain ?points?'],
+            },
+            {
+                id: 'trends', label: '트렌드',
+                hint: '시장 동향·변화 방향.',
+                detectPatterns: ['트렌드', '동향', '추세', 'trends?'],
+            },
+            {
+                id: 'sources', label: '출처',
+                hint: '핵심 수치·주장의 출처. 모델 일반 지식이면 그렇게 명시.',
+                detectPatterns: ['출처', '근거', 'source', '자료:', '참고'],
+            },
+        ],
+    },
+    {
+        id: 'schedule',
+        label: '일정 관리',
+        detectKeywords: ['일정 ?(등록|추가|확인|조회|정리|관리)', '스케줄', '캘린더', '미팅 ?잡', '약속 ?(등록|추가|잡)'],
+        coverageCheck: false, // short confirmation replies are normal; a footer check would be noise
+        elements: [
+            {
+                id: 'datetime', label: '일시',
+                hint: '날짜와 시간을 명시. 모호하면 되묻기.',
+                detectPatterns: ['\\d{1,2}\\s*[:시]', '날짜', '일시'],
+            },
+            {
+                id: 'title', label: '일정 제목',
+                hint: '무엇을 위한 일정인지.',
+                detectPatterns: ['제목', '일정명', '건명'],
+            },
+            {
+                id: 'conflict-check', label: '충돌 확인',
+                hint: '기존 일정과 겹침 여부 확인 결과 명시.',
+                detectPatterns: ['충돌', '겹치', '겹침'],
+            },
+        ],
+    },
+    {
+        id: 'work-research',
+        label: '업무조사',
+        detectKeywords: ['업무 ?조사', '조사해', '리서치', '알아봐\\s*줘?', '서치해', 'research'],
+        coverageCheck: true,
+        elements: [
+            {
+                id: 'purpose', label: '조사 목적',
+                hint: '무엇을 알기 위한 조사인지 한 줄 명시.',
+                detectPatterns: ['목적', '배경', '알아보기 위해'],
+            },
+            {
+                id: 'summary', label: '핵심 요약',
+                hint: '결론 먼저 — 3줄 이내 요약.',
+                detectPatterns: ['요약', '핵심', '결론부터', 'TL;?DR', 'summary'],
+            },
+            {
+                id: 'details', label: '세부 내용',
+                hint: '요약을 뒷받침하는 상세 조사 내용.',
+                detectPatterns: ['상세', '세부', '구체적', '자세히'],
+            },
+            {
+                id: 'sources', label: '출처',
+                hint: '핵심 주장의 출처. 모델 일반 지식이면 그렇게 명시.',
+                detectPatterns: ['출처', '근거', 'source', '참고'],
+            },
+            {
+                id: 'implications', label: '시사점·다음 단계',
+                hint: '조사 결과가 의미하는 것과 권장 다음 행동.',
+                detectPatterns: ['시사점', '다음 ?단계', '권장', '제안', '결론'],
+            },
+        ],
+    },
+];
+
+function toRegex(sources: string[]): RegExp {
+    return new RegExp(sources.join('|'), 'iu');
+}
+
+/**
+ * Detects the task type from the user request. Returns the first match in array order, or null.
+ * Short greetings and small talk are excluded naturally because no keyword matches.
+ */
+export function detectTaskType(
+    userPrompt: string,
+    requirements: TaskRequirement[] = DEFAULT_TASK_REQUIREMENTS,
+): TaskRequirement | null {
+    if (!userPrompt || !userPrompt.trim()) return null;
+    for (const req of requirements) {
+        if (toRegex(req.detectKeywords).test(userPrompt)) return req;
+    }
+    return null;
+}
+
+/**
+ * Builds the [TASK REQUIREMENTS] system-prompt block. Returns an empty string when no task type is
+ * detected, so it drops out of the dynamicBlocks join in memoryContext automatically.
+ */
+export function buildRequirementGraphBlock(
+    userPrompt: string,
+    requirements: TaskRequirement[] = DEFAULT_TASK_REQUIREMENTS,
+    /** Labels of elements often missed in the past, supplied by Reflection/Failure Pattern (T5: avoid repeating the same mistake). */
+    emphasizeLabels: string[] = [],
+): string {
+    const req = detectTaskType(userPrompt, requirements);
+    if (!req) return '';
+
+    const emphasize = new Set(emphasizeLabels);
+    const lines: string[] = [];
+    lines.push(`[TASK REQUIREMENTS — ${req.label}]`);
+    lines.push(`이 요청은 '${req.label}' 업무로 감지됨. 아래 필수 요소를 *모두* 포함해 작성할 것.`);
+    lines.push('정보가 없어 채울 수 없는 요소는 조용히 생략하지 말고 "(확인 필요)" 로 명시 후 사용자에게 질문.');
+    lines.push('');
+    for (const el of req.elements) {
+        const mark = emphasize.has(el.label) ? ' ⚠️ *과거에 자주 누락된 요소 — 특히 주의*' : '';
+        lines.push(`- [ ] **${el.label}** — ${el.hint}${mark}`);
+    }
+    lines.push('');
+    lines.push('제출 전 위 체크리스트를 스스로 점검하고, 누락 요소가 있으면 보완 후 답변할 것.');
+    lines.push('[/TASK REQUIREMENTS]');
+    return lines.join('\n');
+}
+
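+/**
+ * Deterministic answer-coverage check: a required element is `missing` when none of its detectPatterns
+ * appears in the answer. No LLM call (regex only), so it is safe to run every turn.
+ *
+ * Limitation (intentionally conservative): a pattern match means "the element is mentioned", not "the
+ * content is thorough". Content quality is evaluated by the Phase 3 Self Evaluation.
+ */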
+export function checkRequirementCoverage(
+    userPrompt: string,
+    assistantAnswer: string,
+    requirements: TaskRequirement[] = DEFAULT_TASK_REQUIREMENTS,
+): CoverageResult {
+    const req = detectTaskType(userPrompt, requirements);
+    if (!req || !req.coverageCheck || !assistantAnswer || !assistantAnswer.trim()) {
+        return { ran: false, covered: [], missing: [] };
+    }
+    const covered: string[] = [];
+    const missing: string[] = [];
+    for (const el of req.elements) {
+        if (toRegex(el.detectPatterns).test(assistantAnswer)) covered.push(el.label);
+        else missing.push(el.label);
+    }
+    return { ran: true, taskId: req.id, taskLabel: req.label, covered, missing };
+}
+
+/**
+ * Coverage footer: returns a string only when something is missing (empty when all are covered, to avoid noise).
+ * Shown in the same place as the termValidator footer (a streamChunk below the answer).
+ */
+export function formatRequirementCoverageFooter(result: CoverageResult): string {
+    if (!result.ran || result.missing.length === 0) return '';
+    const miss = result.missing.join(', ');
+    return `\n\n> ⚠️ **Requirement Check (${result.taskLabel})** — 누락 가능 요소: ${miss}. 해당 내용이 없었다면 "(확인 필요)" 로 표시하거나 추가 정보를 요청하세요.`;
+}
@@ -0,0 +1,153 @@
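+/**
+ * Research Agent: runs research for approved Learning Queue items (design doc ch. 9).
+ *
+ * Self-Evolving OS master plan Phase 6 / Track 7-1. The last leg of the learning loop:
+ *   Need Engine → Learning Queue(approved) → **Research Agent** → Validation gate → storage proposal
+ *
+ * v1 is a "research package preparer" that works within the honest limits of a local environment:
+ *   1. Research brief (one LLM call): key questions, search keywords, recommended source types
+ *   2. Internal knowledge status: collects top documents from brain search (what is already known)
+ *   3. Model-knowledge draft: *every statement carries an estimate label*, since the knowledge has no source
+ *   4. Knowledge Validation gate: no source → mostly review (never auto-saved)
+ *   5. Next-step guidance: gather external evidence via /research or /benchmark (Datacollect Bridge), then approve
+ *
+ * The output is <brain>/.astra/growth/proposals/<id>.md, a proposal document for a person to review,
+ * strengthen, and approve. Nothing is saved into the brain itself automatically (Permission Based Learning).
+ */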
+
+import type { QueueItem } from './learningQueue';
+import { validateKnowledgeCandidate, ExistingKnowledgeRef, ValidationResult } from './knowledgeValidation';
+
+export interface ResearchBrief {
+    questions: string[];
+    keywords: string[];
+    sourceTypes: string[];
+}
+
+export interface ResearchPackage {
+    queueItemId: string;
+    topic: string;
+    brief: ResearchBrief;
+    /** Related internal (brain) documents: what is already known. */
+    internalRefs: ExistingKnowledgeRef[];
+    /** Model-knowledge draft (with estimate labels). */
+    draft: string;
+    validation: ValidationResult;
+}
+
+export type ResearchLlmCall = (system: string, user: string, maxTokens: number) => Promise<string>;
+
+/** Parses the brief JSON: a simplified version of criticAgent's balanced-brace extraction (first '{' through last '}'). */
+export function parseBrief(raw: string): ResearchBrief | null {
+    const start = raw.indexOf('{');
+    const end = raw.lastIndexOf('}');
+    if (start === -1 || end <= start) return null;
+    try {
+        const obj = JSON.parse(raw.slice(start, end + 1));
+        const arr = (v: any) => Array.isArray(v) ? v.filter((x) => typeof x === 'string').slice(0, 8) : [];
+        const brief: ResearchBrief = { questions: arr(obj.questions), keywords: arr(obj.keywords), sourceTypes: arr(obj.sourceTypes) };
+        return brief.questions.length > 0 ? brief : null;
+    } catch {
+        return null;
+    }
+}
+
+/** Minimal topic-based brief, so the loop does not stall even when the LLM fails. */
+export function fallbackBrief(topic: string): ResearchBrief {
+    return {
+        questions: [`${topic} 의 핵심 개념과 현재 표준은 무엇인가?`, `${topic} 에서 자주 발생하는 실수와 베스트 프랙티스는?`],
+        keywords: [topic],
+        sourceTypes: ['공식 문서', '최근 1년 내 자료'],
+    };
+}
+
+export async function runResearch(params: {
+    item: QueueItem;
+    /** Fetches existing brain documents related to the topic (injected by the orchestrator). */
+    fetchInternalRefs: (topic: string) => Promise<ExistingKnowledgeRef[]>;
+    callLlm: ResearchLlmCall;
+    nowIso: string;
+}): Promise<ResearchPackage> {
+    const { item } = params;
+
+    // ─── 1. Research brief (LLM; planning is a low-hallucination-risk use) ───
+    const briefSystem = [
+        '너는 조사 계획 수립자다. 주어진 학습 주제에 대한 조사 브리프를 만든다.',
+        '반드시 아래 JSON 만 출력:',
+        '{"questions": ["핵심 질문 3~5개"], "keywords": ["검색 키워드 3~6개"], "sourceTypes": ["권장 출처 유형 2~4개"]}',
+    ].join('\n');
+    let brief: ResearchBrief;
+    try {
+        const raw = await params.callLlm(briefSystem, `학습 주제: ${item.topic}\n선정 사유: ${item.reason}`, 400);
+        brief = parseBrief(raw) ?? fallbackBrief(item.topic);
+    } catch {
+        brief = fallbackBrief(item.topic);
+    }
+
+    // ─── 2. Internal knowledge status ───
+    let internalRefs: ExistingKnowledgeRef[] = [];
+    try {
+        internalRefs = await params.fetchInternalRefs(item.topic);
+    } catch { /* search failed → proceed with an empty status */ }
+
+    // ─── 3. Model-knowledge draft: estimate labels enforced throughout ───
+    const draftSystem = [
+        '너는 학습 노트 초안 작성자다. 주어진 질문들에 대해 아는 것을 정리한다.',
+        '중요: 너의 일반 지식은 출처가 없다. 모든 단락 끝에 "(모델 지식 — 추정, 출처 확인 필요)" 를 붙일 것.',
+        '모르는 것은 "모름 — 외부 조사 필요" 로 솔직히 표시. 지어내기 금지.',
+        '마크다운 ## 섹션으로 질문별 정리.',
+    ].join('\n');
+    let draft = '';
+    try {
+        draft = await params.callLlm(draftSystem, brief.questions.map((q, i) => `${i + 1}. ${q}`).join('\n'), 1200);
+    } catch {
+        draft = '(초안 생성 실패 — 외부 조사로 직접 작성 필요)';
+    }
+
+    // ─── 4. Validation gate: a draft without sources is never auto-accepted ───
+    const validation = validateKnowledgeCandidate(
+        { title: item.topic, content: draft, collectedAt: params.nowIso /* source intentionally omitted */ },
+        internalRefs,
+    );
+
+    return { queueItemId: item.id, topic: item.topic, brief, internalRefs, draft, validation };
+}
+
+export function formatProposalMarkdown(pkg: ResearchPackage, meta: { dateStr: string; modelName: string }): string {
+    const lines: string[] = [];
+    lines.push(`# 학습 제안 — ${pkg.topic}`);
+    lines.push('');
+    lines.push(`- 생성: ${meta.dateStr} · 모델: ${meta.modelName} · 큐 항목: ${pkg.queueItemId}`);
+    lines.push(`- **검증 판정: ${pkg.validation.verdict}** — ${pkg.validation.reasons.join(' / ')}`);
+    lines.push('');
+    lines.push('## 1. 조사 브리프');
+    lines.push('');
+    lines.push('**핵심 질문**');
+    for (const q of pkg.brief.questions) lines.push(`- ${q}`);
+    lines.push('');
+    lines.push(`**검색 키워드**: ${pkg.brief.keywords.join(', ')}`);
+    lines.push(`**권장 출처**: ${pkg.brief.sourceTypes.join(', ')}`);
+    lines.push('');
+    lines.push('## 2. 내부 지식 현황 (두뇌에 이미 있는 것)');
+    lines.push('');
+    if (pkg.internalRefs.length === 0) {
+        lines.push('- 관련 문서 없음 — 완전한 신규 영역');
+    } else {
+        for (const ref of pkg.internalRefs) {
+            lines.push(`- \`${ref.filePath || ref.title}\``);
+        }
+    }
+    lines.push('');
+    lines.push('## 3. 모델 지식 초안 (출처 없음 — 검증 전 사용 금지)');
+    lines.push('');
+    lines.push(pkg.draft);
+    lines.push('');
+    lines.push('## 4. 다음 단계');
+    lines.push('');
+    lines.push('1. 위 키워드로 외부 근거 수집 — ASTRA 채팅에서 `/research` 또는 `/benchmark` (Datacollect Bridge 필요)');
+    lines.push('2. 수집 근거로 초안을 보강·교정 (추정 라벨 제거는 출처 확보 후에만)');
+    lines.push('3. 완성본을 두뇌 적절한 폴더에 저장하면 다음 turn 부터 검색에 반영됨');
+    lines.push('4. learning-queue.json 에서 이 항목 status 를 done 으로 변경');
+    lines.push('');
+    return lines.join('\n');
+}
@@ -0,0 +1,168 @@
+/**
+ * Skill Score + Success Pattern DB: capability scores and accumulated success cases (design doc ch. 12).
+ *
+ * Self-Evolving OS master plan Phase 6 / Track 7-3 + 7-4.
+ *
+ * Skill Score (0~100, per task type in v1):
+ *   Over the most recent N reflections: confidence 50% + element coverage rate 30% + non-escalation rate 20%.
+ *   Comparing the first and second halves yields a trend (↑/→/↓), the source of "SEO 52→81"-style growth displays.
+ *
+ * Success Pattern DB:
+ *   Task turns with all elements covered + high confidence are appended to
+ *   <brain>/.astra/growth/success-patterns.jsonl. v1 records and aggregates (later: inject as exemplars into new task turns).
+ */
+
+import * as fs from 'fs';
+import * as path from 'path';
+import type { ReflectionRecord } from './reflectionStore';
+
+export const SUCCESS_PATTERNS_REL_PATH = path.join('.astra', 'growth', 'success-patterns.jsonl');
+
+// ─────────────────────────── Skill Score ───────────────────────────
+
+export interface SkillScoreItem {
+    taskId: string;
+    taskLabel: string;
+    /** 0~100. */
+    score: number;
+    /** Second half vs. first half: 'up' | 'flat' | 'down'. Always 'flat' with fewer than 4 samples. */
+    trend: 'up' | 'flat' | 'down';
+    /** First-half / second-half scores (basis for the trend). */
+    firstHalf: number;
+    secondHalf: number;
+    sampleCount: number;
+}
+
+function scoreOf(rs: ReflectionRecord[]): number {
+    if (rs.length === 0) return 0;
+    const avgConf = rs.reduce((s, r) => s + (r.confidenceScore || 0), 0) / rs.length;
+    const coverOk = rs.filter((r) => (r.missing || []).length === 0).length / rs.length;
+    const noEsc = rs.filter((r) => !r.escalated).length / rs.length;
+    return Math.round(avgConf * 0.5 + coverOk * 100 * 0.3 + noEsc * 100 * 0.2);
+}
+
+export function computeSkillScores(records: ReflectionRecord[]): SkillScoreItem[] {
+    const byTask = new Map<string, ReflectionRecord[]>();
+    for (const r of records) {
+        if (!r.taskId) continue;
+        const arr = byTask.get(r.taskId) || [];
+        arr.push(r);
+        byTask.set(r.taskId, arr);
+    }
+    const items: SkillScoreItem[] = [];
+    for (const [taskId, rs] of byTask) {
+        // Sort by ts, then compare the first and second halves.
+        const sorted = rs.slice().sort((a, b) => a.ts.localeCompare(b.ts));
+        const mid = Math.floor(sorted.length / 2);
+        const firstHalf = scoreOf(sorted.slice(0, mid));
+        const secondHalf = scoreOf(sorted.slice(mid));
+        let trend: SkillScoreItem['trend'] = 'flat';
+        if (sorted.length >= 4) {
+            if (secondHalf - firstHalf >= 5) trend = 'up';
+            else if (firstHalf - secondHalf >= 5) trend = 'down';
+        }
+        items.push({
+            taskId,
+            taskLabel: sorted[0].taskLabel || taskId,
+            score: scoreOf(sorted),
+            trend,
+            firstHalf,
+            secondHalf,
+            sampleCount: sorted.length,
+        });
+    }
+    return items.sort((a, b) => b.score - a.score);
+}
+
+export function formatSkillScoresMarkdown(items: SkillScoreItem[]): string {
+    const lines: string[] = [];
+    lines.push('## Skill Score (역량 점수)');
+    lines.push('');
+    lines.push('확신도 50% + 요소 충족률 30% + 비에스컬레이션율 20%. 추세는 전/후반기 비교 (표본 4건+).');
+    lines.push('');
+    if (items.length === 0) {
+        lines.push('- 데이터 없음');
+        return lines.join('\n');
+    }
+    const arrow = { up: '📈 상승', flat: '→ 유지', down: '📉 하락' } as const;
+    lines.push('| 업무 | Score | 추세 | 전반기→후반기 | 표본 |');
+    lines.push('|---|---|---|---|---|');
+    for (const it of items) {
+        lines.push(`| ${it.taskLabel} | **${it.score}** | ${arrow[it.trend]} | ${it.firstHalf}→${it.secondHalf} | ${it.sampleCount} |`);
+    }
+    return lines.join('\n');
+}
+
+// ─────────────────────── Success Pattern DB ───────────────────────
+
+export interface SuccessPattern {
+    ts: string;
+    taskId: string;
+    taskLabel: string;
+    confidenceScore: number;
+    promptPreview: string;
+    usedSources: string[];
+}
+
+/** Success criterion: all elements covered and high confidence (90+). */
+export function isSuccessTurn(record: ReflectionRecord): boolean {
+    return !!record.taskId
+        && (record.missing || []).length === 0
+        && record.confidenceScore >= 90;
+}
+
+export function appendSuccessPattern(brainPath: string, record: ReflectionRecord): boolean {
+    try {
+        if (!isSuccessTurn(record)) return false;
+        const file = path.join(brainPath, SUCCESS_PATTERNS_REL_PATH);
+        fs.mkdirSync(path.dirname(file), { recursive: true });
+        const pattern: SuccessPattern = {
+            ts: record.ts,
+            taskId: record.taskId!,
+            taskLabel: record.taskLabel || record.taskId!,
+            confidenceScore: record.confidenceScore,
+            promptPreview: record.promptPreview,
+            usedSources: record.usedSources || [],
+        };
+        fs.appendFileSync(file, JSON.stringify(pattern) + '\n', 'utf8');
+        return true;
+    } catch {
+        return false;
+    }
+}
+
+export function loadSuccessPatterns(brainPath: string, limit?: number): SuccessPattern[] {
+    try {
+        const file = path.join(brainPath, SUCCESS_PATTERNS_REL_PATH);
+        if (!fs.existsSync(file)) return [];
+        const lines = fs.readFileSync(file, 'utf8').split('\n').filter((l) => l.trim());
+        const out: SuccessPattern[] = [];
+        for (const line of lines) {
+            try {
+                const obj = JSON.parse(line);
+                if (obj && typeof obj.ts === 'string' && typeof obj.taskId === 'string') out.push(obj);
+            } catch { /* skip */ }
+        }
+        return limit && limit > 0 ? out.slice(-limit) : out;
+    } catch {
+        return [];
+    }
+}
+
+export function formatSuccessPatternsMarkdown(patterns: SuccessPattern[]): string {
+    const lines: string[] = [];
+    lines.push('## Success Patterns (성공 사례)');
+    lines.push('');
+    if (patterns.length === 0) {
+        lines.push('- 아직 없음 — 전 요소 충족 + 확신도 90+ 인 업무가 자동 축적됩니다.');
+        return lines.join('\n');
+    }
+    const byTask = new Map<string, number>();
+    for (const p of patterns) byTask.set(p.taskLabel, (byTask.get(p.taskLabel) || 0) + 1);
+    lines.push(`총 ${patterns.length}건 — ${Array.from(byTask.entries()).map(([l, c]) => `${l} ${c}건`).join(' · ')}`);
+    lines.push('');
+    for (const p of patterns.slice(-5).reverse()) {
+        lines.push(`- [${p.ts.slice(0, 10)}] ${p.taskLabel} (확신도 ${p.confidenceScore}) — "${(p.promptPreview || '').slice(0, 60)}"`);
+    }
+    return lines.join('\n');
+}
@@ -0,0 +1,185 @@
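+/**
+ * Task Eval Harness: automatic golden-set scoring of task deliverables (Self Evaluation v1).
+ *
+ * Self-Evolving OS master plan Phase 3 / Track 3-4. The core of "proving growth with numbers":
+ * the same golden set is run against each version and the score trend is compared (the same methodology
+ * with which the retrieval eval harness showed recall@1 going 37.5%→75%, applied to task deliverables).
+ *
+ * Golden set: <brain>/.astra/eval/tasks/<task>.golden.jsonl
+ *   One line = {"id","query","sourceFile","expectedElements":[label...],"reference","notes"}
+ *
+ * v1 scoring is deterministic (an LLM judge is a later increment):
+ *   - Element coverage: match rate of detectPatterns for expectedElements (reuses the requirementGraph vocabulary)
+ *   - Honesty: whether markers such as "(확인 필요)" (needs confirmation) are used, i.e. admitting uncertainty instead of fabricating (T1)
+ *   - Length and structure: number of section headings
+ * LLM calls (generation) are injected via generate, so the harness itself stays pure and testable.
+ */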
+
+import * as fs from 'fs';
+import * as path from 'path';
+import { DEFAULT_TASK_REQUIREMENTS, TaskRequirement } from './requirementGraph';
+
+export const TASK_GOLDEN_DIR = path.join('.astra', 'eval', 'tasks');
+
+export interface TaskGoldenRecord {
+    id: string;
+    query: string;
+    sourceFile: string;
+    meetingTopic?: string;
+    expectedElements: string[];
+    reference: string;
+    notes?: string;
+}
+
+export interface TaskGoldenLoadResult {
+    records: TaskGoldenRecord[];
+    parseErrors: number;
+    sourcePath: string;
+}
+
+/** Loads the golden set (jsonl): skips `//` comment lines and blank lines; malformed lines are only counted. */
+export function loadTaskGoldenSet(brainPath: string, taskFileBase = 'meeting-minutes'): TaskGoldenLoadResult {
+    const sourcePath = path.join(brainPath, TASK_GOLDEN_DIR, `${taskFileBase}.golden.jsonl`);
+    const result: TaskGoldenLoadResult = { records: [], parseErrors: 0, sourcePath };
+    if (!fs.existsSync(sourcePath)) return result;
+    const lines = fs.readFileSync(sourcePath, 'utf8').split('\n');
+    for (const line of lines) {
+        const t = line.trim();
+        if (!t || t.startsWith('//')) continue;
+        try {
+            const obj = JSON.parse(t);
+            if (obj && typeof obj.id === 'string' && typeof obj.query === 'string' && Array.isArray(obj.expectedElements)) {
+                result.records.push(obj as TaskGoldenRecord);
+            } else {
+                result.parseErrors++;
+            }
+        } catch {
+            result.parseErrors++;
+        }
+    }
+    return result;
+}
+
+/** Maps an element label to its detectPatterns (reuses the requirementGraph definitions; falls back to the literal label). */
+function patternsForLabel(label: string, requirements: TaskRequirement[]): RegExp {
+    for (const req of requirements) {
+        for (const el of req.elements) {
+            if (el.label === label) return new RegExp(el.detectPatterns.join('|'), 'iu');
+        }
+    }
+    // Custom element not in the definitions: match the label itself literally (regex metacharacters escaped).
+    return new RegExp(label.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'iu');
+}
+
+export interface TaskScore {
+    recordId: string;
+    /** Element coverage, 0~1. */
+    coverageRate: number;
+    covered: string[];
+    missing: string[];
+    /** Number of honesty markers such as "(확인 필요)" used (a signal against fabrication). */
+    honestyMarkers: number;
+    /** Number of section headings, levels `#` through `###` (degree of structure). */
+    sectionCount: number;
+    /** Output length (chars). */
+    answerLength: number;
+    /** Error message when generation fails. */
+    error?: string;
+}
+
+export function scoreTaskAnswer(
+    answer: string,
+    record: TaskGoldenRecord,
+    requirements: TaskRequirement[] = DEFAULT_TASK_REQUIREMENTS,
+): TaskScore {
+    const covered: string[] = [];
+    const missing: string[] = [];
+    for (const label of record.expectedElements) {
+        if (patternsForLabel(label, requirements).test(answer)) covered.push(label);
+        else missing.push(label);
+    }
+    const honesty = answer.match(/\(확인 필요\)|\(담당자? 미정\)|\(기한 미정\)/g);
+    const sections = answer.match(/^#{1,3}\s+/gm);
+    return {
+        recordId: record.id,
+        coverageRate: record.expectedElements.length === 0 ? 1 : covered.length / record.expectedElements.length,
+        covered,
+        missing,
+        honestyMarkers: honesty ? honesty.length : 0,
+        sectionCount: sections ? sections.length : 0,
+        answerLength: answer.length,
+    };
+}
+
+export interface TaskEvalRunResult {
+    scores: TaskScore[];
+    avgCoverage: number;
+    perfectCount: number;
+}
+
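+/**
+ * Evaluates the whole golden set: for each record, reads the source material (injected readSource),
+ * generates an answer (injected generate), then scores it. One failing record does not block the rest.
+ */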
+export async function runTaskEval(params: {
+    records: TaskGoldenRecord[];
+    /** Loads the source file content: usually fs.readFileSync, a fake in tests. */
+    readSource: (sourceFile: string) => string;
+    /** Generates the deliverable: (record, sourceContent) → answer. */
+    generate: (record: TaskGoldenRecord, sourceContent: string) => Promise<string>;
+    /** Max source length (chars), protecting the local model's context window. Default 20000. */
+    maxSourceChars?: number;
+    onProgress?: (done: number, total: number) => void;
+}): Promise<TaskEvalRunResult> {
+    const maxChars = params.maxSourceChars ?? 20000;
+    const scores: TaskScore[] = [];
+    let done = 0;
+    for (const record of params.records) {
+        try {
+            let source = params.readSource(record.sourceFile);
+            if (source.length > maxChars) source = source.slice(0, maxChars) + '\n…(잘림)';
+            const answer = await params.generate(record, source);
+            scores.push(scoreTaskAnswer(answer, record));
+        } catch (e: any) {
+            scores.push({
+                recordId: record.id, coverageRate: 0, covered: [], missing: record.expectedElements.slice(),
+                honestyMarkers: 0, sectionCount: 0, answerLength: 0,
+                error: e?.message || String(e),
+            });
+        }
+        done++;
+        params.onProgress?.(done, params.records.length);
+    }
+    const valid = scores.filter((s) => !s.error);
+    const avgCoverage = valid.length === 0 ? 0 : valid.reduce((s, r) => s + r.coverageRate, 0) / valid.length;
+    return { scores, avgCoverage, perfectCount: valid.filter((s) => s.coverageRate === 1).length };
+}
+
+export function formatTaskEvalReport(
+    result: TaskEvalRunResult,
+    meta: { taskLabel: string; brainName: string; dateStr: string; modelName: string; notes?: string },
+): string {
+    const lines: string[] = [];
+    lines.push(`# 업무 평가 리포트 — ${meta.taskLabel}`);
+    lines.push('');
+    lines.push(`- 두뇌: ${meta.brainName}`);
+    lines.push(`- 일시: ${meta.dateStr}`);
+    lines.push(`- 모델: ${meta.modelName}`);
+    if (meta.notes) lines.push(`- 비고: ${meta.notes}`);
+    lines.push('');
+    lines.push(`## 요약 — 평균 요소 커버리지 **${(result.avgCoverage * 100).toFixed(1)}%** · 전 요소 충족 ${result.perfectCount}/${result.scores.length}건`);
+    lines.push('');
+    lines.push('| 레코드 | 커버리지 | 누락 요소 | 정직성 표시 | 섹션 수 | 길이 |');
+    lines.push('|---|---|---|---|---|---|');
+    for (const s of result.scores) {
+        if (s.error) {
+            lines.push(`| ${s.recordId} | — | (실패: ${s.error.slice(0, 60)}) | — | — | — |`);
+        } else {
+            lines.push(`| ${s.recordId} | ${(s.coverageRate * 100).toFixed(0)}% | ${s.missing.join(', ') || '없음'} | ${s.honestyMarkers} | ${s.sectionCount} | ${s.answerLength} |`);
+        }
+    }
+    lines.push('');
+    lines.push('> 같은 골든셋으로 버전마다 측정해 커버리지 추이를 비교하세요 — 이 숫자의 상승이 곧 성장세입니다.');
+    lines.push('');
+    return lines.join('\n');
+}
@@ -23,6 +23,11 @@ import { semanticRerank, DEFAULT_SEMANTIC_RERANK_OPTIONS } from '../../retrieval
 import { detectAmbiguity, buildIntentClarificationBlock, IntentStrictness } from '../../retrieval/intentClarification';
 import { buildCitationTraceBlock } from '../../retrieval/citationTrace';
 import { buildTerminologyBlock } from '../../retrieval/terminologyBlock';
+import { buildRequirementGraphBlock, detectTaskType } from '../../intelligence/requirementGraph';
+import { buildEpistemicGuardBlock } from '../../intelligence/epistemicGuardBlock';
+import { loadReflections, recurrentMisses } from '../../intelligence/reflectionStore';
+import { buildOrgMemoryBlock } from '../../intelligence/orgMemoryBlock';
+import type { RetrievalConfidenceSignals } from '../../intelligence/confidenceEngine';

 /**
 * Builds the RAG / 5-layer memory context for a single turn.
@@ -73,6 +78,8 @@ export interface TurnContextSink {
    dynamicBlocks: Map<string, string>;
    /** For the post-hoc Self-Check: (title, excerpt) summaries of the selected chunks. */
    selfCheckSources: Array<{ title: string; excerpt: string }>;
+    /** Retrieval signals for the Confidence Engine (Phase 2 / Track 1-1). Populated by memoryContext. */
+    confidenceSignals: RetrievalConfidenceSignals | null;
 }

 export interface MemoryContextDeps {
@@ -281,12 +288,55 @@ export async function buildMemoryContext(deps: MemoryContextDeps): Promise<strin
    const blocks = deps.turnCtx.dynamicBlocks;

    // Intent Clarification: *ask back before answering*. Empty string when not ambiguous, so it is dropped automatically on join.
+    // The ambiguity result is also reused as a Confidence Engine signal (confidenceSignals below).
+    const strict = (config.intentClarificationStrictness || 'medium') as IntentStrictness;
+    const ambig = detectAmbiguity(deps.currentPrompt, strict);
    if (config.intentClarificationEnabled !== false) {
-        const strict = (config.intentClarificationStrictness || 'medium') as IntentStrictness;
-        const ambig = detectAmbiguity(deps.currentPrompt, strict);
        blocks.set('intent-clarification', buildIntentClarificationBlock(ambig));
    }

+    // Confidence Engine retrieval signals (Phase 2 / Track 1-1): used by the post-answer hook to compute
+    // confidence. brain-trace chunks are for trace display only, so they are excluded.
+    const groundingChunks = result.selectedChunks.filter((c) => c.source !== 'brain-trace');
+    deps.turnCtx.confidenceSignals = {
+        chunkCount: groundingChunks.length,
+        topScore: groundingChunks.reduce((m, c) => Math.max(m, c.score), 0),
+        conflictCount: groundingChunks.filter(
+            (c) => c.metadata?.conflictSeverity && c.metadata.conflictSeverity !== 'NONE',
+        ).length,
+        ambiguityDetected: ambig.ambiguous === true,
+    };
+
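+    // Epistemic Guard: enforces the unknown / estimated / certain classification. The less retrieval grounding a turn has,
+    // the stronger the instruction (zero grounding + a task request: ask back for the source material first). (Phase 2 / Track 1-3)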
+    if (config.epistemicGuardEnabled !== false) {
+        blocks.set('epistemic-guard', buildEpistemicGuardBlock({
+            chunkCount: groundingChunks.length,
+            taskDetected: detectTaskType(deps.currentPrompt) !== null,
+        }));
+    }
+
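+    // Requirement Graph: when a task type (meeting minutes / market research / work research / schedule) is detected, injects the
+    // required-element checklist. Empty string when nothing is detected, so it is dropped automatically on join. (Self-Evolving OS P1)
+    // Finds *recurrently missed elements* in the reflection records and emphasizes them: the T5 "never repeat the same mistake" loop.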
+    if (config.requirementGraphEnabled !== false) {
+        let emphasize: string[] = [];
+        const detectedTask = detectTaskType(deps.currentPrompt);
+        if (detectedTask && config.reflectionEnabled !== false) {
+            try {
+                const recent = loadReflections(deps.activeBrain.localBrainPath, 200);
+                emphasize = recurrentMisses(recent, detectedTask.id, 3);
+            } catch { /* a reflection load failure must not block the turn */ }
+        }
+        blocks.set('requirement-graph', buildRequirementGraphBlock(deps.currentPrompt, undefined, emphasize));
+    }
+
+    // Organizational Memory: always injects the organization's rules and working conventions from
+    // <brain>/.astra/organization.md. No-op when the file is absent. (Self-Evolving OS P5 / Track 5-2)
+    if (config.orgMemoryEnabled !== false) {
+        blocks.set('org-memory', buildOrgMemoryBlock(deps.activeBrain.localBrainPath));
+    }
+
    // Terminology Dictionary: user-edited glossary. Empty string when the file is absent.
    if (config.glossaryEnabled !== false) {
        blocks.set('terminology', buildTerminologyBlock({
@@ -20,6 +20,32 @@ import { RetrievalChunk } from './types';
 export interface CitationTraceOptions {
    /** Format of the *single source line* at the end of the answer. v1 supports only 'tail'. */
    format: 'tail';
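+    /**
+     * Provenance display (Self-Evolving OS Phase 2 / Track 1-4): exposes the last-modified
+     * date and score of the top sources in the block and, when a stale source is used,
+     * instructs the model to state that fact in its answer. Default true.
+     */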
+    provenanceEnabled: boolean;
+    /** Sources older than this many days are classified as "stale". Default 180 days. */
+    staleAfterDays: number;
+    /** Number of top sources to list under Provenance. Default 5. */
+    provenanceTopCount: number;
+    /** Current time for test injection (epoch ms). Default Date.now(). */
+    nowMs?: number;
+}
+
+const DEFAULT_OPTIONS: CitationTraceOptions = {
+    format: 'tail',
+    provenanceEnabled: true,
+    staleAfterDays: 180,
+    provenanceTopCount: 5,
+};
+
+function fmtDate(epochMs: number): string {
+    const d = new Date(epochMs);
+    const mm = String(d.getMonth() + 1).padStart(2, '0');
+    const dd = String(d.getDate()).padStart(2, '0');
+    return `${d.getFullYear()}-${mm}-${dd}`;
 }

 /**
@@ -31,6 +57,7 @@ export function buildCitationTraceBlock(
    options: Partial<CitationTraceOptions> = {},
 ): string {
    if (!chunks || chunks.length === 0) return '';
+    const opts: CitationTraceOptions = { ...DEFAULT_OPTIONS, ...options };

    const lines: string[] = [];
    lines.push('[CITATION TRACE]');
@@ -44,6 +71,32 @@ export function buildCitationTraceBlock(
    lines.push('3. 일반 모델 지식만 사용했다면: *출처: 모델 지식 (검색 출처 미사용)*');
    lines.push('4. 답변이 검증 가능하도록 — 사용자가 그 파일을 열면 답변 근거를 확인할 수 있어야.');
    lines.push('5. *출처:* 라인은 답변 *맨 끝* 한 번만 — 본문 중간에 흩어 놓지 말 것.');
+
+    // ─── Provenance: source freshness and reliability metadata (Track 1-4) ───
+    // Purpose: trace back "which knowledge led to this conclusion" and flag answers based on stale knowledge.
+    if (opts.provenanceEnabled) {
+        const now = opts.nowMs ?? Date.now();
+        const staleMs = opts.staleAfterDays * 24 * 60 * 60 * 1000;
+        const top = chunks
+            .filter((c) => c.source !== 'brain-trace')
+            .sort((a, b) => b.score - a.score)
+            .slice(0, opts.provenanceTopCount);
+        const withMeta = top.filter((c) => typeof c.metadata?.lastUpdated === 'number');
+        if (withMeta.length > 0) {
+            lines.push('');
+            lines.push('[출처 메타데이터 — Provenance]');
+            for (const c of withMeta) {
+                const updated = c.metadata.lastUpdated as number;
+                const isStale = now - updated > staleMs;
+                const staleTag = isStale ? ` ⚠️오래됨(${opts.staleAfterDays}일+)` : '';
+                lines.push(`- \`${c.title || '(제목 없음)'}\` — 수정일 ${fmtDate(updated)}, score ${c.score.toFixed(2)}${staleTag}`);
+            }
+            if (withMeta.some((c) => now - (c.metadata.lastUpdated as number) > staleMs)) {
+                lines.push('⚠️오래됨 출처를 핵심 근거로 사용하면 답변에 "출처가 오래되어 현재와 다를 수 있음" 을 명시할 것.');
+            }
+        }
+    }
+
    lines.push('[/CITATION TRACE]');
    return lines.join('\n');
 }
@@ -0,0 +1,175 @@
+/**
+ * Unit tests for the Confidence Engine + Escalation Engine (Self-Evolving OS Phase 2).
+ * Only pure functions are verified; no vscode dependency.
+ */
+import {
+    extractAnswerSignals,
+    computeConfidence,
+    formatConfidenceFooter,
+    toBand,
+    RetrievalConfidenceSignals,
+} from '../src/intelligence/confidenceEngine';
+import { decideEscalation, formatEscalationFooter } from '../src/intelligence/escalationEngine';
+import { buildEpistemicGuardBlock } from '../src/intelligence/epistemicGuardBlock';
+import { buildCitationTraceBlock } from '../src/retrieval/citationTrace';
+import type { RetrievalChunk } from '../src/retrieval/types';
+
+const strongRetrieval: RetrievalConfidenceSignals = {
+    chunkCount: 5, topScore: 0.82, conflictCount: 0, ambiguityDetected: false,
+};
+const noRetrieval: RetrievalConfidenceSignals = {
+    chunkCount: 0, topScore: 0, conflictCount: 0, ambiguityDetected: false,
+};
+
+describe('extractAnswerSignals', () => {
+    it('extracts hedge markers and source citations', () => {
+        const s = extractAnswerSignals('시장 규모는 5조원으로 추정됩니다. (확인 필요)\n\n*출처:* `시장조사.md`', 0);
+        expect(s.hedgeCount).toBe(2);
+        expect(s.hasCitation).toBe(true);
+        expect(s.modelKnowledgeOnly).toBe(false);
+    });
+
+    it('distinguishes the model-knowledge-only notation', () => {
+        const s = extractAnswerSignals('일반적인 설명입니다.\n\n*출처: 모델 지식 (검색 출처 미사용)*', null);
+        expect(s.hasCitation).toBe(false);
+        expect(s.modelKnowledgeOnly).toBe(true);
+    });
+});
+
+describe('computeConfidence', () => {
+    it('strong grounding + source citation + full coverage → high (90+)', () => {
+        const r = computeConfidence(strongRetrieval, {
+            hedgeCount: 0, hasCitation: true, modelKnowledgeOnly: false, coverageMissing: 0,
+        });
+        expect(r.score).toBeGreaterThanOrEqual(90);
+        expect(r.band).toBe('high');
+    });
+
+    it('no grounding + model knowledge only → very low (below 50)', () => {
+        const r = computeConfidence(noRetrieval, {
+            hedgeCount: 2, hasCitation: false, modelKnowledgeOnly: true, coverageMissing: null,
+        });
+        expect(r.score).toBeLessThan(50);
+        expect(r.band).toBe('very-low');
+    });
+
+    it('conflicts, ambiguity, and coverage gaps lower the score', () => {
+        const clean = computeConfidence(strongRetrieval, {
+            hedgeCount: 0, hasCitation: true, modelKnowledgeOnly: false, coverageMissing: 0,
+        });
+        const dirty = computeConfidence(
+            { ...strongRetrieval, conflictCount: 2, ambiguityDetected: true },
+            { hedgeCount: 0, hasCitation: true, modelKnowledgeOnly: false, coverageMissing: 3 },
+        );
+        expect(dirty.score).toBeLessThan(clean.score);
+        expect(dirty.factors.some((f) => f.label.includes('충돌'))).toBe(true);
+    });
+
+    it('the score is clamped to 0~100', () => {
+        const r = computeConfidence(noRetrieval, {
+            hedgeCount: 99, hasCitation: false, modelKnowledgeOnly: true, coverageMissing: 99,
+        });
+        expect(r.score).toBeGreaterThanOrEqual(0);
+        expect(r.score).toBeLessThanOrEqual(100);
+    });
+
+    it('band boundaries: 90/70/50', () => {
+        expect(toBand(90)).toBe('high');
+        expect(toBand(89)).toBe('medium');
+        expect(toBand(70)).toBe('medium');
+        expect(toBand(69)).toBe('low');
+        expect(toBand(50)).toBe('low');
+        expect(toBand(49)).toBe('very-low');
+    });
+});
+
+describe('decideEscalation', () => {
+    const coverageOk = { ran: true, taskId: 'meeting-minutes', taskLabel: '회의록', covered: ['참석자'], missing: [] as string[] };
+    const noTask = { ran: false, covered: [] as string[], missing: [] as string[] };
+
+    function conf(score: number) {
+        return { score, band: toBand(score), bandLabel: '', factors: [] };
+    }
+
+    it('confidence below 50 always escalates', () => {
+        const d = decideEscalation({ confidence: conf(40), coverage: noTask, conflictCount: 0 });
+        expect(d.escalate).toBe(true);
+        expect(d.reasons[0]).toContain('매우 낮음');
+    });
+
+    it('high-impact task (meeting minutes) + confidence below 70 → review recommended', () => {
+        const d = decideEscalation({ confidence: conf(60), coverage: coverageOk, conflictCount: 0 });
+        expect(d.escalate).toBe(true);
+        expect(d.reasons.some((r) => r.includes('회의록'))).toBe(true);
+    });
+
+    it('missing sources in market research → escalates on its own', () => {
+        const d = decideEscalation({
+            confidence: conf(85),
+            coverage: { ran: true, taskId: 'market-research', taskLabel: '시장조사', covered: [], missing: ['출처'] },
+            conflictCount: 0,
+        });
+        expect(d.escalate).toBe(true);
+        expect(d.reasons.some((r) => r.includes('출처'))).toBe(true);
+    });
+
+    it('source conflict + confidence below 90 → escalation', () => {
+        const d = decideEscalation({ confidence: conf(80), coverage: noTask, conflictCount: 1 });
+        expect(d.escalate).toBe(true);
+    });
+
+    it('high confidence + no conflicts + full coverage → no escalation', () => {
+        const d = decideEscalation({ confidence: conf(95), coverage: coverageOk, conflictCount: 0 });
+        expect(d.escalate).toBe(false);
+        expect(formatEscalationFooter(d)).toBe('');
+    });
+});
+
+describe('formatConfidenceFooter', () => {
+    it('shows the score, band, and top factors', () => {
+        const r = computeConfidence(strongRetrieval, {
+            hedgeCount: 0, hasCitation: true, modelKnowledgeOnly: false, coverageMissing: 0,
+        });
+        const f = formatConfidenceFooter(r);
+        expect(f).toContain(`확신도 ${r.score}/100`);
+        expect(f).toContain('높음');
+    });
+});
+
+describe('buildEpistemicGuardBlock', () => {
+    it('a task turn without grounding gets the ask-back-first instruction', () => {
+        const b = buildEpistemicGuardBlock({ chunkCount: 0, taskDetected: true });
+        expect(b).toContain('검색 근거가 없음');
+        expect(b).toContain('질문');
+    });
+
+    it('a grounded turn gets only the three-way classification rules', () => {
+        const b = buildEpistemicGuardBlock({ chunkCount: 4, taskDetected: false });
+        expect(b).toContain('확인 필요');
+        expect(b).not.toContain('검색 근거가 없음');
+    });
+});
+
+describe('citationTrace Provenance extension', () => {
+    const mkChunk = (title: string, lastUpdated?: number): RetrievalChunk => ({
+        id: title, source: 'brain-memory' as any, title, content: 'body', score: 0.8, tokenEstimate: 1,
+        metadata: { lastUpdated },
+    });
+    const NOW = new Date('2026-06-11T00:00:00Z').getTime();
+
+    it('with last-modified metadata, shows the Provenance section and a stale-source warning', () => {
+        const fresh = mkChunk('최근문서', NOW - 10 * 24 * 3600 * 1000);
+        const stale = mkChunk('옛문서', NOW - 400 * 24 * 3600 * 1000);
+        const b = buildCitationTraceBlock([fresh, stale], { nowMs: NOW });
+        expect(b).toContain('Provenance');
+        expect(b).toContain('최근문서');
+        expect(b).toContain('⚠️오래됨');
+        expect(b).toContain('현재와 다를 수 있음');
+    });
+
+    it('without metadata, identical to the previous block (no Provenance section)', () => {
+        const b = buildCitationTraceBlock([mkChunk('문서')], { nowMs: NOW });
+        expect(b).toContain('[CITATION TRACE]');
+        expect(b).not.toContain('Provenance');
+    });
+});
@@ -0,0 +1,65 @@
+/**
+ * Tests for the Schedule Conflict Check (Self-Evolving OS Track 6-2/6-3).
+ */
+import { findScheduleConflicts, formatConflictReport, CachedCalEvent } from '../src/features/calendar/conflictCheck';
+import { _parseCalEventAttrs } from '../src/agent/attrParsers';
+
+// Local ISO (no timezone): the real cache also stores all-day events relative to local midnight,
+// so this keeps the tests independent of the timezone of the machine running them.
+const EXISTING: CachedCalEvent[] = [
+    { summary: '주간회의', startIso: '2026-06-12T14:00:00', endIso: '2026-06-12T15:00:00', allDay: false, location: '회의실 A' },
+    { summary: '워크숍', startIso: '2026-06-13T00:00:00', allDay: true },
+];
+
+describe('findScheduleConflicts', () => {
+    it('overlapping intervals conflict', () => {
+        const c = findScheduleConflicts(EXISTING, { startIso: '2026-06-12T14:30:00', durationMinutes: 60 });
+        expect(c.length).toBe(1);
+        expect(c[0].summary).toBe('주간회의');
+    });
+
+    it('touching boundaries (end = start) do not conflict', () => {
+        const c = findScheduleConflicts(EXISTING, { startIso: '2026-06-12T15:00:00', durationMinutes: 60 });
+        expect(c.length).toBe(0);
+    });
+
+    it('without endIso, assumes a default of 60 minutes', () => {
+        const c = findScheduleConflicts(EXISTING, { startIso: '2026-06-12T13:30:00' });
+        expect(c.length).toBe(1); // 13:30~14:30 vs 14:00~15:00
+    });
+
+    it('an all-day event conflicts with a timed event on the same date', () => {
+        const c = findScheduleConflicts(EXISTING, { startIso: '2026-06-13T10:00:00', durationMinutes: 30 });
+        expect(c.some((e) => e.summary === '워크숍')).toBe(true);
+    });
+
+    it('invalid date input conservatively reports no conflict (fails at the creation step)', () => {
+        expect(findScheduleConflicts(EXISTING, { startIso: 'not-a-date' })).toEqual([]);
+        expect(findScheduleConflicts([{ summary: 'x', startIso: 'broken', allDay: false }], { startIso: '2026-06-12T14:00:00' })).toEqual([]);
+    });
+
+    it('an empty cache yields no conflicts', () => {
+        expect(findScheduleConflicts([], { startIso: '2026-06-12T14:00:00' })).toEqual([]);
+    });
+});
+
+describe('formatConflictReport', () => {
+    it('includes the conflict list and the force guidance', () => {
+        const msg = formatConflictReport([EXISTING[0]]);
+        expect(msg).toContain('주간회의');
+        expect(msg).toContain('force="true"');
+        expect(msg).toContain('보류');
+    });
+});
+
+describe('_parseCalEventAttrs force attribute', () => {
+    it('parses force="true"', () => {
+        const attrs = _parseCalEventAttrs(' title="미팅" start="2026-06-12T14:00" force="true"');
+        expect(attrs.force).toBe(true);
+    });
+
+    it('undefined when unspecified (default blocking behavior)', () => {
+        const attrs = _parseCalEventAttrs(' title="미팅" start="2026-06-12T14:00"');
+        expect(attrs.force).toBeUndefined();
+    });
+});
@@ -0,0 +1,200 @@
+/**
+ * Tests for the Critic Agent / Reflection Store / Task Eval Harness (Self-Evolving OS P1 remainder + P3).
+ */
+import * as fs from 'fs';
+import * as os from 'os';
+import * as path from 'path';
+import {
+    buildCritiquePrompt,
+    parseCritique,
+    runCriticReview,
+    formatCriticFooter,
+} from '../src/intelligence/criticAgent';
+import {
+    appendReflection,
+    loadReflections,
+    summarizeFailurePatterns,
+    recurrentMisses,
+    formatGrowthReport,
+    ReflectionRecord,
+} from '../src/intelligence/reflectionStore';
+import {
+    loadTaskGoldenSet,
+    scoreTaskAnswer,
+    runTaskEval,
+    formatTaskEvalReport,
+    TASK_GOLDEN_DIR,
+    TaskGoldenRecord,
+} from '../src/intelligence/taskEvalHarness';
+import { DEFAULT_TASK_REQUIREMENTS, buildRequirementGraphBlock } from '../src/intelligence/requirementGraph';
+
+const MEETING_REQ = DEFAULT_TASK_REQUIREMENTS.find((r) => r.id === 'meeting-minutes')!;
+
+function tmpBrain(): string {
+    return fs.mkdtempSync(path.join(os.tmpdir(), 'astra-test-brain-'));
+}
+
+function mkReflection(partial: Partial<ReflectionRecord>): ReflectionRecord {
+    return {
+        ts: '2026-06-11T10:00:00.000Z',
+        taskId: 'meeting-minutes',
+        taskLabel: '회의록',
+        confidenceScore: 70,
+        confidenceBand: 'medium',
+        missing: [],
+        escalated: false,
+        criticIssues: null,
+        promptPreview: '회의록 정리',
+        ...partial,
+    };
+}
+
+describe('criticAgent', () => {
+    it('the critique prompt includes the required elements and the missing-element signals', () => {
+        const { system, user } = buildCritiquePrompt('회의록 정리해줘', '초안...', MEETING_REQ, ['담당자', '기한']);
+        expect(system).toContain('JSON');
+        expect(user).toContain('담당자, 기한');
+        expect(user).toContain('회의록');
+    });
+
+    it('parses JSON even from responses mixed with code fences and chatter', () => {
+        const raw = '검토 결과입니다.\n```json\n{"pass": false, "issues": [{"severity": "major", "description": "기한 누락"}], "supplement": "## 기한\\n- (기한 미정)"}\n```';
+        const c = parseCritique(raw);
+        expect(c).not.toBeNull();
+        expect(c!.pass).toBe(false);
+        expect(c!.issues[0].severity).toBe('major');
+        expect(c!.supplement).toContain('기한');
+    });
+
+    it('does not treat the result as a pass when issues exist, even if pass=true', () => {
+        const c = parseCritique('{"pass": true, "issues": [{"severity": "minor", "description": "x"}], "supplement": ""}');
+        expect(c!.pass).toBe(false);
+    });
+
+    it('runCriticReview: returns null on LLM failure (silent skip)', async () => {
+        const result = await runCriticReview({
+            userPrompt: 'q', draft: 'd', requirement: MEETING_REQ, missingLabels: [],
+            callLlm: async () => { throw new Error('LLM down'); },
+        });
+        expect(result).toBeNull();
+    });
+
+    it('formatCriticFooter: empty string on pass; issues plus supplement on failure', () => {
+        expect(formatCriticFooter({ pass: true, issues: [], supplement: '' })).toBe('');
+        const f = formatCriticFooter({
+            pass: false,
+            issues: [{ severity: 'major', description: '결정과 미결이 섞임' }],
+            supplement: '## 보완',
+        });
+        expect(f).toContain('검수 (Critic)');
+        expect(f).toContain('결정과 미결이 섞임');
+        expect(f).toContain('보완 제안');
+    });
+});
+
+describe('reflectionStore', () => {
+    it('append → load round trip', () => {
+        const brain = tmpBrain();
+        expect(appendReflection(brain, mkReflection({ missing: ['기한'] }))).toBe(true);
+        expect(appendReflection(brain, mkReflection({ missing: ['기한', '담당자'] }))).toBe(true);
+        const records = loadReflections(brain);
+        expect(records.length).toBe(2);
+        expect(records[1].missing).toEqual(['기한', '담당자']);
+    });
+
+    it('summarizeFailurePatterns: counts recurring misses (most frequent first)', () => {
+        const records = [
+            mkReflection({ missing: ['기한'] }),
+            mkReflection({ missing: ['기한'] }),
+            mkReflection({ missing: ['기한', '담당자'] }),
+        ];
+        const patterns = summarizeFailurePatterns(records);
+        expect(patterns[0]).toMatchObject({ element: '기한', count: 3 });
+        expect(patterns[1]).toMatchObject({ element: '담당자', count: 1 });
+    });
+
+    it('recurrentMisses: returns only elements at or above the threshold', () => {
+        const records = [
+            mkReflection({ missing: ['기한'] }),
+            mkReflection({ missing: ['기한'] }),
+            mkReflection({ missing: ['기한'] }),
+            mkReflection({ missing: ['담당자'] }),
+        ];
+        expect(recurrentMisses(records, 'meeting-minutes', 3)).toEqual(['기한']);
+        expect(recurrentMisses(records, 'market-research', 3)).toEqual([]);
+    });
+
+    it('recurrently missed elements are highlighted in the Requirement Graph block (T5 loop)', () => {
+        const block = buildRequirementGraphBlock('회의록 정리해줘', undefined, ['기한']);
+        expect(block).toContain('과거에 자주 누락된 요소');
+    });
+
+    it('formatGrowthReport: weekly trend table plus top recurring mistakes', () => {
+        const records = [
+            mkReflection({ ts: '2026-06-01T10:00:00.000Z', confidenceScore: 60, missing: ['기한'] }),
+            mkReflection({ ts: '2026-06-09T10:00:00.000Z', confidenceScore: 85, missing: [] }),
+        ];
+        const md = formatGrowthReport(records);
+        expect(md).toContain('평균 확신도');
+        expect(md).toContain('기한');
+        expect(formatGrowthReport([])).toContain('기록 없음');
+    });
+});
+
+describe('taskEvalHarness', () => {
+    const record: TaskGoldenRecord = {
+        id: 'mm-test',
+        query: '이 회의 내용을 회의록으로 정리해줘',
+        sourceFile: 'fake.txt',
+        expectedElements: ['참석자', '결정사항', '액션 아이템', '담당자', '기한'],
+        reference: 'ref',
+    };
+
+    it('golden set loading: handles comment lines and malformed lines', () => {
+        const brain = tmpBrain();
+        const dir = path.join(brain, TASK_GOLDEN_DIR);
+        fs.mkdirSync(dir, { recursive: true });
+        fs.writeFileSync(path.join(dir, 'meeting-minutes.golden.jsonl'), [
+            '// 주석',
+            JSON.stringify(record),
+            '{broken',
+            '',
+        ].join('\n'), 'utf8');
+        const { records, parseErrors } = loadTaskGoldenSet(brain);
+        expect(records.length).toBe(1);
+        expect(parseErrors).toBe(1);
+        expect(records[0].id).toBe('mm-test');
+    });
+
+    it('scoreTaskAnswer: scores coverage, honesty, and structure', () => {
+        const answer = '# 회의록\n## 참석자: 김OO\n## 결정사항: A안\n## 액션 아이템\n- 발송 (담당자: 김OO, (기한 미정))';
+        const s = scoreTaskAnswer(answer, record);
+        expect(s.coverageRate).toBe(1);
+        expect(s.honestyMarkers).toBeGreaterThanOrEqual(1);
+        expect(s.sectionCount).toBeGreaterThanOrEqual(3);
+    });
+
+    it('runTaskEval: a generation failure does not block the run and is kept as an error record', async () => {
+        const result = await runTaskEval({
+            records: [record, { ...record, id: 'mm-fail' }],
+            readSource: () => '전사 내용',
+            generate: async (r) => {
+                if (r.id === 'mm-fail') throw new Error('engine down');
+                return '## 참석자 a ## 결정사항 b ## 액션 아이템 c 담당자 d 기한 e';
+            },
+        });
+        expect(result.scores.length).toBe(2);
+        expect(result.scores[0].coverageRate).toBe(1);
+        expect(result.scores[1].error).toContain('engine down');
+        expect(result.avgCoverage).toBe(1); // failed records are excluded from the average
+    });
+
+    it('formatTaskEvalReport: includes summary and table', () => {
+        const md = formatTaskEvalReport(
+            { scores: [scoreTaskAnswer('참석자 결정사항', record)], avgCoverage: 0.4, perfectCount: 0 },
+            { taskLabel: '회의록', brainName: 'B', dateStr: 'now', modelName: 'gemma' },
+        );
+        expect(md).toContain('평균 요소 커버리지');
+        expect(md).toContain('mm-test');
+    });
+});
@@ -0,0 +1,174 @@
+/**
+ * Tests for Knowledge Validation / Belief Revision / Decay / Debt
+ * (Self-Evolving OS Phase 4: knowledge operations).
+ */
+import {
+    validateKnowledgeCandidate,
+    jaccardSimilarity,
+    ExistingKnowledgeRef,
+} from '../src/intelligence/knowledgeValidation';
+import {
+    auditKnowledgeDecay,
+    classifyDecayRule,
+    decayFactor,
+    formatDecayReport,
+} from '../src/intelligence/knowledgeDecay';
+import { computeKnowledgeDebt, formatNeedsMarkdown } from '../src/intelligence/needEngine';
+import type { ReflectionRecord } from '../src/intelligence/reflectionStore';
+
+const NOW = Date.parse('2026-06-11T00:00:00Z');
+const DAY = 86400000;
+
+describe('knowledgeValidation', () => {
+    const existing: ExistingKnowledgeRef[] = [
+        {
+            title: 'GA4 전환율 가이드',
+            content: 'GA4 전환율 계산은 전환수 나누기 세션수 기준이며 보고서는 탐색 분석에서 본다',
+            lastUpdated: NOW - 100 * DAY,
+        },
+    ];
+
+    it('fresh candidate with a source and no duplicate/conflict → accept', () => {
+        const r = validateKnowledgeCandidate(
+            { title: '쿠팡 SEO', content: '쿠팡 검색 알고리즘은 판매량과 리뷰 점수를 핵심 신호로 사용한다', source: 'https://example.com', collectedAt: '2026-06-10T00:00:00Z' },
+            existing, { nowMs: NOW },
+        );
+        expect(r.verdict).toBe('accept');
+        expect(r.beliefRevision).toBe('add');
+    });
+
+    it('no source means no automatic acceptance (review)', () => {
+        const r = validateKnowledgeCandidate(
+            { title: 't', content: '완전히 새로운 내용의 지식 후보입니다 검증 테스트' },
+            existing, { nowMs: NOW },
+        );
+        expect(r.verdict).toBe('review');
+        expect(r.checks.hasSource).toBe(false);
+    });
+
+    it('near-identical content → rejected as a duplicate', () => {
+        const r = validateKnowledgeCandidate(
+            { title: 'GA4', content: 'GA4 전환율 계산은 전환수 나누기 세션수 기준이며 보고서는 탐색 분석에서 본다', source: 's', collectedAt: '2026-06-10T00:00:00Z' },
+            existing, { nowMs: NOW },
+        );
+        expect(r.verdict).toBe('reject');
+        expect(r.checks.duplicateOf).toBe('GA4 전환율 가이드');
+    });
+
+    it('related/conflicting and the candidate is newer → review with an update recommendation (Belief Revision)', () => {
+        const r = validateKnowledgeCandidate(
+            { title: 'GA4 변경', content: 'GA4 전환율 계산은 이제 전환수 나누기 사용자수 기준으로 변경되었다 보고서 위치도 다르다', source: 's', collectedAt: '2026-06-01T00:00:00Z' },
+            existing, { nowMs: NOW },
+        );
+        expect(r.verdict).toBe('review');
+        expect(r.checks.conflictsWith).toBe('GA4 전환율 가이드');
+        expect(r.beliefRevision).toBe('update');
+    });
+
+    it('collected more than a year ago → stale, review', () => {
+        const r = validateKnowledgeCandidate(
+            { title: 't', content: '전혀 다른 주제의 오래된 지식 항목', source: 's', collectedAt: '2024-01-01T00:00:00Z' },
+            existing, { nowMs: NOW },
+        );
+        expect(r.verdict).toBe('review');
+        expect(r.checks.freshness).toBe('stale');
+    });
+
+    it('jaccardSimilarity: identical → 1.0, unrelated → ~0', () => {
+        expect(jaccardSimilarity('같은 문장 테스트', '같은 문장 테스트')).toBe(1);
+        expect(jaccardSimilarity('완전히 다른 내용', '전혀 무관한 주제')).toBe(0);
+    });
+});
+
+describe('knowledgeDecay', () => {
+    it('domain classification: AI 30 days, SEO 90 days, default 365 days', () => {
+        expect(classifyDecayRule('Topics/RAG_청킹_전략.md').halfLifeDays).toBe(30);
+        expect(classifyDecayRule('Topics/네이버_SEO_가이드.md').halfLifeDays).toBe(90);
+        expect(classifyDecayRule('Topics/요리_레시피.md').halfLifeDays).toBe(365);
+    });
+
+    it('decayFactor: 0.5 after one half-life', () => {
+        expect(decayFactor(NOW - 30 * DAY, 30, NOW)).toBeCloseTo(0.5, 2);
+        expect(decayFactor(NOW, 30, NOW)).toBe(1);
+    });
+
+    it('audit: sorts stale items first and assigns a status', () => {
+        const items = auditKnowledgeDecay([
+            { relPath: 'ai_guide.md', lastUpdated: NOW - 90 * DAY },   // AI, 30-day half-life → 0.125, stale
+            { relPath: '요리.md', lastUpdated: NOW - 30 * DAY },        // general, 365-day half-life → ~0.94, active
+        ], { nowMs: NOW });
+        expect(items[0].relPath).toBe('ai_guide.md');
+        expect(items[0].status).toBe('stale');
+        expect(items[1].status).toBe('active');
+    });
+
+    it('formatDecayReport: includes summary and recommendations, and states that nothing is deleted automatically', () => {
+        const items = auditKnowledgeDecay([{ relPath: 'ai.md', lastUpdated: NOW - 200 * DAY }], { nowMs: NOW });
+        const md = formatDecayReport(items, { brainName: 'B', dateStr: 'now' });
+        expect(md).toContain('노후 1');
+        expect(md).toContain('자동 이동/삭제 없음');
+    });
+});
+
+describe('knowledgeDebt', () => {
+    function mk(partial: Partial<ReflectionRecord>): ReflectionRecord {
+        return {
+            ts: '2026-06-11T10:00:00.000Z', taskId: 'market-research', taskLabel: '시장조사',
+            confidenceScore: 50, confidenceBand: 'low', missing: [], escalated: false,
+            criticIssues: null, promptPreview: 'p', weakGrounding: true, gapSeverity: 'high',
+            ...partial,
+        };
+    }
+
+    it('aggregates ungrounded turns per task, sorted by debtScore', () => {
+        const debt = computeKnowledgeDebt([
+            mk({}), mk({}), mk({ gapSeverity: 'medium' }),
+            mk({ taskId: 'meeting-minutes', taskLabel: '회의록', gapSeverity: 'low' }),
+            mk({ taskId: 'meeting-minutes', taskLabel: '회의록', weakGrounding: false }), // not debt
+        ]);
+        expect(debt[0].taskId).toBe('market-research');
+        expect(debt[0].blockedTurns).toBe(3);
+        expect(debt[0].impact).toBeGreaterThan(5);
+        expect(debt.find((d) => d.taskId === 'meeting-minutes')!.blockedTurns).toBe(1);
+    });
+
+    it('formatNeedsMarkdown includes the Debt section', () => {
+        const debt = computeKnowledgeDebt([mk({})]);
+        const md = formatNeedsMarkdown([], [], debt);
+        expect(md).toContain('Knowledge Debt');
+        expect(md).toContain('시장조사');
+    });
+});
+
+describe('orgMemoryBlock (P5)', () => {
+    const fsMod = require('fs');
+    const osMod = require('os');
+    const pathMod = require('path');
+    const { buildOrgMemoryBlock, ORG_MEMORY_REL_PATH } = require('../src/intelligence/orgMemoryBlock');
+
+    it('injects the block when organization.md exists and states Human Override', () => {
+        const brain = fsMod.mkdtempSync(pathMod.join(osMod.tmpdir(), 'astra-test-org-'));
+        const file = pathMod.join(brain, ORG_MEMORY_REL_PATH);
+        fsMod.mkdirSync(pathMod.dirname(file), { recursive: true });
+        fsMod.writeFileSync(file, '## 업무 방식\n- 속도 우선, 완벽주의 지양', 'utf8');
+        const block = buildOrgMemoryBlock(brain);
+        expect(block).toContain('[ORGANIZATIONAL MEMORY]');
+        expect(block).toContain('속도 우선');
+        expect(block).toContain('사용자 지시 우선');
+    });
+
+    it('returns an empty string when the file is missing (no-op)', () => {
+        const brain = fsMod.mkdtempSync(pathMod.join(osMod.tmpdir(), 'astra-test-org-'));
+        expect(buildOrgMemoryBlock(brain)).toBe('');
+    });
+
+    it('caps a long body and adds a truncation notice', () => {
+        const brain = fsMod.mkdtempSync(pathMod.join(osMod.tmpdir(), 'astra-test-org-'));
+        const file = pathMod.join(brain, ORG_MEMORY_REL_PATH);
+        fsMod.mkdirSync(pathMod.dirname(file), { recursive: true });
+        fsMod.writeFileSync(file, 'x'.repeat(5000), 'utf8');
+        const block = buildOrgMemoryBlock(brain, { maxBodyLength: 1000 });
+        expect(block).toContain('잘림');
+        expect(block.length).toBeLessThan(2000);
+    });
+});
@@ -0,0 +1,159 @@
+/**
+ * Tests for Gap Detector / Need Engine / Knowledge Inventory / Learning Queue
+ * (Self-Evolving OS Phase 3: growth loop core).
+ */
+import * as fs from 'fs';
+import * as os from 'os';
+import * as path from 'path';
+import { detectGaps } from '../src/intelligence/gapDetector';
+import { computeNeeds, knowledgeInventory, formatNeedsMarkdown, NEED_WEIGHTS } from '../src/intelligence/needEngine';
+import {
+    loadQueue,
+    saveQueue,
+    mergeNeedsIntoQueue,
+    formatQueueMarkdown,
+    QueueItem,
+} from '../src/intelligence/learningQueue';
+import type { ReflectionRecord } from '../src/intelligence/reflectionStore';
+
+function mkReflection(partial: Partial<ReflectionRecord>): ReflectionRecord {
+    return {
+        ts: '2026-06-11T10:00:00.000Z',
+        taskId: 'meeting-minutes',
+        taskLabel: '회의록',
+        confidenceScore: 70,
+        confidenceBand: 'medium',
+        missing: [],
+        escalated: false,
+        criticIssues: null,
+        promptPreview: 'p',
+        retrieval: { chunkCount: 3, topScore: 0.6 },
+        weakGrounding: false,
+        ...partial,
+    };
+}
+
+describe('detectGaps', () => {
+    const okSignals = { chunkCount: 4, topScore: 0.7, conflictCount: 0, ambiguityDetected: false };
+    const noGrounding = { chunkCount: 0, topScore: 0, conflictCount: 0, ambiguityDetected: false };
+
+    it('3 or more missing elements → high', () => {
+        const g = detectGaps({
+            coverage: { ran: true, taskId: 'meeting-minutes', taskLabel: '회의록', covered: [], missing: ['참석자', '담당자', '기한'] },
+            signals: okSignals, taskId: 'meeting-minutes',
+        });
+        expect(g.severity).toBe('high');
+        expect(g.summary).toContain('3개 누락');
+    });
+
+    it('zero grounding alone → low; a high-impact task plus a miss bumps severity one level', () => {
+        const clean = detectGaps({
+            coverage: { ran: false, covered: [], missing: [] },
+            signals: noGrounding, taskId: null,
+        });
+        expect(clean.severity).toBe('low');
+        expect(clean.weakGrounding).toBe(true);
+
+        const worse = detectGaps({
+            coverage: { ran: true, taskId: 'meeting-minutes', taskLabel: '회의록', covered: [], missing: ['기한'] },
+            signals: noGrounding, taskId: 'meeting-minutes',
+        });
+        expect(worse.severity).toBe('high'); // medium (1 miss) + bump for a high-impact task with no grounding
+    });
+
+    it('no gaps → none', () => {
+        const g = detectGaps({
+            coverage: { ran: true, taskId: 'meeting-minutes', taskLabel: '회의록', covered: ['참석자'], missing: [] },
+            signals: okSignals, taskId: 'meeting-minutes',
+        });
+        expect(g.severity).toBe('none');
+        expect(g.summary).toBe('갭 없음');
+    });
+});
+
+describe('computeNeeds', () => {
+    it('tasks with weak grounding and many misses score higher', () => {
+        const records: ReflectionRecord[] = [
+            // meeting minutes: 3 clean runs
+            mkReflection({}), mkReflection({}), mkReflection({}),
+            // market research: 2 runs with no grounding, misses, and low confidence
+            mkReflection({ taskId: 'market-research', taskLabel: '시장조사', weakGrounding: true, missing: ['출처', '시장 규모'], confidenceScore: 40, retrieval: { chunkCount: 0, topScore: 0 } }),
+            mkReflection({ taskId: 'market-research', taskLabel: '시장조사', weakGrounding: true, missing: ['출처'], confidenceScore: 45, retrieval: { chunkCount: 0, topScore: 0 } }),
+        ];
+        const needs = computeNeeds(records);
+        expect(needs[0].taskId).toBe('market-research');
+        expect(needs[0].score).toBeGreaterThan(needs[1].score);
+        expect(needs[0].topMisses).toContain('출처');
+        expect(needs[0].reason).toContain('누락');
+    });
+
+    it('weights sum to 1', () => {
+        const sum = Object.values(NEED_WEIGHTS).reduce((s, w) => s + w, 0);
+        expect(sum).toBeCloseTo(1.0);
+    });
+
+    it('no records → empty array plus a notice in the markdown', () => {
+        expect(computeNeeds([])).toEqual([]);
+        expect(formatNeedsMarkdown([], [])).toContain('기록 없음');
+    });
+});
+
+describe('knowledgeInventory', () => {
+    it('classifies sufficient/partial/missing from average grounding', () => {
+        const records: ReflectionRecord[] = [
+            mkReflection({ retrieval: { chunkCount: 5, topScore: 0.8 } }),
+            mkReflection({ taskId: 'market-research', taskLabel: '시장조사', retrieval: { chunkCount: 0, topScore: 0 } }),
+            mkReflection({ taskId: 'work-research', taskLabel: '업무조사', retrieval: { chunkCount: 1, topScore: 0.3 } }),
+        ];
+        const inv = knowledgeInventory(records);
+        const byId = new Map(inv.map((i) => [i.taskId, i.status]));
+        expect(byId.get('meeting-minutes')).toBe('sufficient');
+        expect(byId.get('market-research')).toBe('missing');
+        expect(byId.get('work-research')).toBe('partial');
+    });
+});
+
+describe('learningQueue', () => {
+    const needs = computeNeeds([
+        mkReflection({ taskId: 'market-research', taskLabel: '시장조사', weakGrounding: true, missing: ['출처'], confidenceScore: 40 }),
+    ]);
+
+    it('save → load round trip, saved in priority order', () => {
+        const brain = fs.mkdtempSync(path.join(os.tmpdir(), 'astra-test-queue-'));
+        const queue = mergeNeedsIntoQueue([], needs, '2026-06-11T00:00:00.000Z');
+        expect(saveQueue(brain, queue)).toBe(true);
+        const loaded = loadQueue(brain);
+        expect(loaded.length).toBe(1);
+        expect(loaded[0].status).toBe('proposed');
+        expect(loaded[0].topic).toContain('시장조사');
+    });
+
+    it('proposed items are refreshed but approved items are immutable (Permission Based Learning)', () => {
+        const approved: QueueItem = {
+            id: 'need-market-research', topic: '시장조사 역량 보강', priority: 10, reason: '이전',
+            status: 'approved', createdAt: 'a', updatedAt: 'a',
+        };
+        const merged = mergeNeedsIntoQueue([approved], needs, '2026-06-11T00:00:00.000Z');
+        expect(merged.length).toBe(1);
+        expect(merged[0].status).toBe('approved');
+        expect(merged[0].priority).toBe(10); // not overwritten by the Need score
+        expect(merged[0].reason).toBe('이전');
+    });
+
+    it('new topics are added as proposed', () => {
+        const other: QueueItem = {
+            id: 'need-schedule', topic: '일정', priority: 5, reason: 'r',
+            status: 'done', createdAt: 'a', updatedAt: 'a',
+        };
+        const merged = mergeNeedsIntoQueue([other], needs, 'now');
+        expect(merged.length).toBe(2);
+        expect(merged.find((q) => q.id === 'need-market-research')?.status).toBe('proposed');
+        expect(merged.find((q) => q.id === 'need-schedule')?.status).toBe('done'); // unchanged
+    });
+
+    it('formatQueueMarkdown: includes approval guidance', () => {
+        const md = formatQueueMarkdown(mergeNeedsIntoQueue([], needs, 'now'));
+        expect(md).toContain('approved');
+        expect(md).toContain('시장조사');
+    });
+});
@@ -0,0 +1,126 @@
+/**
+ * Unit tests for Requirement Graph (Self-Evolving OS Phase 1 / Track 2-1).
+ * Covers pure functions only; no vscode dependency.
+ */
+import {
+    DEFAULT_TASK_REQUIREMENTS,
+    detectTaskType,
+    buildRequirementGraphBlock,
+    checkRequirementCoverage,
+    formatRequirementCoverageFooter,
+} from '../src/intelligence/requirementGraph';
+
+describe('detectTaskType', () => {
+    it('detects meeting-minutes requests', () => {
+        expect(detectTaskType('오늘 주간회의 내용 회의록으로 정리해줘')?.id).toBe('meeting-minutes');
+        expect(detectTaskType('어제 미팅 노트 만들어줘')?.id).toBe('meeting-minutes');
+    });
+
+    it('detects market-research requests', () => {
+        expect(detectTaskType('전기차 충전 인프라 시장조사 해줘')?.id).toBe('market-research');
+        expect(detectTaskType('국내 로봇청소기 시장 규모 분석 부탁해')?.id).toBe('market-research');
+    });
+
+    it('detects schedule-management requests', () => {
+        expect(detectTaskType('내일 3시에 미팅 잡아줘')?.id).toBe('schedule');
+        expect(detectTaskType('이번 주 일정 확인해줘')?.id).toBe('schedule');
+    });
+
+    it('detects generic research requests as work-research (lower priority than market-research)', () => {
+        expect(detectTaskType('MCP 프로토콜에 대해 조사해줘')?.id).toBe('work-research');
+    });
+
+    it('returns null for small talk and empty input', () => {
+        expect(detectTaskType('안녕! 오늘 기분 어때?')).toBeNull();
+        expect(detectTaskType('')).toBeNull();
+        expect(detectTaskType('   ')).toBeNull();
+    });
+});
+
+describe('buildRequirementGraphBlock', () => {
+    it('meeting-minutes block contains all 5 required elements', () => {
+        const block = buildRequirementGraphBlock('회의록 정리해줘');
+        expect(block).toContain('[TASK REQUIREMENTS — 회의록]');
+        for (const label of ['참석자', '결정사항', '액션 아이템', '담당자', '기한']) {
+            expect(block).toContain(label);
+        }
+        expect(block).toContain('(확인 필요)'); // instruction that forbids silent omission
+        expect(block).toContain('[/TASK REQUIREMENTS]');
+    });
+
+    it('returns an empty string when no task type is detected (dropped automatically by the dynamicBlocks join)', () => {
+        expect(buildRequirementGraphBlock('고마워!')).toBe('');
+    });
+});
+
+describe('checkRequirementCoverage', () => {
+    const fullMinutes = [
+        '# 주간회의 회의록',
+        '## 참석자: 김OO, 이OO',
+        '## 결정사항: A안 채택',
+        '## 액션 아이템',
+        '- 견적서 발송 (담당자: 김OO, 기한: 6월 20일까지)',
+    ].join('\n');
+
+    it('missing is an empty array when every element is present', () => {
+        const r = checkRequirementCoverage('회의록 정리해줘', fullMinutes);
+        expect(r.ran).toBe(true);
+        expect(r.taskId).toBe('meeting-minutes');
+        expect(r.missing).toEqual([]);
+    });
+
+    it('detects missing owner and deadline elements', () => {
+        const partial = '# 회의록\n참석자: 김OO\n결정사항: A안 채택\n액션 아이템: 견적서 발송';
+        const r = checkRequirementCoverage('회의록 정리해줘', partial);
+        expect(r.ran).toBe(true);
+        expect(r.missing).toContain('담당자');
+        expect(r.missing).toContain('기한');
+        expect(r.covered).toContain('참석자');
+    });
+
+    it('skips tasks with coverageCheck=false (schedule)', () => {
+        const r = checkRequirementCoverage('내일 3시 미팅 잡아줘', '등록했습니다.');
+        expect(r.ran).toBe(false);
+    });
+
+    it('ran=false when no task type is detected or the answer is blank', () => {
+        expect(checkRequirementCoverage('안녕', '반가워요').ran).toBe(false);
+        expect(checkRequirementCoverage('회의록 정리해줘', '   ').ran).toBe(false);
+    });
+});
+
+describe('formatRequirementCoverageFooter', () => {
+    it('footer shows the task name and the missing elements when something is missing', () => {
+        const footer = formatRequirementCoverageFooter({
+            ran: true, taskId: 'meeting-minutes', taskLabel: '회의록',
+            covered: ['참석자'], missing: ['담당자', '기한'],
+        });
+        expect(footer).toContain('회의록');
+        expect(footer).toContain('담당자, 기한');
+    });
+
+    it('returns an empty string when fully covered or not run (avoids noise)', () => {
+        expect(formatRequirementCoverageFooter({ ran: true, covered: ['참석자'], missing: [] })).toBe('');
+        expect(formatRequirementCoverageFooter({ ran: false, covered: [], missing: [] })).toBe('');
+    });
+});
+
+describe('DEFAULT_TASK_REQUIREMENTS integrity', () => {
+    it('all detectKeywords / detectPatterns form valid regular expressions', () => {
+        for (const req of DEFAULT_TASK_REQUIREMENTS) {
+            expect(() => new RegExp(req.detectKeywords.join('|'), 'iu')).not.toThrow();
+            for (const el of req.elements) {
+                expect(() => new RegExp(el.detectPatterns.join('|'), 'iu')).not.toThrow();
+            }
+        }
+    });
+
+    it('task IDs and element IDs are unique', () => {
+        const taskIds = DEFAULT_TASK_REQUIREMENTS.map((r) => r.id);
+        expect(new Set(taskIds).size).toBe(taskIds.length);
+        for (const req of DEFAULT_TASK_REQUIREMENTS) {
+            const ids = req.elements.map((e) => e.id);
+            expect(new Set(ids).size).toBe(ids.length);
+        }
+    });
+});
@@ -0,0 +1,122 @@
+/**
+ * Tests for Research Agent / Skill Score / Success Pattern DB (Self-Evolving OS Phase 6).
+ */
+import * as fs from 'fs';
+import * as os from 'os';
+import * as path from 'path';
+import { parseBrief, fallbackBrief, runResearch, formatProposalMarkdown } from '../src/intelligence/researchAgent';
+import {
+    computeSkillScores,
+    formatSkillScoresMarkdown,
+    isSuccessTurn,
+    appendSuccessPattern,
+    loadSuccessPatterns,
+} from '../src/intelligence/skillScore';
+import type { QueueItem } from '../src/intelligence/learningQueue';
+import type { ReflectionRecord } from '../src/intelligence/reflectionStore';
+
+const ITEM: QueueItem = {
+    id: 'need-market-research', topic: '시장조사 역량 보강', priority: 60, reason: '근거 없는 수행 다수',
+    status: 'approved', createdAt: 'a', updatedAt: 'a',
+};
+
+function mk(partial: Partial<ReflectionRecord>): ReflectionRecord {
+    return {
+        ts: '2026-06-11T10:00:00.000Z', taskId: 'meeting-minutes', taskLabel: '회의록',
+        confidenceScore: 80, confidenceBand: 'medium', missing: [], escalated: false,
+        criticIssues: null, promptPreview: '회의록 정리해줘', usedSources: ['회의기록.md'],
+        ...partial,
+    };
+}
+
+describe('researchAgent', () => {
+    it('parseBrief: parses JSON mixed with chatter; falls back on failure', () => {
+        const ok = parseBrief('계획: {"questions":["q1","q2"],"keywords":["k"],"sourceTypes":["공식 문서"]} 끝');
+        expect(ok!.questions).toEqual(['q1', 'q2']);
+        expect(parseBrief('JSON 없음')).toBeNull();
+        expect(fallbackBrief('주제').questions.length).toBeGreaterThan(0);
+    });
+
+    it('runResearch: brief → internal refs → draft → Validation gate (no source = review)', async () => {
+        const calls: string[] = [];
+        const pkg = await runResearch({
+            item: ITEM,
+            fetchInternalRefs: async () => [{ title: '기존문서', content: '기존 시장조사 노트 내용', filePath: 'a.md' }],
+            callLlm: async (system) => {
+                calls.push(system.slice(0, 20));
+                if (system.includes('조사 계획')) {
+                    return '{"questions":["시장 규모 출처는?"],"keywords":["시장조사"],"sourceTypes":["공식 통계"]}';
+                }
+                return '## 시장 규모\n일반적으로 통계청 자료를 쓴다 (모델 지식 — 추정, 출처 확인 필요)';
+            },
+            nowIso: '2026-06-11T00:00:00.000Z',
+        });
+        expect(pkg.brief.questions[0]).toContain('시장 규모');
+        expect(pkg.internalRefs.length).toBe(1);
+        expect(pkg.draft).toContain('추정');
+        // No source, so automatic acceptance is not allowed (Permission Based Learning gate).
+        expect(pkg.validation.verdict).toBe('review');
+        expect(pkg.validation.checks.hasSource).toBe(false);
+        expect(calls.length).toBe(2);
+    });
+
+    it('still builds a package from the fallback brief when every LLM call fails', async () => {
+        const pkg = await runResearch({
+            item: ITEM,
+            fetchInternalRefs: async () => [],
+            callLlm: async () => { throw new Error('down'); },
+            nowIso: '2026-06-11T00:00:00.000Z',
+        });
+        expect(pkg.brief.questions.length).toBeGreaterThan(0);
+        expect(pkg.draft).toContain('실패');
+    });
+
+    it('formatProposalMarkdown: includes verdict, brief, and next steps', async () => {
+        const pkg = await runResearch({
+            item: ITEM, fetchInternalRefs: async () => [],
+            callLlm: async () => '{"questions":["q"],"keywords":["k"],"sourceTypes":["s"]}',
+            nowIso: '2026-06-11T00:00:00.000Z',
+        });
+        const md = formatProposalMarkdown(pkg, { dateStr: 'now', modelName: 'gemma' });
+        expect(md).toContain('검증 판정: review');
+        expect(md).toContain('/research');
+        expect(md).toContain('done 으로 변경');
+    });
+});
+
+describe('skillScore', () => {
+    it('weighted sum of confidence, coverage rate, and non-escalation, plus trend', () => {
+        const records = [
+            mk({ ts: '2026-06-01T10:00:00Z', confidenceScore: 50, missing: ['기한'], escalated: true }),
+            mk({ ts: '2026-06-02T10:00:00Z', confidenceScore: 55, missing: ['기한'] }),
+            mk({ ts: '2026-06-08T10:00:00Z', confidenceScore: 90, missing: [] }),
+            mk({ ts: '2026-06-09T10:00:00Z', confidenceScore: 95, missing: [] }),
+        ];
+        const scores = computeSkillScores(records);
+        expect(scores.length).toBe(1);
+        expect(scores[0].trend).toBe('up');
+        expect(scores[0].secondHalf).toBeGreaterThan(scores[0].firstHalf);
+        const md = formatSkillScoresMarkdown(scores);
+        expect(md).toContain('상승');
+    });
+
+    it('trend is flat with fewer than 4 samples', () => {
+        const scores = computeSkillScores([mk({}), mk({ confidenceScore: 20 })]);
+        expect(scores[0].trend).toBe('flat');
+    });
+
+    it('isSuccessTurn: only when all elements are covered and confidence is 90+', () => {
+        expect(isSuccessTurn(mk({ confidenceScore: 92, missing: [] }))).toBe(true);
+        expect(isSuccessTurn(mk({ confidenceScore: 92, missing: ['기한'] }))).toBe(false);
+        expect(isSuccessTurn(mk({ confidenceScore: 80, missing: [] }))).toBe(false);
+    });
+
+    it('append → load round trip for success patterns (only success turns are stored)', () => {
+        const brain = fs.mkdtempSync(path.join(os.tmpdir(), 'astra-test-sp-'));
+        expect(appendSuccessPattern(brain, mk({ confidenceScore: 95, missing: [] }))).toBe(true);
+        expect(appendSuccessPattern(brain, mk({ confidenceScore: 50 }))).toBe(false);
+        const patterns = loadSuccessPatterns(brain);
+        expect(patterns.length).toBe(1);
+        expect(patterns[0].usedSources).toEqual(['회의기록.md']);
+    });
+});