connectai/docs/plans/alignment-self-learning-plan.md

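# Alignment Self-Learning Improvement Plan (v2, incorporating adversarial review)

## v1 → v2 Changes (addressing review findings)
1. **[❌→Resolved] Webview rendering**: 2-C is promoted from "recommended" to a required implementation. Insert it immediately after the openQuestions render block (~1228) in media/sidebar.js. Filter `c.answeredQuestions` by the self-research marker and show at most 3 entries.
2. **[❌→Resolved] Full text of the `selfAnswerQuestions` system prompt specified** (added to 2-A below).
3. **[❌→Resolved] `companyAlignmentKnowledgeSave` package.json spec added** (key name, default, and description stated in 3-C).
4. **[⚠→Resolved] Contract mutation → non-destructive change**: do not mutate `analysis.contract` directly; build a new object with spread (`enrichedContract`). Pass the enriched object to every downstream path (payload, `_alignment.set`, `_runCompanyTurn`).
5. **[⚠→Resolved] Self-research marker clarified**: `(자가 조사) ` ("(self-research)") → `(자가 조사로 두뇌에서 확인) ` ("(confirmed in the brain via self-research)"). The meaning is embedded in the text so the dispatcher's LLM can tell these entries apart from answers the user gave directly. No change to formatContractForPrompt is needed.
6. **[⚠→Verified] File write permission**: the existing lessons.ts (lines 64, 82) already writes directly into the brain folder with fs.writeFileSync. The extension host has no local fs permission restrictions, so this is not a risk.
7. **[⚠→No impact] Alignment 'off' mode**: self-research runs only inside `_runIntentAlignment`. If alignment itself does not run, self-research is inactive too. No extra handling is needed.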

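## Goals
In one-person company mode, the Intent Alignment (request analysis) stage should:
1. **Look at the project context first**, before asking anything (Phase 1)
2. **Search the brain on its own** for answers to items that are still missing (Phase 2)
3. Ask the user only about what it truly cannot determine, and **save the answers it receives to the brain** so it does not ask again (Phase 3)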

## Confirmed Code Facts
- The `analyzeIntent` input (`IntentAnalysisInput`) carries no project or brain information at all. [intentAlignment.ts:51-74]
- `_buildProjectArchitectureContext()` already exists and formats `.astra/project-context/architecture.md` with a 16,000-character cap. [sidebarProvider.ts:1781, architecturePayloads.ts:97]
- The analyzer model is `companyIntentClassifierModel || defaultModel` (a small model). [sidebarProvider.ts:1942]
- Brain search building blocks: `findBrainFiles` (utils, 5s cache) + `getBrainTokenIndex` + `tokenize/expandQuery/scoreTfIdfPreTokenized` + `extractBestExcerpt`. [retrieval/]
- `RetrievalOrchestrator` depends on `MemoryManager`, which makes it too heavy to use directly from alignment → compose the building blocks directly
- The webview alignment card renders only `openQuestions`; `answeredQuestions` is not displayed. [media/sidebar.js:1209]
- The Lesson system (kind: qa-finding) exists for "mistake avoidance" and has different semantics → **save as plain notes** (no lesson frontmatter)
- `analyzeIntent` follows a non-throwing pattern (low-confidence fallback on failure); new code follows the same pattern
- Alignment flow: `_runIntentAlignment` → `shouldAutoProceedAlignment` decision → auto-proceed or show the card → `_handleAlignmentAnswer` (answer round) / `_proceedWithCurrentAlignment` (proceed button)

---

## Phase 1: Project Context Injection

### 1-A. `src/features/company/intentAlignment.ts`
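- Add `projectContext?: string` to `IntentAnalysisInput` (with a doc comment)
- `_buildUserMessage`: when projectContext is present, append this block:
  ```
  [Project context: collected automatically from the current workspace]
  Below is an architecture summary of the currently open project. Treat facts that
  can be confirmed directly from it (what the project is, its tech stack, its
  structure) as already known, reflect them in the context slot, and do not put
  them back into openQuestions.
  ---
  <content>
  ---
  ```
- Add one rule line to `SYSTEM_PROMPT`: "If a [Project context] block is present, fill context from it, and do not create openQuestions for anything it already answers."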
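
As a rough illustration of the 1-A change, the sketch below shows how an optional `projectContext` field could be folded into the user message. The real `IntentAnalysisInput` and `_buildUserMessage` are not shown in this plan, so `IntentAnalysisInputSketch`, `buildUserMessageSketch`, and `PROJECT_CONTEXT_HEADER` are hypothetical names, and the header wording is only illustrative.

```typescript
// Sketch only: the real IntentAnalysisInput and _buildUserMessage are not shown in this plan.
interface IntentAnalysisInputSketch {
    userPrompt: string;
    /** Architecture summary collected from the workspace; undefined when unavailable. */
    projectContext?: string;
}

// Illustrative wording; the production prompt text may differ.
const PROJECT_CONTEXT_HEADER =
    '[Project context: collected automatically from the current workspace]';

function buildUserMessageSketch(input: IntentAnalysisInputSketch): string {
    const parts: string[] = [input.userPrompt];
    const ctx = (input.projectContext ?? '').trim();
    if (ctx.length > 0) {
        // Delimit the block so the model can tell context apart from the request.
        parts.push([PROJECT_CONTEXT_HEADER, '---', ctx, '---'].join('\n'));
    }
    return parts.join('\n\n');
}
```

An empty or whitespace-only context produces the same message as before the change, which keeps existing inputs untouched.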

### 1-B. `src/sidebarProvider.ts` `_runIntentAlignment`
- First round only (`!opts.previousContract`), following the same pattern as priorChatSummary:
  ```ts
  let projectContext: string | undefined;
  if (!opts.previousContract) {
      try {
          const arch = this._buildProjectArchitectureContext();
          if (arch) {
              // Re-truncate the 16,000-character payload to 3,000 for the small analyzer model.
              projectContext = arch.length > 3000
                  ? arch.slice(0, 3000) + '\n…(truncated; see architecture.md for the full text)'
                  : arch;
          }
      } catch { /* alignment continues regardless */ }
  }
  ```
- Pass `projectContext` into the `analyzeIntent` input
- Note: for users who detached the architecture (autoAttach=false) this is an empty string, so Phase 1 has no effect (intended behavior)

---

## Phase 2: Self-Research Pass

### 2-A. New file `src/features/company/alignmentResearch.ts`
```ts
// v2 marker: "(confirmed in the brain via self-research)"
export const SELF_RESEARCH_PREFIX = '(자가 조사로 두뇌에서 확인) ';

export interface QuestionEvidence {
    question: string;
    excerpts: Array<{ title: string; relativePath: string; excerpt: string }>;
}

// Brain TF-IDF search: excerpts from the top 2 files per question. Total capped at 4,000 characters.
// Missing brainPath / empty brain / error → empty excerpts (must not throw)
export function gatherEvidenceForQuestions(
    brainPath: string,
    questions: string[],
): QuestionEvidence[]

// Single LLM call: decide which questions can be answered from the evidence alone.
// JSON: { answers: [{ question, status: 'answered'|'unanswered', answer }] }
// 4-stage tolerant parser (replicates the intentAlignment pattern). On failure, all unanswered.
export async function selfAnswerQuestions(
    ai: IAIService,
    input: { userPrompt: string; evidence: QuestionEvidence[]; model?: string },
): Promise<Array<{ question: string; answered: boolean; answer: string }>>
```
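
The tolerant parsing mentioned for `selfAnswerQuestions` could look roughly like the sketch below. The four stages of the existing intentAlignment parser are not listed in this plan, so the three fallbacks used here (raw text, fenced block, outermost braces) are assumptions, and `parseSelfAnswersSketch` is a hypothetical name. What it does take from the plan is the JSON shape and the rule that any failure yields all questions unanswered.

```typescript
interface SelfAnswer { question: string; answered: boolean; answer: string }

// Markdown code fence marker, built indirectly so this block stays easy to embed.
const FENCE = '`'.repeat(3);

function parseSelfAnswersSketch(raw: string, questions: string[]): SelfAnswer[] {
    // Candidate 1: the raw text. Candidate 2: text inside a code fence.
    // Candidate 3: the outermost {...} span. (Assumed stages, see lead-in.)
    const candidates: string[] = [raw.trim()];
    const open = raw.indexOf(FENCE);
    const close = raw.lastIndexOf(FENCE);
    if (open !== -1 && close > open) {
        // Drop an optional language tag such as "json" right after the opening fence.
        const inner = raw.slice(open + FENCE.length, close);
        candidates.push(inner.replace(/^[a-zA-Z]*\s*/, '').trim());
    }
    const first = raw.indexOf('{');
    const last = raw.lastIndexOf('}');
    if (first !== -1 && last > first) candidates.push(raw.slice(first, last + 1));

    for (const text of candidates) {
        try {
            const parsed = JSON.parse(text) as { answers?: unknown } | null;
            if (!parsed || !Array.isArray(parsed.answers)) continue;
            const byQuestion = new Map<string, string>();
            for (const item of parsed.answers as unknown[]) {
                if (!item || typeof item !== 'object') continue;
                const { question, status, answer } = item as Record<string, unknown>;
                if (typeof question === 'string' && status === 'answered'
                    && typeof answer === 'string' && answer.trim().length > 0) {
                    byQuestion.set(question, answer.trim());
                }
            }
            // Only questions that were actually asked are reported back.
            return questions.map((question) => {
                const answer = byQuestion.get(question);
                return answer !== undefined
                    ? { question, answered: true, answer }
                    : { question, answered: false, answer: '' };
            });
        } catch { /* not valid JSON; try the next candidate */ }
    }
    // Every candidate failed: treat all questions as unanswered.
    return questions.map((question) => ({ question, answered: false, answer: '' }));
}
```

Matching answers back by exact question text is the simplest option; if the model paraphrases a question, that entry falls through to unanswered, which is the safe direction.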

### 2-B. `_runIntentAlignment` integration (includes a reordering)
Insert right after analyzeIntent and **before** the `shouldAutoProceedAlignment` decision:
```ts
// Non-destructive (v2 change 4): never mutate analysis.contract; build enrichedContract via spread.
let enrichedContract = analysis.contract;
const reachedLimit = opts.roundsAsked >= opts.roundsLimit;

// -- Self-research: look for answers in the brain before asking the user --
if (cfg.companyAlignmentSelfResearch !== false
    && enrichedContract.openQuestions.length > 0 && !reachedLimit) {
    try {
        const brain = getActiveBrainProfile();
        const evidence = gatherEvidenceForQuestions(brain.localBrainPath, enrichedContract.openQuestions);
        if (evidence.some((e) => e.excerpts.length > 0)) {
            const answers = await selfAnswerQuestions(new AIService(), {
                userPrompt: opts.userPrompt, evidence,
                model: cfg.companyIntentClassifierModel || cfg.defaultModel,
            });
            const solved = answers.filter((a) => a.answered && a.answer.trim());
            if (solved.length > 0) {
                const solvedSet = new Set(solved.map((s) => s.question));
                enrichedContract = {
                    ...enrichedContract,
                    answeredQuestions: [
                        ...enrichedContract.answeredQuestions,
                        ...solved.map((s) => ({ q: s.question, a: SELF_RESEARCH_PREFIX + s.answer })),
                    ],
                    openQuestions: enrichedContract.openQuestions.filter((q) => !solvedSet.has(q)),
                };
                // pixelOffice log
            }
        }
    } catch { /* on failure, proceed with the original questions */ }
}

// From here on, pass enrichedContract to every path (payload, _alignment.set, _runCompanyTurn).
if (shouldAutoProceedAlignment(...)) { ... } // existing flow: once openQuestions is empty it naturally confirms/proceeds
```
- Consumes no round count (this is not a user response)
- Latency: one extra LLM call, and only when open questions exist and the brain search returns evidence
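
The merge step can also be expressed as a pure function, which makes the non-destructive requirement from the v2 change list easy to unit test. This is a minimal sketch: `ContractSketch` and `applySelfResearch` are hypothetical names, and the real contract type very likely has more fields than the two shown.

```typescript
// Assumed minimal shape; the real contract type is not shown in this plan.
interface ContractSketch {
    openQuestions: string[];
    answeredQuestions: Array<{ q: string; a: string }>;
}

// v2 marker from the change list: "(confirmed in the brain via self-research)".
const SELF_RESEARCH_PREFIX = '(자가 조사로 두뇌에서 확인) ';

function applySelfResearch(
    contract: ContractSketch,
    answers: Array<{ question: string; answered: boolean; answer: string }>,
): ContractSketch {
    const solved = answers.filter((a) => a.answered && a.answer.trim().length > 0);
    // Nothing solved: hand back the same object so callers can skip extra work.
    if (solved.length === 0) return contract;

    const solvedSet = new Set(solved.map((s) => s.question));
    // Spread builds a new object and new arrays; the input contract is left untouched.
    return {
        ...contract,
        answeredQuestions: [
            ...contract.answeredQuestions,
            ...solved.map((s) => ({ q: s.question, a: SELF_RESEARCH_PREFIX + s.answer })),
        ],
        openQuestions: contract.openQuestions.filter((q) => !solvedSet.has(q)),
    };
}
```

Returning the original reference when nothing was solved lets the caller detect "no change" with a simple identity check.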

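### 2-C. webview `media/sidebar.js` (around line ~1209)
- If any `answeredQuestions` entry starts with the self-research marker `(자가 조사로 두뇌에서 확인) `, add a subsection to the card:
  "🔎 Information confirmed on its own" + q/a list (at most 3 entries, a capped at 150 characters)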
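
The selection logic for that subsection might be sketched as follows. It is written in TypeScript for consistency with the rest of the plan, although `media/sidebar.js` is plain JavaScript. `pickSelfResearchedForCard` is a hypothetical name, and stripping the marker before display is an assumption the plan does not state.

```typescript
// v2 marker from the change list: "(confirmed in the brain via self-research)".
const SELF_RESEARCH_PREFIX = '(자가 조사로 두뇌에서 확인) ';

function pickSelfResearchedForCard(
    answeredQuestions: Array<{ q: string; a: string }>,
    maxItems = 3,
    maxAnswerChars = 150,
): Array<{ q: string; a: string }> {
    return answeredQuestions
        // Keep only entries produced by the self-research pass.
        .filter((x) => x.a.startsWith(SELF_RESEARCH_PREFIX))
        .slice(0, maxItems)
        .map((x) => ({
            q: x.q,
            // Assumption: the marker is stripped for display, then the answer is capped.
            a: x.a.slice(SELF_RESEARCH_PREFIX.length).slice(0, maxAnswerChars),
        }));
}
```

When the result is empty, the card would simply omit the subsection.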

### 2-D. config
- `src/config.ts`: `companyAlignmentSelfResearch: boolean` (default true), following the existing company* key pattern
- `package.json`: `g1nation.company.alignmentSelfResearch`, boolean, default true, plus a description

---

## Phase 3: Saving Answers to Knowledge Requests in the Brain (closing the learning loop)

### 3-A. Additions to `alignmentResearch.ts`
```ts
// Only Q/A the user answered directly (SELF_RESEARCH_PREFIX entries excluded, a.length >= 20)
// Saved to: <brain>/Alignment Knowledge/YYYY-MM-DD <slug30>.md (plain note, no frontmatter)
// Skip if the same path already exists (dedup). Returns: saved path | null. Must not throw.
export function saveAlignmentKnowledge(
    brainPath: string,
    input: { userPrompt: string; qaList: Array<{ q: string; a: string }> },
): string | null
```
Body format:
```md
# {first 50 characters of userPrompt}

> Information the user provided directly during Intent Alignment in one-person company mode. ({date})

## Original request
{userPrompt}

## Confirmed information
### Q. {q}
{a}
```
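
The pure parts of `saveAlignmentKnowledge` (filtering, file naming, body rendering) could be sketched as below, leaving the actual file write and the existence check to the caller. The plan only says `<slug30>`, so the slug rules here (strip characters that are illegal in file names, collapse whitespace to hyphens, cut to 30 characters) are assumptions, and all helper names are hypothetical.

```typescript
type QA = { q: string; a: string };

// v2 marker from the change list: "(confirmed in the brain via self-research)".
const SELF_RESEARCH_PREFIX = '(자가 조사로 두뇌에서 확인) ';

// Keep only answers typed by the user that are long enough to be worth saving.
function pickDirectAnswers(qaList: QA[]): QA[] {
    return qaList.filter((x) => !x.a.startsWith(SELF_RESEARCH_PREFIX) && x.a.length >= 20);
}

// Assumed slug rules; the plan does not define them.
function slug30(text: string): string {
    return text
        .replace(/[\\\/:*?"<>|]/g, ' ') // characters that are unsafe in file names
        .trim()
        .replace(/\s+/g, '-')
        .slice(0, 30);
}

// File name inside "<brain>/Alignment Knowledge/"; date is expected as YYYY-MM-DD.
function noteFileName(userPrompt: string, date: string): string {
    return `${date} ${slug30(userPrompt)}.md`;
}

// Renders the note body in the format described above (no frontmatter).
function buildNoteBody(userPrompt: string, qaList: QA[], date: string): string {
    const title = userPrompt.replace(/\s+/g, ' ').slice(0, 50);
    const lines: string[] = [
        `# ${title}`,
        '',
        `> Information the user provided directly during Intent Alignment in one-person company mode. (${date})`,
        '',
        '## Original request',
        userPrompt,
        '',
        '## Confirmed information',
    ];
    for (const { q, a } of qaList) {
        lines.push(`### Q. ${q}`, a, '');
    }
    return lines.join('\n');
}
```

Keeping these helpers free of file-system calls means the slug and marker-filter cases listed for `tests/alignmentResearch.test.ts` can run without a temporary brain folder.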

### 3-B. Two trigger points (fire-and-forget)
- Auto-proceed branch of `_runIntentAlignment`: right before `_runCompanyTurn`
- `_proceedWithCurrentAlignment`: right before `_runCompanyTurn`
- Condition: `cfg.companyAlignmentKnowledgeSave !== false` && at least one direct user answer exists
- try/catch + void (must never block the alignment flow); pixelOffice log "💾 Saved confirmed information to the brain"
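
The trigger conditions above can be sketched as one guarded helper shared by both call sites. `triggerKnowledgeSaveSketch` and `SaveDeps` are hypothetical names; the save function and the log call are injected so the sketch does not assume how `saveAlignmentKnowledge` or pixelOffice are imported. Because 3-A declares the save function as synchronous, the sketch simply wraps the call in try/catch rather than deferring it.

```typescript
type QA = { q: string; a: string };

// v2 marker from the change list: "(confirmed in the brain via self-research)".
const SELF_RESEARCH_PREFIX = '(자가 조사로 두뇌에서 확인) ';

interface SaveDeps {
    // Stand-in for saveAlignmentKnowledge from 3-A.
    save: (brainPath: string, input: { userPrompt: string; qaList: QA[] }) => string | null;
    // Stand-in for the pixelOffice log call.
    onSaved: (savedPath: string) => void;
}

function triggerKnowledgeSaveSketch(
    cfg: { companyAlignmentKnowledgeSave?: boolean },
    brainPath: string,
    input: { userPrompt: string; qaList: QA[] },
    deps: SaveDeps,
): void {
    // Config switch: only an explicit false disables saving.
    if (cfg.companyAlignmentKnowledgeSave === false) return;
    // Self-researched entries came from the brain already; do not write them back.
    const direct = input.qaList.filter((x) => !x.a.startsWith(SELF_RESEARCH_PREFIX));
    if (direct.length === 0) return;
    try {
        const savedPath = deps.save(brainPath, { userPrompt: input.userPrompt, qaList: direct });
        if (savedPath) deps.onSaved(savedPath);
    } catch {
        // Swallow everything: a failed save must never block the alignment flow.
    }
}
```

Both trigger points would call this right before `_runCompanyTurn` and continue regardless of the outcome.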

### 3-C. config
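- `companyAlignmentKnowledgeSave: boolean`, default true, exposed in package.json

### Effect (loop closed)
Saved notes are ordinary brain notes → on the next turn, Phase 2 self-research finds them via TF-IDF → when the same question comes up again it is resolved without user input → the user is not asked twice.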

---

## Implementation Order
1. intentAlignment.ts (Phase 1-A): purely additive
2. New alignmentResearch.ts (2-A + 3-A): standalone module
3. config.ts + package.json (2-D + 3-C)
4. sidebarProvider.ts integration (1-B + 2-B + 3-B)
5. media/sidebar.js (2-C)
6. New tests/alignmentResearch.test.ts: parser tolerance, marker filter, slug generation, save skip, empty-brain safety
7. `npx tsc --noEmit` + `npm test`

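## Risks and Mitigations
| Risk | Mitigation |
|---|---|
| Token blow-up on the small model (arch 16K) | Re-truncate to 3,000 characters |
| Self-research LLM marks a wrong answer as "answered" | Prompt states "only what the evidence explicitly says; if unsure, unanswered" + cite the evidence source |
| Brain pollution (automatic saving) | Direct user answers only, answers under 20 characters excluded, dedicated folder, can be turned off via config |
| Increased alignment latency | Runs only when questions exist; search uses the cached index |
| Breaking existing tests | SYSTEM_PROMPT change is purely additive; existing input field signatures are preserved |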