# Experience Memory (Mistake / Lesson Loop) — Implementation Plan
> Goal: **Astra extracts reusable lessons from work results and QA feedback, and automatically
> turns them into a pre-task checklist so it stops falling into the same hole.**
>
> Not "make the model perfect" — "put a railing where the model keeps falling."
## Design principles (why this isn't just "more logging")
1. **Closed loop**: record → extract lesson → inject before next similar task → preflight checklist → (light) post-check.
Plain recording does not change future behavior; an *injected* lesson does.
2. **Reuse existing infrastructure; do NOT build a parallel system.** Astra already has:
   - 5-layer memory (`src/memory/`): ProceduralMemory ("how to do X", i.e. recipes), EpisodicMemory ("conversation flow"),
     LongTermMemory ("user rules/decisions"), ProjectMemory, ShortTermMemory; `MemoryExtractor.onSessionEnd`.
   - ProjectChronicle (`src/features/projectChronicle`): planning/development/bugs/retrospectives auto-records.
   - `RetrievalOrchestrator` + `selectWithinBudget` (RAG), SecondBrainTrace.
   - `src/retrieval/brainIndex.ts`: mtime-keyed token cache of **every `.md` in the active brain**.

   ⇒ Lessons are just markdown files **inside the active brain**, identified by `lessons/`-style path or
   frontmatter `type: lesson|playbook|qa-finding`. They become retrievable *for free* via `searchBrainFiles`.
   Distinction to keep clear: **ProceduralMemory = recipe ("how"), Lesson = guardrail ("what went wrong & how not to repeat")**.
3. **The risky half is *production* of lessons, not consumption.** Low-signal auto-records pollute retrieval.
So lesson *generation* is heavily gated; lesson *consumption* (retrieval + injection) is the cheap, safe half — build it first.
4. **Everything goes through the token budget.** Lessons compete with brain knowledge inside the same
~3200-token RAG allocation (and the small-model context cap). They get a modest score boost + a small
reserved sub-slot, **frequency-weighted, not recency-weighted** (a mistake violated 5× is louder than a one-off).
5. **Inspectable & correctable.** A bad auto-lesson that keeps getting injected = a poison source. The user must
see which lessons fed an answer (the per-answer scope footer) and be able to edit/delete/ignore one trivially.
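
Principle 4's "modest score boost" with frequency weighting could look roughly like the sketch below. The `ScoredLesson` shape, both constants, and the function name are assumptions made for illustration, not part of the existing retrieval code. The only inputs taken from the plan are a base relevance score and the card's `occurrences` field; `last-seen` is deliberately unused so that recency does not influence ranking.

```typescript
// Hypothetical shape; the real retrieval chunk type may differ.
interface ScoredLesson {
  file: string;
  baseScore: number;   // relevance score from the normal RAG ranking
  occurrences: number; // taken from the card's frontmatter
}

// Assumed tuning constants, not values taken from the plan.
const LESSON_BOOST = 0.15;
const MAX_FREQUENCY_BONUS = 0.3;

/** Flat lesson boost plus a log-damped, capped bonus for repeated mistakes. */
function boostedLessonScore(lesson: ScoredLesson): number {
  const occurrences = Math.max(1, Math.floor(lesson.occurrences));
  const frequencyBonus = Math.min(MAX_FREQUENCY_BONUS, 0.1 * Math.log2(occurrences));
  return lesson.baseScore + LESSON_BOOST + frequencyBonus;
}
```

The logarithm and the cap keep a lesson that has been violated many times from drowning out ordinary brain knowledge, while still ranking it above a one-off with the same relevance.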
## Data model
Lesson card = a markdown file in the active brain (convention: under `lessons/`, `playbooks/`, or `qa-findings/`,
or any file with the frontmatter below). The brain index (`brainIndex.ts`, version ≥ 3) stores a `kind` per file
parsed from path + frontmatter, so retrieval can tell lessons apart without re-reading content.
```md
---
type: lesson # lesson | playbook | qa-finding
title: Telegram remote execution must require allowlist
applies-to: [telegram, remote-execution, security, approval-flow]
project: ConnectAI # optional — scopes the lesson to one project
severity: high # low | medium | high
source: curated # curated | auto (curated weighted higher in retrieval)
occurrences: 1 # bumped on dedup-merge instead of creating a duplicate
last-seen: 2026-05-12
---
## Situation
## Mistake / Risk
## Root Cause
## Fix
## Prevention Checklist
-
-
## Applies To
- telegram
- remote-execution
-
```
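
A minimal sketch of how `kind` might be derived from path plus frontmatter. The regular expressions, the rule that frontmatter wins over path, and the `undefined` return for ordinary notes are assumptions; only the function name, its two parameters, and the three kinds come from this plan.

```typescript
type LessonKind = "lesson" | "playbook" | "qa-finding";

// Assumed pattern: a lessons/, playbooks/ or qa-findings/ segment anywhere in the path.
const LESSON_DIR_RE = /(^|\/)(lessons|playbooks|qa-findings)\//i;

const DIR_TO_KIND: Record<string, LessonKind> = {
  lessons: "lesson",
  playbooks: "playbook",
  "qa-findings": "qa-finding",
};

function isLessonKind(value: string): value is LessonKind {
  return value === "lesson" || value === "playbook" || value === "qa-finding";
}

/** Returns the lesson kind for a brain file, or undefined for an ordinary note. */
function detectLessonKind(relativePath: string, content: string): LessonKind | undefined {
  // Frontmatter is checked first, so a card can live anywhere in the brain.
  const frontmatter = /^---\r?\n([\s\S]*?)\r?\n---/.exec(content);
  const typeMatch = /^type:\s*([A-Za-z-]+)/m.exec(frontmatter?.[1] ?? "");
  const declared = (typeMatch?.[1] ?? "").toLowerCase();
  if (isLessonKind(declared)) return declared;

  // Fall back to the directory convention; normalize Windows separators first.
  const dirMatch = LESSON_DIR_RE.exec(relativePath.replace(/\\/g, "/"));
  const dir = (dirMatch?.[2] ?? "").toLowerCase();
  return DIR_TO_KIND[dir];
}
```

Because the `type:` pattern stops at the first character that is not a letter or hyphen, a trailing comment such as `# lesson | playbook | qa-finding` on the same line is ignored.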
## Pipeline (target end state)
| Stage | What | Trigger | Risk control |
|---|---|---|---|
| **Retrieve** | On a new request, retrieve relevant lessons (same RAG pipeline; `applies-to` tags + content matched, modest boost) | every turn | budget-bounded; capped to top-K |
| **Preflight** | Inject `[ACTIVE LESSONS — verify before finalizing]` block (Prevention Checklists) into the *protected* part of the system prompt | every turn with ≥1 lesson | placed before `[CONTEXT]` so it survives truncation |
| **Collect** | Capture changed files, run commands, failed logs, test red→green, rollbacks, approval rejections, user QA feedback as events | during the turn | already mostly in TransactionManager / approval queue |
| **Generate** | Build a *draft* lesson card from a strong-trigger turn | ① explicit QA feedback ② test red→green ③ `transactionManager.rollback()` ④ approval rejection ⑤ user says "기록해" ("record this"); **never on a plain success** | quality bar (must have concrete Root Cause + checklist) or discard; **dedup → merge & bump `occurrences`**; **human confirms before persist (MVP)**; `source: auto`, lower weight |
| **Post-check (non-blocking)** | After the answer, flag in the footer any Prevention-Checklist item not visibly addressed | every turn with lessons | **never re-prompt / never block** — just a footer warning. (Blocking gate only for "risky ops" via the existing dryRun approval flow, v2+) |

Hard caps: ≤ N lessons injected per turn; ≤ M total `source: auto` lessons before cleanup is required
(mirrors the `brainIndex` 12k cap). Cross-project bleed is prevented by the `project:` tag plus ProjectMemory-style scoping.
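
The Generate row's gating can be sketched as a pure function. All type and function names here are hypothetical; the rules themselves (no draft without a strong trigger, discard unless Root Cause and checklist are concrete, hand off to the user rather than persisting) follow the table above. Dedup-merge is left out of this sketch.

```typescript
type LessonTrigger =
  | "qa-feedback"
  | "test-red-to-green"
  | "rollback"
  | "approval-rejection"
  | "user-request";

interface LessonDraft {
  title: string;
  rootCause: string;
  checklist: string[];
}

interface TurnSignals {
  triggers: LessonTrigger[]; // empty on a plain success
}

type GateResult =
  | { action: "discard"; reason: string }
  | { action: "ask-user"; draft: LessonDraft };

/** Decides whether a draft is worth showing to the user; it never persists anything itself. */
function gateLessonDraft(signals: TurnSignals, draft: LessonDraft): GateResult {
  if (signals.triggers.length === 0) {
    return { action: "discard", reason: "no strong trigger (plain success)" };
  }
  // Quality bar: a concrete root cause and at least one non-empty checklist item.
  const items = draft.checklist.map((item) => item.trim()).filter((item) => item.length > 0);
  if (draft.rootCause.trim().length === 0 || items.length === 0) {
    return { action: "discard", reason: "quality bar not met" };
  }
  return { action: "ask-user", draft: { ...draft, checklist: items } };
}
```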
## MVP — build the *consumption* side first
1. **Lesson detection in the brain index**: `brainIndex.ts` parses `kind` from path and frontmatter and stores it per file; the index version is bumped.
2. **Lesson-aware retrieval**: `searchBrainFiles` marks lesson chunks (`metadata.isLesson`), uses a larger excerpt
   (whole card if short), and gives a modest score boost. `RetrievalOrchestrator.retrieve()` splits lesson chunks out
   into `result.lessonChunks`.
3. **Preflight injection**: `agent.buildMemoryContext` prepends an `[ACTIVE LESSONS — verify before finalizing]`
   block (built from lesson cards / their Prevention-Checklist sections) ahead of the normal RAG context, inside
   the truncation-protected zone. Non-lesson RAG context is unchanged.
4. **Visibility**: the per-answer "참조 범위" ("reference scope") footer (already shipped) shows `· 교훈 N개` ("N lessons") and hover lists the files;
   clicking opens the agent↔knowledge mapping editor (or, later, the lesson files).
5. **Manual seeding**: the `g1nation.lesson.create` command asks for a title, writes `<brain>/lessons/<slug>.md`
   from the card template, and opens it. This is also the seed for the future auto-generator.
> With (1)-(5) you can hand-write 3-5 high-value lessons today and immediately verify "the next similar task behaves
> differently", before any auto-generation risk exists.
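
The preflight block from step 3 could be assembled along these lines. This is a sketch under assumptions: the `LessonChunk` shape, the checkbox formatting, and the helper `extractChecklist` are hypothetical, and the real `buildLessonChecklistBlock` may differ. The block header and the `## Prevention Checklist` heading come from this plan.

```typescript
// Hypothetical input shape: one entry per retrieved lesson card.
interface LessonChunk {
  file: string;
  content: string; // full markdown of the card, frontmatter included
}

const LESSONS_HEADER = "[ACTIVE LESSONS — verify before finalizing]";

/** Collects the non-empty bullets under "## Prevention Checklist". */
function extractChecklist(content: string): string[] {
  const items: string[] = [];
  let inSection = false;
  for (const line of content.split(/\r?\n/)) {
    if (/^##\s+/.test(line)) {
      // Any level-2 heading either opens or closes the checklist section.
      inSection = /^##\s+Prevention Checklist\s*$/i.test(line);
      continue;
    }
    if (!inSection) continue;
    const bullet = /^\s*-\s+(.*\S)\s*$/.exec(line);
    if (bullet && bullet[1]) items.push(bullet[1]);
  }
  return items;
}

/** Builds the block to prepend to the RAG context; returns "" when nothing applies. */
function buildLessonChecklistBlock(chunks: LessonChunk[]): string {
  const sections: string[] = [];
  for (const chunk of chunks) {
    const items = extractChecklist(chunk.content);
    if (items.length === 0) continue; // cards without a checklist add no guardrail
    const title = /^title:\s*(.+?)\s*$/m.exec(chunk.content)?.[1] ?? chunk.file;
    sections.push([`- ${title}`, ...items.map((item) => `  - [ ] ${item}`)].join("\n"));
  }
  return sections.length === 0 ? "" : [LESSONS_HEADER, ...sections].join("\n");
}
```

Returning an empty string when no card contributes items lets the caller skip the block entirely on turns without lessons, so the normal RAG context is left untouched.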
## v2 — status
Done:
- **Triggers → "기록할까요?" ("Record this?") prompt** (human-confirm UI, never auto-saves): `transactionManager.rollback()` (action failure), `rejectTransaction()` (user rejected a dry-run change), and QA feedback (user message matches `isQaRegressionFeedback`: "또 안 돼" ("broken again"), "비슷한 실수" ("similar mistake"), "왜 반복돼" ("why does this keep happening"), "고쳤는데 깨졌" ("fixed it but it broke"), "regression", "why … keep failing", …). All three post a `lessonCandidate` webview message → the sidebar shows a dismissible box → "📝 교훈 기록" ("Record lesson") runs `g1nation.lesson.fromConversation` (pre-filled Situation).
- **`occurrences` dedup-merge**: `createLessonCard` checks existing lessons by normalized title; on a match it offers "갱신 (occurrences +1)" ("Update"), which runs `bumpLessonOccurrences` (increments `occurrences:`, sets `last-seen:`) instead of spawning a duplicate. A recurring mistake gets louder, not more numerous.
- **Non-blocking post-QA flag**: `findUnaddressedChecklistItems(answer, lessonCards)` finds lesson Prevention-Checklist items whose significant terms don't appear in the answer and lists them in the per-answer footer (`⚠ 답변에서 안 보이는 교훈 체크리스트 항목: …`, "lesson checklist items not visible in the answer"). No re-prompt, no block.
- **Manage / delete / ignore UI**: `g1nation.lesson.manage` QuickPick (lists all lesson cards from the brain via the index; open on select; trash button → confirm + delete = no longer injected). The footer's `⚠ 교훈 N` ("N lessons") is clickable → opens this picker.
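
The dedup-merge step can be illustrated with two pure string helpers. These are sketches, not the shipped `createLessonCard` / `bumpLessonOccurrences`: the signatures, the normalization rule in `normalizeTitle` (a hypothetical helper), and the choice to treat a missing `occurrences:` line as 1 are all assumptions. The date is passed in so the function stays deterministic.

```typescript
/** Assumed normalization: lowercase, with whitespace and common punctuation collapsed. */
function normalizeTitle(title: string): string {
  return title.toLowerCase().replace(/[\s\-_.,:;!?'"()]+/g, " ").trim();
}

/** Returns the card text with `occurrences` incremented and `last-seen` set to `today`. */
function bumpLessonOccurrences(card: string, today: string): string {
  const match = /^(---\r?\n)([\s\S]*?)(\r?\n---)/.exec(card);
  if (!match) return card; // no frontmatter: leave the file untouched
  const open = match[1] ?? "";
  const close = match[3] ?? "";
  let frontmatter = match[2] ?? "";

  // Increment the counter in place so any trailing comment on the line survives.
  const bumped = frontmatter.replace(
    /^occurrences:\s*(\d+)/m,
    (_whole: string, n: string) => `occurrences: ${Number(n) + 1}`,
  );
  // Assumption: a card without the field has been seen once, so it becomes 2.
  frontmatter = bumped !== frontmatter ? bumped : `${frontmatter}\noccurrences: 2`;

  frontmatter = /^last-seen:/m.test(frontmatter)
    ? frontmatter.replace(/^last-seen:.*$/m, () => `last-seen: ${today}`)
    : `${frontmatter}\nlast-seen: ${today}`;

  return open + frontmatter + close + card.slice(match[0].length);
}
```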
Not done (next):
- Auto-retrospective draft on a *successful* turn (heavily gated, `source: auto`, lower retrieval weight). — deliberately last; high noise risk.
- test red→green trigger — **blocked**: `<run_command>` actions run in a VS Code terminal (`terminal.sendText`) with no output capture, so Astra can't observe test results. Needs a captured-output execution path first.
- `source: auto` vs `curated` retrieval-weight distinction; archive one-off lessons unused > 6 months.
- Reserved lesson sub-budget within the RAG allocation; frequency-weighted (`occurrences`) ordering of injected lessons.
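
For the last item, one possible shape of a reserved sub-budget with `occurrences`-first ordering is sketched below. The function name, the `LessonCandidate` fields, and the greedy skip-if-too-large policy are assumptions; this is not the existing `selectWithinBudget`, and token counts are taken as given.

```typescript
// Hypothetical candidate shape; token counts would come from the brain index.
interface LessonCandidate {
  file: string;
  tokens: number;
  occurrences: number;
  score: number; // relevance to the current request
}

/**
 * Greedily fills a reserved token slot with lessons, most frequently violated first.
 * Relevance score only breaks ties between equally frequent lessons.
 */
function selectLessonsWithinBudget(
  candidates: LessonCandidate[],
  reservedTokens: number,
  maxLessons: number,
): LessonCandidate[] {
  const ordered = [...candidates].sort(
    (a, b) => b.occurrences - a.occurrences || b.score - a.score,
  );
  const picked: LessonCandidate[] = [];
  let used = 0;
  for (const candidate of ordered) {
    if (picked.length >= maxLessons) break;
    // Skip a card that does not fit, so a smaller one later can still use the slot.
    if (used + candidate.tokens > reservedTokens) continue;
    picked.push(candidate);
    used += candidate.tokens;
  }
  return picked;
}
```

Keeping the lesson slot separate from the main RAG allocation means a long brain document cannot push every lesson out, and the per-turn cap still applies on top of the token limit.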
## Integration points (files)
- `src/retrieval/lessonHelpers.ts` *(new, pure)*: `LESSON_DIR_RE`, `detectLessonKind(relativePath, content)`, `buildLessonChecklistBlock(chunks)`, `lessonTemplate(title)`.
- `src/retrieval/brainIndex.ts`: `IndexEntry.kind`, version bump, populate via `detectLessonKind`; expose `kind` on `IndexedBrainDoc`.
- `src/retrieval/types.ts`: `RetrievalChunk.metadata.isLesson?`, `RetrievalResult.lessonChunks?`.
- `src/retrieval/index.ts`: `searchBrainFiles` lesson handling; `retrieve()` splits lesson chunks.
- `src/agent.ts`: `buildMemoryContext` prepends the lessons block; `_lastRetrievalInfo.lessonFiles`; `usedScope` message carries it.
- `media/sidebar.js`: scope footer shows `· 교훈 N개`.
- `package.json` + `src/extension.ts`: `g1nation.lesson.create` command.
- *(v2)* `src/features/projectChronicle`, `TransactionManager`, approval queue: trigger hooks; `MemoryManager`/`ProceduralMemory`: optional structured layer.