Experience Memory (Mistake / Lesson Loop) — Implementation Plan
Goal: Astra extracts reusable lessons from work results and QA feedback, and automatically turns them into a pre-task checklist so it stops falling into the same hole.
Not "make the model perfect" — "put a railing where the model keeps falling."
Design principles (why this isn't just "more logging")
- Closed loop: record → extract lesson → inject before next similar task → preflight checklist → (light) post-check. Plain recording does not change future behavior; an injected lesson does.
- Reuse existing infrastructure — do NOT build a parallel system. Astra already has:
  - 5-layer memory (src/memory/): ProceduralMemory ("how to do X": recipes), EpisodicMemory ("conversation flow"), LongTermMemory ("user rules/decisions"), ProjectMemory, ShortTermMemory; MemoryExtractor.onSessionEnd.
  - ProjectChronicle (src/features/projectChronicle): planning/development/bugs/retrospectives auto-records.
  - RetrievalOrchestrator + selectWithinBudget (RAG), SecondBrainTrace.
  - src/retrieval/brainIndex.ts: mtime-keyed token cache of every .md in the active brain.
  ⇒ Lessons are just markdown files inside the active brain, identified by a lessons/-style path or frontmatter type: lesson|playbook|qa-finding. They become retrievable for free via searchBrainFiles. Distinction to keep clear: ProceduralMemory = recipe ("how"), Lesson = guardrail ("what went wrong & how not to repeat").
- The risky half is production of lessons, not consumption. Low-signal auto-records pollute retrieval. So lesson generation is heavily gated; lesson consumption (retrieval + injection) is the cheap, safe half — build it first.
- Everything goes through the token budget. Lessons compete with brain knowledge inside the same ~3200-token RAG allocation (and the small-model context cap). They get a modest score boost + a small reserved sub-slot, and are ordered by frequency, not recency (a mistake repeated 5× is louder than a one-off).
- Inspectable & correctable. A bad auto-lesson that keeps getting injected = a poison source. The user must see which lessons fed an answer (the per-answer scope footer) and be able to edit/delete/ignore one trivially.
Data model
Lesson card = a markdown file in the active brain (convention: under lessons/, playbooks/, or qa-findings/,
or any file with the frontmatter below). The brain index (brainIndex.ts, version ≥ 3) stores a kind per file
parsed from path + frontmatter, so retrieval can tell lessons apart without re-reading content.
---
type: lesson # lesson | playbook | qa-finding
title: Telegram remote execution must require allowlist
applies-to: [telegram, remote-execution, security, approval-flow]
project: ConnectAI # optional — scopes the lesson to one project
severity: high # low | medium | high
source: curated # curated | auto (curated weighted higher in retrieval)
occurrences: 1 # bumped on dedup-merge instead of creating a duplicate
last-seen: 2026-05-12
---
## Situation
…
## Mistake / Risk
…
## Root Cause
…
## Fix
…
## Prevention Checklist
- …
- …
## Applies To
- telegram
- remote-execution
- …
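The kind detection described above (path convention plus frontmatter) could look roughly like the sketch below. LESSON_DIR_RE and detectLessonKind are names from this plan; the regex itself, the LessonKind type, the frontmatterType helper, and the rule that frontmatter wins over the directory are assumptions made for illustration.

```typescript
// Sketch only: LESSON_DIR_RE and detectLessonKind are named in the plan;
// the regex, the LessonKind type, and the precedence rule are assumptions.
type LessonKind = "lesson" | "playbook" | "qa-finding";

// Matches a lessons/, playbooks/, or qa-findings/ path segment.
const LESSON_DIR_RE = /(?:^|[\\/])(lessons|playbooks|qa-findings)[\\/]/i;

const DIR_TO_KIND: Record<string, LessonKind> = {
  lessons: "lesson",
  playbooks: "playbook",
  "qa-findings": "qa-finding",
};

// Reads the `type:` key from a leading `---` frontmatter block, if any.
// A trailing `# comment` on the line is ignored.
function frontmatterType(content: string): string | undefined {
  const fm = /^---\r?\n([\s\S]*?)\r?\n---/.exec(content);
  if (!fm) return undefined;
  const line = /^type:\s*([^#\r\n]+)/m.exec(fm[1] ?? "");
  return line ? (line[1] ?? "").trim().toLowerCase() : undefined;
}

// Assumed precedence: frontmatter first, then the directory convention.
// Returns undefined for files that are not lesson cards.
function detectLessonKind(relativePath: string, content: string): LessonKind | undefined {
  const declared = frontmatterType(content);
  if (declared === "lesson" || declared === "playbook" || declared === "qa-finding") {
    return declared;
  }
  const dir = LESSON_DIR_RE.exec(relativePath);
  return dir ? DIR_TO_KIND[(dir[1] ?? "").toLowerCase()] : undefined;
}
```

Because the result is stored per file in the index, retrieval only needs to compare the stored kind and never re-parses the card.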
Pipeline (target end state)
| Stage | What | Trigger | Risk control |
|---|---|---|---|
| Retrieve | On a new request, retrieve relevant lessons (same RAG pipeline; applies-to tags + content matched, modest boost) | every turn | budget-bounded; capped to top-K |
| Preflight | Inject [ACTIVE LESSONS — verify before finalizing] block (Prevention Checklists) into the protected part of the system prompt | every turn with ≥1 lesson | placed before [CONTEXT] so it survives truncation |
| Collect | Capture changed files, run commands, failed logs, test red→green, rollbacks, approval rejections, user QA feedback as events | during the turn | already mostly in TransactionManager / approval queue |
| Generate | Build a draft lesson card from a strong-trigger turn | ① explicit QA feedback ② test red→green ③ transactionManager.rollback() ④ approval rejection ⑤ user says "기록해" ("record this"); never on a plain success | quality bar (must have concrete Root Cause + checklist) or discard; dedup → merge & bump occurrences; human confirms before persist (MVP); source: auto, lower weight |
| Post-check (non-blocking) | After the answer, flag in the footer any Prevention-Checklist item not visibly addressed | every turn with lessons | never re-prompt / never block — just a footer warning. (Blocking gate only for "risky ops" via the existing dryRun approval flow, v2+) |
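The non-blocking post-check in the last row can be sketched as a pure function. findUnaddressedChecklistItems(answer, lessonCards) is the signature used later in this plan; the LessonCard shape, the stop-word list, and the reading that an item is flagged only when none of its significant terms occurs in the answer are assumptions for illustration.

```typescript
// Assumed in-memory shape: the plan does not define how a lesson card is represented.
interface LessonCard {
  title: string;
  checklist: string[]; // items from the card's "## Prevention Checklist" section
}

// Hypothetical stop-word list; tokens shorter than 3 characters are dropped as well.
const STOP_WORDS = new Set([
  "the", "and", "for", "with", "must", "should", "before",
  "after", "that", "this", "from", "into", "not", "any", "all",
]);

// "Significant terms" is interpreted here as lowercase tokens of 3+ characters
// that are not stop words. Splitting on whitespace and punctuation keeps
// non-English checklist text intact.
function significantTerms(text: string): string[] {
  const tokens = text.toLowerCase().split(/[\s.,;:!?()\[\]{}"'`]+/);
  return tokens.filter((t) => t.length >= 3 && !STOP_WORDS.has(t));
}

// Returns the checklist items for which no significant term occurs in the answer.
// Matching is by substring, so "allowlist" also matches "allowlisted".
function findUnaddressedChecklistItems(answer: string, lessonCards: LessonCard[]): string[] {
  const haystack = answer.toLowerCase();
  const flagged: string[] = [];
  for (const card of lessonCards) {
    for (const item of card.checklist) {
      const terms = significantTerms(item);
      if (terms.length > 0 && !terms.some((t) => haystack.includes(t))) {
        flagged.push(item);
      }
    }
  }
  return flagged;
}
```

The result only feeds a footer warning. Substring matching errs toward treating an item as addressed, which suits a check that must never block or re-prompt.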
Hard caps: ≤ N lessons injected per turn; ≤ M total source: auto lessons before requiring cleanup
(mirrors brainIndex 12k cap). Cross-project bleed prevented by project: tag + ProjectMemory-style scoping.
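A minimal sketch of how the per-turn cap, project scoping, and frequency weighting might combine. Every name here (LessonHit, LessonCaps, lessonPriority, selectLessonsForTurn) is hypothetical, and the 0.7 discount for auto cards and the log2 occurrence boost are placeholder choices: the plan fixes neither N nor the weights.

```typescript
// Assumed shape of a retrieved lesson; field names mirror the card frontmatter.
interface LessonHit {
  file: string;
  score: number;               // relevance score from retrieval
  occurrences: number;         // frontmatter `occurrences`
  source: "curated" | "auto";  // frontmatter `source`
  project?: string;            // frontmatter `project`, if set
  tokens: number;              // estimated token cost of the injected text
}

interface LessonCaps {
  maxLessons: number;          // the "N" cap; the value is not specified in the plan
  tokenBudget: number;         // size of the reserved lesson sub-slot (assumed)
  activeProject?: string;
}

// Placeholder weighting: curated cards count fully, auto cards are discounted,
// and repeated mistakes are boosted logarithmically.
function lessonPriority(hit: LessonHit): number {
  const sourceWeight = hit.source === "curated" ? 1.0 : 0.7;
  return hit.score * sourceWeight * (1 + Math.log2(Math.max(1, hit.occurrences)));
}

function selectLessonsForTurn(hits: LessonHit[], caps: LessonCaps): LessonHit[] {
  // Cards without a project tag are global; tagged cards must match the active project.
  const inScope = hits.filter((h) => h.project === undefined || h.project === caps.activeProject);
  const ranked = [...inScope].sort((a, b) => lessonPriority(b) - lessonPriority(a));
  const picked: LessonHit[] = [];
  let used = 0;
  for (const hit of ranked) {
    if (picked.length >= caps.maxLessons) break;
    if (used + hit.tokens > caps.tokenBudget) continue; // skip cards that do not fit
    picked.push(hit);
    used += hit.tokens;
  }
  return picked;
}
```

With these placeholder weights, a curated card seen five times at relevance 0.5 outranks a one-off card at relevance 0.6, which is the intended "louder, not more numerous" behavior.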
MVP — build the consumption side first
1. Lesson detection in the brain index: brainIndex.ts parses kind from path/frontmatter; stored, version-bumped.
2. Lesson-aware retrieval: searchBrainFiles marks lesson chunks (metadata.isLesson), uses a larger excerpt (whole card if short), gives a modest score boost. RetrievalOrchestrator.retrieve() splits lesson chunks out into result.lessonChunks.
3. Preflight injection: agent.buildMemoryContext prepends an [ACTIVE LESSONS — verify before finalizing] block (built from lesson cards / their Prevention-Checklist sections) ahead of the normal RAG context, inside the truncation-protected zone. Non-lesson RAG context unchanged.
4. Visibility: the per-answer "참조 범위" ("reference scope") footer (already shipped) shows · 교훈 N개 ("N lessons") and hover lists the files; clicking opens the agent↔knowledge mapping editor (or, later, the lesson files).
5. Manual seeding: g1nation.lesson.create command asks for a title, writes <brain>/lessons/<slug>.md from the card template, and opens it. This is also the seed for the future auto-generator.
With (1)–(5) you can hand-write 3–5 high-value lessons today and immediately verify "the next similar task behaves differently", before any auto-generation risk exists.
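The preflight block from step 3 could be assembled along these lines. buildLessonChecklistBlock(chunks) is named in this plan; the LessonChunk shape, the extractPreventionChecklist helper, and the checkbox layout of the output are assumptions for illustration.

```typescript
// Assumed chunk shape; the plan only names `metadata.isLesson` on retrieval chunks.
interface LessonChunk {
  title: string;
  content: string; // full card markdown, or an excerpt that contains the checklist section
}

// Header text taken verbatim from the plan.
const LESSONS_HEADER = "[ACTIVE LESSONS — verify before finalizing]";

// Collects the bullet lines that sit under the "## Prevention Checklist" heading.
function extractPreventionChecklist(markdown: string): string[] {
  const items: string[] = [];
  let inSection = false;
  for (const line of markdown.split(/\r?\n/)) {
    if (/^##\s+/.test(line)) {
      // Any level-2 heading starts a new section; only one of them is the checklist.
      inSection = /^##\s+Prevention Checklist\s*$/i.test(line);
      continue;
    }
    const bullet = /^\s*[-*]\s+(.+)$/.exec(line);
    if (inSection && bullet) items.push((bullet[1] ?? "").trim());
  }
  return items;
}

// Builds the text to prepend to the memory context.
// Returns "" when no card has checklist items, so nothing is injected.
function buildLessonChecklistBlock(chunks: LessonChunk[]): string {
  const sections: string[] = [];
  for (const chunk of chunks) {
    const items = extractPreventionChecklist(chunk.content);
    if (items.length === 0) continue;
    sections.push([`${chunk.title}:`, ...items.map((i) => `- [ ] ${i}`)].join("\n"));
  }
  return sections.length === 0 ? "" : [LESSONS_HEADER, ...sections].join("\n\n");
}
```

Injecting only the checklist section, not the whole card, keeps the protected part of the prompt small; the full card remains available through normal retrieval.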
v2 — status
Done:
- Triggers → "기록할까요?" ("Record this?") prompt (human-confirm UI, never auto-saves): transactionManager.rollback() (action failure), rejectTransaction() (user rejected a dry-run change), and QA feedback (user message matches isQaRegressionFeedback: "또 안 돼" ("it's broken again"), "비슷한 실수" ("similar mistake"), "왜 반복돼" ("why does this keep happening"), "고쳤는데 깨졌" ("fixed it but it broke"), "regression", "why … keep failing", …). All post a lessonCandidate webview message → sidebar shows a dismissible box → "📝 교훈 기록" ("Record lesson") runs g1nation.lesson.fromConversation (pre-filled Situation).
- occurrences dedup-merge: createLessonCard checks existing lessons by normalized title; on match it offers "갱신 (occurrences +1)" ("Update (occurrences +1)"), which runs bumpLessonOccurrences (increments occurrences:, sets last-seen:) instead of spawning a duplicate. Recurring mistake → louder, not more numerous.
- Non-blocking post-QA flag: findUnaddressedChecklistItems(answer, lessonCards). Lesson Prevention-Checklist items whose significant terms don't appear in the answer are listed in the per-answer footer (⚠ 답변에서 안 보이는 교훈 체크리스트 항목: … ("lesson checklist items not visible in the answer")). No re-prompt, no block.
- Manage / delete / ignore UI: g1nation.lesson.manage QuickPick (lists all lesson cards from the brain via the index; open on select; trash button → confirm + delete = no longer injected). The footer's ⚠ 교훈 N ("N lessons") is clickable → opens this picker.
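The dedup-merge step can be illustrated on plain card text. bumpLessonOccurrences is named in this plan, but its real signature is not shown; the version below takes the card text and a date string, and both it and the normalizeTitle helper are assumptions. Treating a card with no occurrences key as having one prior occurrence is likewise a guess.

```typescript
// Hypothetical normalization for duplicate detection:
// lowercase, then collapse punctuation, hyphens, and whitespace to single spaces.
function normalizeTitle(title: string): string {
  return title.toLowerCase().replace(/[\s\-_.,:;!?'"()]+/g, " ").trim();
}

// Returns the card text with `occurrences` incremented and `last-seen` set to `today`
// (YYYY-MM-DD). Only the frontmatter is touched; cards without frontmatter are returned as-is.
function bumpLessonOccurrences(card: string, today: string): string {
  const end = card.indexOf("\n---", 3);
  if (!card.startsWith("---") || end < 0) return card;
  let head = card.slice(0, end); // opening fence plus the key lines
  const body = card.slice(end);  // closing fence and everything after it

  const occ = /^occurrences:[ \t]*(\d+)/m.exec(head);
  head = occ
    ? head.replace(/^occurrences:[ \t]*\d+/m, `occurrences: ${Number(occ[1]) + 1}`)
    : `${head}\noccurrences: 2`; // assumption: a missing key means one prior occurrence

  head = /^last-seen:/m.test(head)
    ? head.replace(/^last-seen:[ \t]*\S*/m, `last-seen: ${today}`)
    : `${head}\nlast-seen: ${today}`;

  return head + body;
}
```

Comparing normalizeTitle(existing) with normalizeTitle(candidate) decides whether to offer the update path; the bump itself leaves trailing comments and the card body untouched.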
Not done (next):
- Auto-retrospective draft on a successful turn (heavily gated, source: auto, lower retrieval weight). Deliberately last; high noise risk.
- test red→green trigger: blocked. <run_command> actions run in a VS Code terminal (terminal.sendText) with no output capture, so Astra can't observe test results. Needs a captured-output execution path first.
- source: auto vs curated retrieval-weight distinction; archive one-off lessons unused > 6 months.
- Reserved lesson sub-budget within the RAG allocation; frequency-weighted (occurrences) ordering of injected lessons.
Integration points (files)
- src/retrieval/lessonHelpers.ts (new, pure): LESSON_DIR_RE, detectLessonKind(relativePath, content), buildLessonChecklistBlock(chunks), lessonTemplate(title).
- src/retrieval/brainIndex.ts: IndexEntry.kind, version bump, populate via detectLessonKind; expose kind on IndexedBrainDoc.
- src/retrieval/types.ts: RetrievalChunk.metadata.isLesson?, RetrievalResult.lessonChunks?.
- src/retrieval/index.ts: searchBrainFiles lesson handling; retrieve() splits lesson chunks.
- src/agent.ts: buildMemoryContext prepends the lessons block; _lastRetrievalInfo.lessonFiles; usedScope message carries it.
- media/sidebar.js: scope footer shows · 교훈 N개 ("N lessons").
- package.json + src/extension.ts: g1nation.lesson.create command.
- (v2) src/features/projectChronicle, TransactionManager, approval queue: trigger hooks; MemoryManager / ProceduralMemory: optional structured layer.