connectai/docs/PROJECT_CHRONICLE_GUARD_ROADMAP.md
2026-05-05 11:30:30 +09:00

Project Chronicle Guard: Search Engine Roadmap

🎯 Current Status: v2.74.0

  • Phase 1: Linguistic Foundation Stabilization (Completed)
  • Phase 2: Conflict Scoring Refinement (Completed)
  • Phase 3: Performance Scaling & Caching (In Progress)
  • Phase 4: Excerpt Precision Tuning (Planned)
  • Phase 5: Downstream Integration API (Planned)

🔬 Phase Details

Phase 1: Linguistic Foundation (v2.72.0 - v2.74.0)

  • Goal: Accurate tokenization of text that mixes Korean, English, and special characters.
  • Achievements:
    • Bilingual boundary split (e.g., 'Astra의' -> 'Astra', '의').
    • Hangeul monosyllable preservation (e.g., '한', '글').
    • Zero-width character cleaning.
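
The three tokenization behaviors listed above can be illustrated with a minimal Python sketch. The function name `tokenize`, the choice of zero-width code points, and the regular expressions are assumptions made for illustration; the roadmap does not describe how the project implements them.

```python
import re

# Assumed set of zero-width characters to strip:
# ZWSP, ZWNJ, ZWJ, word joiner, and BOM / zero-width no-break space.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")

# A token is either a run of ASCII letters/digits or a run of Hangul syllables.
# Switching scripts ends the current run, which yields the bilingual split.
TOKEN = re.compile(r"[A-Za-z0-9]+|[\uac00-\ud7a3]+")


def tokenize(text: str) -> list[str]:
    """Strip zero-width characters, then split on script boundaries."""
    cleaned = ZERO_WIDTH.sub("", text)
    # No minimum-length filter is applied, so single Hangul syllables survive.
    return TOKEN.findall(cleaned)


print(tokenize("Astra의"))  # ['Astra', '의']
print(tokenize("한 글"))     # ['한', '글']
```

Because the zero-width characters are removed before matching, a word interrupted by one of them is still returned as a single token.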

Phase 2: Conflict Scoring (v2.73.0 - v2.74.0)

  • Goal: Quantitative risk assessment for information conflicts.
  • Achievements:
    • Tiered severity logic (NONE, LOW, MEDIUM, HIGH).
    • Substring-based detection, so that Korean particles attached to a term do not prevent a match.
    • Configurable thresholds via SCORING_CONFIG.
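
As a rough sketch of how tiered severity and substring detection might fit together: only the name `SCORING_CONFIG` and the four tier names come from this roadmap. The threshold keys and values, the scoring formula, and the helper names `conflict_score` and `severity` are hypothetical.

```python
# Hypothetical thresholds; the real SCORING_CONFIG keys and values are not
# given in the roadmap.
SCORING_CONFIG = {"low": 0.2, "medium": 0.5, "high": 0.8}


def conflict_score(terms: list[str], text: str) -> float:
    """Fraction of terms that occur as substrings of the text.

    Substring matching finds 'Astra' inside 'Astra의', where an exact
    comparison against whitespace-separated words would miss it.
    """
    if not terms:
        return 0.0
    hits = sum(1 for term in terms if term in text)
    return hits / len(terms)


def severity(score: float, config: dict[str, float] = SCORING_CONFIG) -> str:
    """Map a score in [0, 1] to one of the four tiers."""
    if score >= config["high"]:
        return "HIGH"
    if score >= config["medium"]:
        return "MEDIUM"
    if score >= config["low"]:
        return "LOW"
    return "NONE"


score = conflict_score(["Astra", "Vega"], "Astra의 기원")
print(score, severity(score))  # 0.5 MEDIUM
```

Passing `config` as a parameter keeps the thresholds adjustable without touching the tier logic.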

Phase 3: Performance Scaling (v2.75.0+)

  • Goal: Sub-10 ms response time on corpora of 10k+ documents.
  • Actions:
    • Global module-level caching for IDF and tokens.
    • Potential worker thread offloading for heavy scoring.

Phase 4: Excerpt Precision (Planned)

  • Goal: Maximize context signal-to-noise ratio.
  • Actions:
    • Restrict excerpt window starting points based on match density.
    • Multi-stage filtering for optimal text chunking.

Phase 5: Integration (Planned)

  • Goal: Seamless RAG pipeline integration.
  • Action:
    • Strict IO schema definition for downstream AI agents.
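
To make "strict IO schema" concrete, here is a small sketch using frozen dataclasses that validate on construction. Every field name (`query`, `top_k`, `doc_id`, `excerpt`, `score`, `severity`) is a hypothetical placeholder. Only the four severity tiers are taken from Phase 2 above.

```python
from dataclasses import asdict, dataclass

SEVERITIES = ("NONE", "LOW", "MEDIUM", "HIGH")


@dataclass(frozen=True)
class SearchRequest:
    """Hypothetical input accepted from a downstream agent."""
    query: str
    top_k: int = 5

    def __post_init__(self) -> None:
        if not isinstance(self.query, str) or not self.query.strip():
            raise ValueError("query must be a non-empty string")
        if not isinstance(self.top_k, int) or self.top_k < 1:
            raise ValueError("top_k must be a positive integer")


@dataclass(frozen=True)
class SearchHit:
    """Hypothetical output record returned to a downstream agent."""
    doc_id: str
    excerpt: str
    score: float
    severity: str

    def __post_init__(self) -> None:
        if self.severity not in SEVERITIES:
            raise ValueError(f"severity must be one of {SEVERITIES}")


hit = SearchHit(doc_id="doc-1", excerpt="Astra의 기원", score=0.5, severity="MEDIUM")
print(asdict(hit))
```

Rejecting malformed values at the boundary means a downstream agent receives either a well-formed record or an explicit error, never a partially valid payload.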