Antigravity Agent 504fd5fb42 [G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00

AI Connect LLM Tool (ConnectAI)

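📌 One-Line Insight (The Karpathy Summary)

A 100% local, offline AI coding agent for VS Code. It runs models directly on the user's own hardware through Ollama or LM Studio, with no external server. It integrates file editing, a terminal, and a Second Brain (knowledge base). It is an internal tool suited to enterprise security and privacy requirements.

📖 Structured Knowledge (Synthesized Content)

Core Values

100% local: every LLM call runs on the user's machine. No cloud API.
Privacy-first: code and prompts never leave the machine. This is the answer for enterprise, medical, and legal use cases.
Hardware-aware: picks the model that best fits each user's GPU and RAM.
VS Code native: deep integration with the extension API.
Second Brain: RAG over the codebase, wiki, and personal notes.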

Comparison (with cloud-based tools)

	ConnectAI	Cursor / Claude Code
Privacy	100% local	Cloud API
Cost	Hardware only	$20-50 / month
Latency	Depends on local GPU	Depends on network
Quality	Limited by local models (Llama 8B-70B)	Frontier (Opus, GPT-4)
Offline	Yes	No
Setup	Ollama / LM Studio + GPU	Pay + login
Updates	Manual update	Server-side (automatic)

→ If privacy, cost, or offline use is critical, choose ConnectAI. If quality or fast setup matters most, choose Cursor / Claude Code.

Architecture

VS Code Extension (TS): UI, sidebar, and commands.
Local LLM Engine: Ollama or LM Studio.
Tool Registry: file_read / file_write / shell / search.
Second Brain: a local vector DB over wiki pages and notes.
Agent Loop: ReAct-style (think → act → observe).
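
The think → act → observe loop can be sketched roughly as follows. This is a minimal illustration, not ConnectAI's actual implementation: the names Step, Model, AgentTool, runAgent, scriptedModel, and demoTools are all hypothetical, and the model is a scripted stand-in so the loop runs without a local LLM.

```typescript
// A tool takes string arguments and returns an observation string.
type AgentTool = (args: Record<string, string>) => Promise<string>;

// What the model returns at each step: either a tool call or a final answer.
type Step =
  | { kind: 'act'; thought: string; tool: string; args: Record<string, string> }
  | { kind: 'final'; thought: string; answer: string };

// Hypothetical model interface: given the transcript so far, produce the next step.
type Model = (transcript: string[]) => Promise<Step>;

async function runAgent(
  model: Model,
  tools: Record<string, AgentTool>,
  task: string,
  maxSteps = 8,
): Promise<string> {
  const transcript: string[] = [`task: ${task}`];

  for (let i = 0; i < maxSteps; i++) {
    // think: ask the model for the next step
    const step = await model(transcript);
    transcript.push(`thought: ${step.thought}`);
    if (step.kind === 'final') return step.answer;

    // act: run the requested tool, or report that it does not exist
    const tool = tools[step.tool];
    const observation = tool
      ? await tool(step.args)
      : `error: unknown tool ${step.tool}`;

    // observe: feed the result back into the next model call
    transcript.push(`observation: ${observation}`);
  }
  throw new Error(`agent did not finish within ${maxSteps} steps`);
}

// Scripted stand-in for the LLM: read one file, then answer with its content.
const scriptedModel: Model = async (transcript) => {
  const last = transcript[transcript.length - 1] ?? '';
  if (last.startsWith('observation: ')) {
    return {
      kind: 'final',
      thought: 'I have the file content.',
      answer: last.slice('observation: '.length),
    };
  }
  return {
    kind: 'act',
    thought: 'I need to read the file.',
    tool: 'read_file',
    args: { path: 'README.md' },
  };
};

// Fake tool that returns a placeholder instead of touching the file system.
const demoTools: Record<string, AgentTool> = {
  read_file: async (args) => `contents of ${args['path']}`,
};
```

The maxSteps cap is a design choice assumed here: it keeps a confused model from looping forever, which matters more with small local models.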

Local LLM Options

Ollama: small and simple. CLI-friendly. Strong on Mac M-series.
LM Studio: GUI. Quantization choice and VRAM estimation per model.
vLLM (advanced): production serving. Large models plus batching.
llama.cpp: the simplest option. Mobile / embedded.

Model Selection (by hardware)

RAM / VRAM	Recommended model
8 GB	Llama 3.2 3B (Q4)
16 GB	Llama 3.1 8B (Q4) / Mistral 7B
24 GB	Llama 3.1 8B (FP16) / Qwen 14B
32 GB	DeepSeek Coder 33B (Q4)
48 GB	Llama 3 70B (Q4)
96 GB+	Llama 3 70B (Q8; FP16 weights alone need about 140 GB) / DeepSeek V3 (671B parameters; needs several hundred GB even at Q4)

→ A Mac M3 Max with 96 GB is the sweet spot (a quantized Llama 70B fits).
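
The sizing in the table follows from simple arithmetic: weight memory is roughly the parameter count times the bits per weight, divided by 8. The sketch below shows that estimate. The function names and the 1.2 headroom factor are assumptions made for illustration, and real quantized files run somewhat larger than the raw estimate.

```typescript
// Approximate weight memory in GB (decimal): parameters (in billions) x bits per weight / 8.
function estimateWeightGB(paramsBillions: number, bitsPerWeight: number): number {
  return (paramsBillions * bitsPerWeight) / 8;
}

// Hypothetical headroom factor for KV cache, activations, and the OS; tune per setup.
const HEADROOM = 1.2;

// True when the estimated weights plus headroom fit in the given memory.
function fitsInMemory(paramsBillions: number, bitsPerWeight: number, memoryGB: number): boolean {
  return estimateWeightGB(paramsBillions, bitsPerWeight) * HEADROOM <= memoryGB;
}

console.log(estimateWeightGB(8, 4));   // 4   (8B model at 4-bit)
console.log(estimateWeightGB(8, 16));  // 16  (8B model at FP16)
console.log(estimateWeightGB(70, 4));  // 35  (70B model at 4-bit)
console.log(estimateWeightGB(70, 16)); // 140 (70B model at FP16)
console.log(fitsInMemory(70, 4, 48));  // true
console.log(fitsInMemory(70, 16, 96)); // false
```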

Sidebar Chat UI

Streaming response (token by token).
File reference (@file).
Multi-turn conversation.
Apply / insert for code blocks.
Settings (model, temperature, system prompt).

Tool List

read_file(path): returns the file content.
write_file(path, content): write / create.
edit_file(path, oldText, newText): precise diff.
run_command(cmd): runs in the terminal, after user confirmation.
search_codebase(query): ripgrep / regex.
query_brain(question): vector DB.
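
One way to route the model's tool calls to the tools listed above is a small registry that validates each call before running it. The sketch below is an assumption about how such a dispatcher could look, not the project's code: ToolSpec, ToolRegistry, the JSON call shape {"tool": ..., "args": ...}, and the confirm callback are all hypothetical, and the handlers are stubs.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

interface ToolSpec {
  required: string[];          // argument names that must be present
  needsConfirmation: boolean;  // true for tools with side effects such as run_command
  handler: ToolHandler;
}

class ToolRegistry {
  private tools = new Map<string, ToolSpec>();

  register(name: string, spec: ToolSpec): void {
    this.tools.set(name, spec);
  }

  // Parses a JSON tool call, validates it against the registry, and runs it.
  // Errors are returned as strings so they can be fed back to the model as observations.
  async dispatch(
    rawCall: string,
    confirm: (summary: string) => Promise<boolean>,
  ): Promise<string> {
    let parsed: unknown;
    try {
      parsed = JSON.parse(rawCall);
    } catch {
      return 'error: tool call is not valid JSON';
    }
    if (typeof parsed !== 'object' || parsed === null) {
      return 'error: tool call must be a JSON object';
    }
    const { tool, args } = parsed as { tool?: unknown; args?: unknown };
    if (typeof tool !== 'string') return 'error: missing tool name';

    // Only registered tools can run; anything else is rejected.
    const spec = this.tools.get(tool);
    if (!spec) return `error: unknown tool ${tool}`;

    const argObj =
      typeof args === 'object' && args !== null ? (args as Record<string, unknown>) : {};
    const missing = spec.required.filter((k) => !(k in argObj));
    if (missing.length > 0) return `error: missing arguments: ${missing.join(', ')}`;

    if (spec.needsConfirmation && !(await confirm(`${tool} ${JSON.stringify(argObj)}`))) {
      return 'cancelled';
    }
    return spec.handler(argObj);
  }
}

// Stub handlers for illustration only.
const registry = new ToolRegistry();
registry.register('read_file', {
  required: ['path'],
  needsConfirmation: false,
  handler: async (args) => `read ${String(args['path'])}`,
});
registry.register('run_command', {
  required: ['cmd'],
  needsConfirmation: true,
  handler: async (args) => `ran ${String(args['cmd'])}`,
});
```

In the extension, the confirm callback would presumably wrap the modal dialog used by run_command.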

LM Studio Integration (lifecycle)

Model selected → load (warm the GPU).
Idle for 5 min → unload (reclaim VRAM).
On each chat → automatic reload.

→ Prevents GPU contention with the user's other work (for example, games).

Second Brain (RAG)

Embed wiki pages and notes as vectors (using a local model).
Retrieve the top-K matches for each query.
Inject them into the LLM context.
Privacy: everything stays local.
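
The retrieve-then-inject steps can be illustrated without a vector DB. The sketch below ranks notes by cosine similarity and builds the prompt. The Note type, the function names, and the toy 3-dimensional embeddings are assumptions made for illustration; a real setup would get embeddings from a local model and store them in a local vector DB.

```typescript
interface Note {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors; returns 0 when either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the k notes most similar to the query embedding, best first.
function topK(queryEmbedding: number[], notes: Note[], k: number): Note[] {
  return notes
    .map((note) => ({ note, score: cosineSimilarity(queryEmbedding, note.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.note);
}

// Builds the prompt for the local LLM with the retrieved notes inlined as context.
function buildPrompt(question: string, retrieved: Note[]): string {
  const context = retrieved.map((n) => `[${n.id}] ${n.text}`).join('\n');
  return `Use the notes below to answer.\n\nNotes:\n${context}\n\nQuestion: ${question}`;
}

// Toy embeddings chosen by hand so the ranking is easy to follow.
const notes: Note[] = [
  { id: 'ollama', text: 'Ollama listens on port 11434.', embedding: [1, 0, 0] },
  { id: 'lmstudio', text: 'LM Studio listens on port 1234.', embedding: [0, 1, 0] },
  { id: 'chroma', text: 'Chroma stores the wiki vectors.', embedding: [0, 0, 1] },
];

const hits = topK([0.9, 0.1, 0], notes, 2);
console.log(hits.map((n) => n.id)); // [ 'ollama', 'lmstudio' ]
console.log(buildPrompt('Which port does Ollama use?', hits));
```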

💻 Code Patterns

Extension activation

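// src/extension.ts
import * as vscode from 'vscode';
// SidebarChatProvider is the project's own class (not shown here); it is assumed to
// implement vscode.WebviewViewProvider and to expose a show() method.

export function activate(context: vscode.ExtensionContext) {
  const provider = new SidebarChatProvider(context);

  // Register the sidebar view and the chat command. Pushing the disposables onto
  // context.subscriptions lets VS Code clean them up on deactivation.
  context.subscriptions.push(
    vscode.window.registerWebviewViewProvider('connectai.sidebar', provider),
    vscode.commands.registerCommand('connectai.chat', () => provider.show()),
  );
}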

LLM call (Ollama)

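// Streams a chat completion from a local Ollama server as an async generator of text chunks.
async function* chat(prompt: string, model: string): AsyncGenerator<string> {
  const r = await fetch('http://localhost:11434/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model,
      messages: [{ role: 'user', content: prompt }],
      stream: true,
    }),
  });
  if (!r.ok || !r.body) throw new Error(`Ollama request failed: ${r.status}`);

  const reader = r.body.getReader();
  // One decoder with { stream: true } keeps multi-byte characters intact across chunks.
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // The response is newline-delimited JSON: parse each complete line.
    let idx: number;
    while ((idx = buffer.indexOf('\n')) >= 0) {
      const line = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 1);
      if (!line.trim()) continue;

      const chunk = JSON.parse(line);
      if (chunk.message?.content) yield chunk.message.content;
    }
  }
}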

LM Studio Integration (lifecycle manager)

import * as vscode from 'vscode';
import { LMStudioClient } from '@lmstudio/sdk';

class ModelLifecycleManager {
  // The LM Studio SDK connects over WebSocket, so the base URL uses the ws:// scheme.
  private client = new LMStudioClient({ baseUrl: 'ws://localhost:1234' });
  private currentModel?: string;
  private idleTimer?: NodeJS.Timeout;

  async onModelSelected(modelKey: string) {
    if (this.idleTimer) clearTimeout(this.idleTimer);

    if (this.currentModel !== modelKey) {
      // Unload the previous model before loading the new one to free VRAM.
      if (this.currentModel) await this.client.llm.unload(this.currentModel);
      await this.client.llm.load(modelKey);
      this.currentModel = modelKey;
    }

    // Always re-arm the timer, including when the same model is re-selected.
    this.scheduleIdleUnload();
  }

  // Call on every chat request to push the idle deadline back.
  onActivity() {
    if (this.idleTimer) {
      clearTimeout(this.idleTimer);
      this.scheduleIdleUnload();
    }
  }

  private scheduleIdleUnload() {
    // Key matches "connectai.lmStudio.idleTimeoutMs" in settings.json; a value <= 0 disables idle unload.
    const timeout = vscode.workspace.getConfiguration('connectai').get<number>('lmStudio.idleTimeoutMs', 300_000);
    if (timeout <= 0) return;

    this.idleTimer = setTimeout(async () => {
      this.idleTimer = undefined;
      if (this.currentModel) {
        await this.client.llm.unload(this.currentModel);
        this.currentModel = undefined;
      }
    }, timeout);
  }
}

Tool execution (file edit)

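async function editFile(path: string, oldText: string, newText: string) {
  const uri = vscode.Uri.file(path);
  const doc = await vscode.workspace.openTextDocument(uri);

  const text = doc.getText();
  const idx = text.indexOf(oldText);
  if (idx === -1) throw new Error('oldText not found');
  // Refuse ambiguous edits: oldText must match exactly one location.
  if (text.indexOf(oldText, idx + 1) !== -1) throw new Error('oldText matches more than once');

  const edit = new vscode.WorkspaceEdit();
  const start = doc.positionAt(idx);
  const end = doc.positionAt(idx + oldText.length);
  edit.replace(uri, new vscode.Range(start, end), newText);

  // applyEdit resolves to false when the edit could not be applied.
  const applied = await vscode.workspace.applyEdit(edit);
  if (!applied) throw new Error(`Failed to apply edit to ${path}`);
}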

Run command (with user confirmation)

async function runCommand(cmd: string): Promise<string> {
  // Always ask the user first. A dismissed dialog resolves to undefined and also cancels.
  const ok = await vscode.window.showWarningMessage(
    `Run command: ${cmd}?`,
    { modal: true },
    'Yes', 'No'
  );

  if (ok !== 'Yes') return 'cancelled';

  const term = vscode.window.createTerminal('ConnectAI');
  term.show();
  term.sendText(cmd);
  // waitForOutput is a separate helper (not shown) that waits for and captures the output.
  return await waitForOutput(term);
}

Second Brain (RAG)

import { ChromaClient } from 'chromadb';
const chroma = new ChromaClient({ path: 'http://localhost:8000' });

async function queryBrain(question: string): Promise<string[]> {
  const collection = await chroma.getCollection({ name: 'wiki' });
  // embedLocal is a separate helper (not shown) that calls a local Ollama embedding model.
  const emb = await embedLocal(question);

  const results = await collection.query({
    queryEmbeddings: [emb],
    nResults: 5,
  });

  // documents holds one array per query embedding; entries can be null, so drop those.
  return (results.documents[0] ?? []).filter((d): d is string => d !== null);
}

Configuration

// .vscode/settings.json
{
  "connectai.engine": "lmstudio",  // "ollama" | "lmstudio"
  "connectai.ollamaUrl": "http://localhost:11434",
  "connectai.lmStudioUrl": "http://localhost:1234",
  "connectai.defaultModel": "llama-3.1-8b-instruct",
  "connectai.lmStudio.idleTimeoutMs": 300000,
  "connectai.lmStudio.autoLoadOnSelect": true
}

🤔 Decision Criteria

Situation	ConnectAI	Cursor / Claude Code
Sensitive code (medical, finance, government)	✅	❌
Quality first (frontier model)	❌	✅
Offline work	✅	❌
Lower monthly cost	✅	❌ ($20+/month)
Fast setup	❌ (model download)	✅
Multi-file refactor	Limited by small models	✅
Air-gapped	✅	❌

Default: if privacy or offline use is a hard requirement → ConnectAI. If productivity or quality comes first → Cursor / Claude Code.

⚠️ Contradictions & Updates

Quality gap: a local 70B model is weaker than cloud Opus. Do a reality check for each task.
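Hardware cost: an M3 Max with 96 GB costs $4000+. Compare the ROI against monthly cloud spend: a 1-2 year breakeven requires roughly $170-330 per month of cloud cost, while against a single $20-50 subscription the breakeven is closer to 7-17 years.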
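Architecture: currently monolithic (a heavy extension.ts); a modular structure is recommended. Splitting out an lmstudio module is the best practice.
run_command security: user confirmation for every command is critical. Automatic execution puts the system at risk.
Model lifecycle: the old approach loaded the model on every chat (slow). The current approach keeps it persistent and ejects it when idle (LM Studio integration).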
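
The hardware-versus-subscription comparison above comes down to one division. The sketch below is a deliberately simple model: breakevenMonths is a hypothetical helper, the $250 figure is an illustrative assumption, and electricity, resale value, and shared use of the machine are ignored.

```typescript
// Months until a one-time hardware purchase equals cumulative cloud spend.
function breakevenMonths(hardwareCostUsd: number, monthlyCloudUsd: number): number {
  if (monthlyCloudUsd <= 0) return Infinity;
  return hardwareCostUsd / monthlyCloudUsd;
}

console.log(breakevenMonths(4000, 50));  // 80  (one $50 / month seat)
console.log(breakevenMonths(4000, 20));  // 200 (one $20 / month seat)
console.log(breakevenMonths(4000, 250)); // 16  (hypothetical $250 / month of cloud spend)
```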

🔗 Knowledge Links (Graph)

Related tools: Ollama · LM-Studio · vLLM · llama.cpp
VS Code: VS-Code-Extension-API · Webview-Provider · Tree-Sitter-Integration
Cloud alternative: Cursor-Workflow-Patterns · Claude-Code · GitHub-Copilot
Local LLM: Local-LLM-Inference · Quantization-GGUF · Model-Selection-Hardware
RAG: Vector-DB-Local · ChromaDB · LanceDB · Embedding-Strategy-Deep
Lifecycle: Model-Loading-Memory-Management · GPU-Memory-Pressure
Applied in: Antigravity-Project · Connect-AI-Lab · EZERAI-Infrastructure

🤖 LLM Usage Hints (How to Use This Knowledge)

When to use this knowledge:

Designing an internal AI tool for an enterprise.
AI assistance for privacy-sensitive code.
New features or refactors in ConnectAI.
Hardware sizing for local LLMs.
LM Studio / Ollama integration.
Second Brain / RAG architecture.

When not to use it:

Every developer is fine with the cloud and the cost is acceptable: Cursor is the better choice.
Mass refactors of a very large codebase: a large model (Opus / GPT-4) gives better quality.
Quick prototypes: the setup overhead is not worth it.
The user's hardware is insufficient: every model will be slow.

❌ Anti-Patterns

Auto-executing run_command: any LLM hallucination risks an rm -rf.
Monolithic extension.ts: maintainability drops with every added feature. Split it into modules.
No idle eject: VRAM stays occupied permanently → GPU contention with other work.
Cloud model fallback enabled by default: defeats the privacy value.
Cloud embeddings (OpenAI): a privacy violation.
No tool whitelisting: unrestricted shell and file access leads to accidents.
Expecting cloud-level quality: every user should be aware of the gap.
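
A command allowlist is one way to address the whitelisting anti-pattern. The sketch below is a hypothetical pre-filter that would run before the confirmation dialog, not a replacement for it and not a sandbox. The allowed program list, the metacharacter set, and the checkCommand name are all assumptions to be adapted per project.

```typescript
// Hypothetical allowlist: the first word of the command must be one of these programs.
const ALLOWED_PROGRAMS = new Set(['git', 'npm', 'node', 'ls', 'cat', 'rg']);

// Shell metacharacters that can chain, substitute, or redirect commands.
const SHELL_METACHARACTERS = /[;&|`$<>\n]/;

type Verdict = { allowed: true } | { allowed: false; reason: string };

function checkCommand(cmd: string): Verdict {
  const trimmed = cmd.trim();
  if (trimmed.length === 0) return { allowed: false, reason: 'empty command' };

  // Reject chaining and redirection outright so one approved program cannot smuggle in another.
  if (SHELL_METACHARACTERS.test(trimmed)) {
    return { allowed: false, reason: 'shell metacharacters are not allowed' };
  }

  const program = trimmed.split(/\s+/)[0] ?? '';
  if (!ALLOWED_PROGRAMS.has(program)) {
    return { allowed: false, reason: `program not on the allowlist: ${program}` };
  }
  return { allowed: true };
}

console.log(checkCommand('git status'));   // allowed
console.log(checkCommand('rm -rf /'));     // rejected: rm is not on the allowlist
console.log(checkCommand('ls; rm -rf /')); // rejected: contains ';'
```

Checking only the program name is coarse: an allowed program can still do damage through its arguments, which is why the user confirmation step stays in place.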

🧪 Validation Status

Information status: verified (applied in active development on the Antigravity project).
Source trust level: A (the project's primary tool).
Review reason: manual cleanup. The implementation details of the architecture are synced with the ConnectAI repo.

🧬 Duplicate Check

Existing similar documents: Local-LLM-Inference (concept), VS-Code-Extension-Patterns (technical), AI-Code-Agent-Patterns (general).
Handling: KEEP (documentation for a specific tool).
Reason: ConnectAI's own design and architecture are distinct.

🕓 Changelog

Date	Change	Handling	Trust level
2026-05-08	P-Reinforce Phase 1 normalization	UPDATE	A
2026-05-09	Manual cleanup: added code patterns, lifecycle integration, decision criteria, and anti-patterns	UPDATE	A
