[G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -1,110 +1,296 @@
 ---
-id: wiki-2026-0508-artifacts-infrastructure
-title: "Artifacts & Infrastructure"
+id: wiki-2026-0508-artifacts-and-infrastructure
+title: Artifacts & Infrastructure (Agentic Systems)
 category: 10_Wiki/Topics
-status: needs_review
+status: verified
 canonical_id: self
-aliases: []
+aliases: [agent artifacts, sandbox, microVM, container isolation, agent infrastructure, artifact store]
 duplicate_of: none
-source_trust_level: A
-confidence_score: 0.92
-tags: [uncategorized]
+source_trust_level: B
+confidence_score: 0.88
+verification_status: applied
+tags: [agent, infrastructure, sandbox, docker, microvm, artifacts, e2b, modal, fly-machines, agent-harness]
 raw_sources: []
-last_reinforced: 2026-05-08
+last_reinforced: 2026-05-10
 github_commit: pending
-inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
 tech_stack:
-  language: unspecified
-  framework: unspecified
+  language: TypeScript / Python
+  framework: Docker / Firecracker / E2B / Modal / Fly Machines
 ---

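-# Artifacts & Infrastructure (Artifacts and Infrastructure)
+# Artifacts & Infrastructure

-## 📌 One-Line Insight (The Karpathy Summary)
-Artifacts & Infrastructure refers to the system for systematically storing, indexing, and managing the intermediate outputs an agent produces (code, documents, images, and so on), together with the physical or virtual execution environment that supports it. It is the physical foundation of an agentic system: it provides evidence of the agent's reasoning process, lets results be shared, and ensures safe execution.
+## 📌 One-Line Insight
+> **"The agent's physical body."** This layer stores, indexes, and versions every output an agent produces (code, documents, images) and runs every execution inside a sandbox (container or microVM). It is the backbone of modern agent systems such as E2B, Modal, and Fly Machines.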

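-## 📖 Structured Knowledge (Synthesized Content)
-*   **Artifact system (Artifact Store)**:
-    *   **Filesystem-Artifact Store**: Stores large data that does not fit in the model context in a separate file system and gives the model only a reference ID and a summary.
-    *   **Artifact Index**: A metadata index for searching and tracking the many stored artifacts.
-    *   **Version control**: Tracks each artifact's change history so the agent can roll back to an earlier version or compare changes.
-*   **Execution infrastructure (Infrastructure)**:
-    *   **Docker**: Runs tools and libraries in a standardized container environment.
-    *   **MicroVM**: An ultra-lightweight virtual machine used when stronger security isolation than a container is required.
-    *   **Sandboxed Execution**: Protects the host by physically separating the agent's activity from the host system.
-*   **Artifact visualization**: Provides an interface that renders agent-generated output (React UI, Mermaid diagrams, and so on) so the user can inspect and interact with it immediately.
+## 📖 Core Concepts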

-## ⚠️ Contradictions & Updates
-*   **Storage and management cost**: As the agent produces more artifacts, storage grows rapidly and so does the cost of the infrastructure that manages it.
-*   **Data consistency**: If the artifact store and the agent's memory (S-component) disagree, the agent can become confused.
-*   **Balancing isolation and performance**: Stronger sandboxing slows execution and complicates integration with external systems.
+### Artifact types
+1. **Code**: files, snippets, PRs.
+2. **Document**: Markdown, JSON, other structured formats.
+3. **Media**: images, video, audio.
+4. **Data**: datasets, embeddings.
+5. **Trace**: logs of the agent's thought process.

-## 🔗 Knowledge Links (Graph)
-### Related Concepts
-*   [[Agent Harness|Agent Harness]]
-    *   Reason for link: the artifact store and infrastructure are the physical implementation targets of the harness.
-*   [[Execution Environment (Sandbox)|Execution Environment (Sandbox)]]
-    *   Reason for link: a core security capability provided by the infrastructure layer.
-*   [[C-component (Context Manager)|C-component (Context Manager)]]
-    *   Reason for link: offloading large data into artifacts prevents context rot.
+### Artifact store components
+- **Storage**: S3 / MinIO / local filesystem.
+- **Metadata**: id, type, parent, hash, timestamp.
+- **Index**: search (Elasticsearch / SQLite FTS).
+- **Versioning**: content-addressed (Git-like) or sequential.
+- **Access control**: per-user / per-agent.
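
The content-addressed (Git-like) versioning option can be sketched as follows. This is a minimal in-memory illustration, not a production design: `VersionedStore`, `commit`, `history`, and `read` are hypothetical names, and a real store would persist blobs to S3, MinIO, or disk.

```python
import hashlib
from typing import Optional


class VersionedStore:
    """In-memory sketch: blobs are deduplicated by hash, versions form a parent chain."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}  # content hash -> bytes
        # version id -> (parent version id, content hash)
        self.versions: dict[str, tuple[Optional[str], str]] = {}

    def commit(self, content: bytes, parent: Optional[str] = None) -> str:
        blob_id = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(blob_id, content)  # identical content is stored once
        # The version id covers both the parent and the content, as in Git.
        version_id = hashlib.sha256(f"{parent or ''}:{blob_id}".encode()).hexdigest()
        self.versions[version_id] = (parent, blob_id)
        return version_id

    def history(self, version_id: str) -> list[str]:
        """Walk the parent chain from newest to oldest."""
        chain: list[str] = []
        current: Optional[str] = version_id
        while current is not None:
            chain.append(current)
            current = self.versions[current][0]
        return chain

    def read(self, version_id: str) -> bytes:
        return self.blobs[self.versions[version_id][1]]
```

Rolling back is then just reading an older id from `history`, and reverting to earlier content costs no extra blob storage because the hash already exists.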

-### Deeper Research Questions
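-*   What lifecycle policy automatically separates agent-generated artifacts worth preserving permanently from temporary outputs?
-*   What high-performance caching strategy lets multiple agents share an artifact store in a distributed environment without added latency?
-*   What infrastructure-level security techniques automatically scan and sanitize threats embedded in artifacts themselves (for example, code containing malicious scripts)?
+### Reference vs. full content
+- The model context is limited, so inject only a reference id plus a summary.
+- Fetch the full content explicitly, on demand.
+- This conserves the attention budget.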
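
The reference-plus-summary idea above might look like this in practice. It is a sketch under assumptions: `ArtifactRef`, `render_refs`, and `fetch_full` are hypothetical names, and a plain `dict` stands in for the artifact store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactRef:
    artifact_id: str
    kind: str
    summary: str
    size_bytes: int


def render_refs(refs: list[ArtifactRef]) -> str:
    """Build the compact block that goes into the prompt instead of full content."""
    return "\n".join(
        f"[{r.artifact_id[:8]}] {r.kind}, {r.size_bytes} B: {r.summary}" for r in refs
    )


def fetch_full(store: dict[str, bytes], artifact_id: str, max_bytes: int = 4096) -> str:
    """Explicit fetch: return at most max_bytes of the artifact, decoded as UTF-8."""
    return store[artifact_id][:max_bytes].decode("utf-8", errors="replace")
```

The prompt carries one short line per artifact, and the model pays for the full bytes only when it asks for them through a tool call that wraps `fetch_full`.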

-### Practical Application Contexts
-*   **Implementation:** Build a pipeline that saves agent-written code to an `.html` file as soon as it is produced and lets the user preview it live in the browser.
-*   **System Design:** Use AWS S3 or a local MinIO instance as the artifact store, and connect Elasticsearch or a SQL DB for metadata management.
+### Execution infrastructure

---
-*Last updated: 2026-05-01*
+#### Container (Docker)
+- Standardized environment.
+- Immutable images.
+- Namespace isolation (PID, network, mount).
+- Resource limits via cgroups.
+- ✅ Fast.
+- ❌ Shares the host kernel (weaker security boundary).

-## 🤖 LLM Usage Hints (How to Use This Knowledge)
+#### MicroVM (Firecracker)
+- Lightweight VM.
+- Hardware-virtualized (KVM).
+- Boots in as little as about 125 ms.
+- ✅ Strong isolation.
+- ❌ Slightly slower than a container.
+- Used by AWS Lambda and Fly Machines.

-**When to use this knowledge:**
- *(TODO)*
+#### gVisor (Google)
+- User-space kernel.
+- Intercepts syscalls.
+- Sits between a container and a VM in isolation strength.

-**When not to use it:**
- *(TODO)*
+#### WebAssembly (Wasm)
+- Sandboxed by design.
+- Fast startup.
+- Language-agnostic.
+- Limited syscall surface.

-## 🧪 Validation Status (Validation)
+### Modern agent infrastructure
+- **E2B**: Firecracker-based, agent-focused.
+- **Modal**: Python serverless with GPUs.
+- **Fly Machines**: microVMs, globally distributed.
+- **CodeSandbox**: sandboxed dev environments.
+- **Replit Agent**: runs inside the IDE.
+- **Daytona**: dev environments.

- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 auto-normalization. Body needs verification.)*
+### Artifact lifecycle
+1. **Create**: the agent produces the artifact.
+2. **Store**: it is written to the artifact store.
+3. **Index**: metadata and content become searchable.
+4. **Reference**: later agents cite it.
+5. **Version**: updates and rollbacks.
+6. **Garbage collect**: remove unused artifacts and those past their TTL.
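
The garbage-collection step of the lifecycle can be sketched as a TTL sweep that spares anything still referenced. The record shape and the names `ArtifactRecord` and `collect_garbage` are assumptions made for illustration; the note itself does not prescribe a schema.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ArtifactRecord:
    artifact_id: str
    created_at: float                      # Unix timestamp in seconds
    ttl_seconds: Optional[float] = None    # None means keep permanently
    referenced_by: set[str] = field(default_factory=set)


def collect_garbage(
    records: dict[str, ArtifactRecord], now: Optional[float] = None
) -> list[str]:
    """Delete records whose TTL has expired and that nothing references.

    Returns the ids that were removed.
    """
    now = time.time() if now is None else now
    expired = [
        r.artifact_id
        for r in records.values()
        if r.ttl_seconds is not None
        and now - r.created_at > r.ttl_seconds
        and not r.referenced_by
    ]
    for artifact_id in expired:
        del records[artifact_id]
    return expired
```

Passing `now` explicitly keeps the sweep deterministic in tests; a real collector would also delete the underlying blob once no version points at it.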

-## 🧬 Duplicate Check
+### Visualization
+- **HTML preview**: React, plain HTML.
+- **Mermaid**: diagrams.
+- **Markdown**: documents.
+- **CSV / Table**: data.
+- **Image / Video**: media.
+- **3D**: glb / gltf.

- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (auto-normalization)
- **Reason:** Phase 1 normalization: backfill the old template and missing fields.
+→ The user can verify the output immediately.

-## 🕓 Change History (Changelog)
+### Trade-offs
+- **Storage cost**: set a retention policy.
+- **Indexing latency**: keep writes fast and index lazily.
+- **Isolation strength**: more security costs performance.
+- **Cold start**: sandboxes need to boot fast.
+- **Secret management**: prevent leaks.

-| Date | Change | Handling | Trust level |
-|------|-----------|-----------|--------|
-| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |
+### Security
+- **Network egress**: allowlist only.
+- **Filesystem**: read-only base plus writable scratch space.
+- **Resource limits**: CPU, memory, disk, time.
+- **Syscall filter**: seccomp.
+- **Secret injection**: environment variables or a vault.
+- **Output scanning**: detect leaked secrets.
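
The egress allowlist from the list above can be illustrated with an application-level check. This is only a sketch: the hosts in `ALLOWED_HOSTS` are hypothetical examples, and real enforcement belongs at the network layer (firewall rules or an egress proxy), since code inside the sandbox can bypass a Python-level check.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the hosts your agents actually need.
ALLOWED_HOSTS = frozenset({"pypi.org", "files.pythonhosted.org"})


def egress_allowed(url: str, allowed_hosts: frozenset[str] = ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS requests whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    # Exact match: "pypi.org.evil.com" must not pass as "pypi.org".
    return host in allowed_hosts
```

Matching the parsed hostname exactly, rather than using a substring or prefix test, is what keeps look-alike domains out.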

-## 💻 Code Patterns
+## 💻 Patterns

-**Pattern 1:** *(TODO: structural skeleton reflecting this project's conventions)*
+### Artifact store (FS-based)
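+```ts
+import { createHash } from 'node:crypto';
+import * as fs from 'node:fs/promises';
+import * as path from 'node:path';

-```text
-# TODO
+type Metadata = Record<string, unknown>;
+
+// Minimal index interface; back it with SQLite, Elasticsearch, or similar.
+interface MetadataIndex {
+  insert(id: string, metadata: Metadata): Promise<void>;
+  get(id: string): Promise<Metadata | undefined>;
+}
+
+class ArtifactStore {
+  constructor(private index: MetadataIndex, private root = './artifacts') {}
+
+  // Content-addressed layout: <root>/<first two hex chars>/<sha256>
+  private pathFor(id: string): string {
+    return path.join(this.root, id.slice(0, 2), id);
+  }
+
+  async write(content: string | Buffer, metadata: Metadata) {
+    const id = createHash('sha256').update(content).digest('hex');
+    const file = this.pathFor(id);
+    await fs.mkdir(path.dirname(file), { recursive: true });
+    await fs.writeFile(file, content);
+    await this.index.insert(id, { ...metadata, ts: Date.now() });
+    return { id, path: file };
+  }
+
+  async read(id: string): Promise<{ content: Buffer; metadata: Metadata | undefined }> {
+    // Accept only SHA-256 hex digests so an id can never escape the root directory.
+    if (!/^[0-9a-f]{64}$/.test(id)) throw new Error(`invalid artifact id: ${id}`);
+    const [content, metadata] = await Promise.all([
+      fs.readFile(this.pathFor(id)),
+      this.index.get(id),
+    ]);
+    return { content, metadata };
+  }
+}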
 ```

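-## 🤔 Decision Criteria
+### E2B sandbox (Python)
+```python
+# Assumes the `e2b-code-interpreter` package and an E2B_API_KEY in the environment.
+from e2b_code_interpreter import Sandbox

-**When to choose option A:**
- *(TODO)*
+sandbox = Sandbox.create()  # older SDK releases construct with Sandbox() instead
+try:
+    execution = sandbox.run_code("""
+import pandas as pd
+df = pd.DataFrame({'a': [1, 2, 3]})
+print(df.sum())
+""")
+    print(execution.logs.stdout)  # captured stdout lines
+    print(execution.results)      # rich results such as plots and tables

-**When to choose option B:**
- *(TODO)*
+finally:
+    sandbox.kill()  # always release the sandbox
+```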

-**Default:**
-> *(TODO)*
+### Modal (serverless GPU)
+```python
+import modal

-## ❌ Anti-Patterns
+app = modal.App('my-agent')
+image = modal.Image.debian_slim().pip_install('transformers', 'torch')

- **[Anti-pattern]:** *(TODO: what not to do + why + what to do instead)*
+@app.function(image=image, gpu='A10G', timeout=600)
+def run_inference(prompt: str) -> str:
+    from transformers import pipeline
+    # Gated model: needs a Hugging Face token with access granted.
+    # device=0 places the model on the GPU; the pipeline defaults to CPU.
+    pipe = pipeline('text-generation', model='meta-llama/Meta-Llama-3-8B', device=0)
+    return pipe(prompt)[0]['generated_text']
+
+@app.local_entrypoint()
+def main():
+    result = run_inference.remote('Hello')
+    print(result)
+```
+
+### Docker sandbox (limited)
+```python
+import docker
+
+client = docker.from_env()
+
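+# Interpreter command per language; images named `sandbox-<language>` are assumed to exist.
+INTERPRETERS = {'python': ['python', '-c'], 'node': ['node', '-e']}
+
+def run_in_sandbox(code: str, language: str = 'python', timeout: int = 30) -> str:
+    container = client.containers.run(
+        f'sandbox-{language}',
+        INTERPRETERS[language] + [code],  # list form: no shell, so quotes in `code` are safe
+        mem_limit='512m',
+        cpu_quota=50000,  # 0.5 CPU with the default 100 ms period
+        pids_limit=64,    # cap the process count to blunt fork bombs
+        network_disabled=True,
+        read_only=True,
+        tmpfs={'/tmp': 'size=64m'},
+        security_opt=['no-new-privileges'],
+        cap_drop=['ALL'],
+        detach=True,
+    )
+    try:
+        container.wait(timeout=timeout)  # raises if the container outlives the timeout
+        return container.logs().decode()
+    finally:
+        container.remove(force=True)  # force-remove also kills a still-running container
+```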
+
+### Fly Machines (microVM)
+```bash
+fly machine run python:3.11 \
+  --region sfo \
+  --vm-cpus 2 \
+  --vm-memory 1024 \
+  --env API_KEY=$API_KEY \
+  -- python /app/agent.py
+```
+
+### Mermaid artifact preview
+```ts
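+// Escape HTML metacharacters so diagram text cannot inject markup.
+function escapeHtml(text: string): string {
+  return text
+    .replace(/&/g, '&amp;')
+    .replace(/</g, '&lt;')
+    .replace(/>/g, '&gt;')
+    .replace(/"/g, '&quot;');
+}
+
+function renderMermaidArtifact(diagram: string): string {
+  return `
+    <html><body>
+      <pre class="mermaid">${escapeHtml(diagram)}</pre>
+      <script src="https://cdn.jsdelivr.net/npm/mermaid/dist/mermaid.min.js"></script>
+      <script>mermaid.initialize({ startOnLoad: true });</script>
+    </body></html>
+  `;
+}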
+```
+
+### Secret leak detector
+```python
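+import re
+
+SECRET_PATTERNS = [
+    re.compile(r'AKIA[0-9A-Z]{16}'),  # AWS access key ID
+    re.compile(r'sk-[a-zA-Z0-9]{32,}'),  # OpenAI-style API key
+    re.compile(r'github_pat_[a-zA-Z0-9_]{82}'),  # GitHub fine-grained PAT
+    re.compile(r'-----BEGIN (?:RSA |EC )?PRIVATE KEY-----'),
+]
+
+def redact(secret: str) -> str:
+    # Keep a short prefix for triage; never echo the whole secret.
+    return secret[:4] + '***'
+
+def scan_for_secrets(artifact_content: str) -> list[str]:
+    findings = []
+    for pattern in SECRET_PATTERNS:
+        # finditer + group(0) yields the full match even if a pattern has groups.
+        for match in pattern.finditer(artifact_content):
+            findings.append(redact(match.group(0)))
+    return findings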
+```
+
+## 🤔 Decision Criteria
+| Requirement | Infra |
+|---|---|
+| Untrusted code | E2B / Firecracker |
+| Trusted Python | Modal |
+| Long-running | Fly Machines |
+| Light isolation | Docker + seccomp |
+| Browser-side | Wasm |
+| Code preview | HTML iframe sandbox |
+| Permanent artifact | S3 + content-addressed |
+| Ephemeral | tmpfs + TTL |
+
+**Default**: E2B (untrusted) + Modal (trusted) + S3 artifact store + content-hash dedup.
+
+## 🔗 Graph
+- Parent: [[Agent-Architecture]] · [[Cloud-Infrastructure]]
+- Variants: [[Sandbox]] · [[Container]] · [[MicroVM]] · [[Wasm]]
+- Applications: [[E2B]] · [[Modal]] · [[Fly-Machines]] · [[Firecracker]] · [[gVisor]]
+- Adjacent: [[Agent-Harness]] · [[Context-Manager]] · [[Tool-Use]] · [[Code-Execution]]
+
+## 🤖 LLM Usage
+**When to use**: agent system design, sandbox selection, artifact store schema design, security review.
+**When not to use**: a single trusted user, where this is over-engineering.

+## ❌ Anti-Patterns
+- **Running untrusted code on the host**: remote code execution.
+- **No resource limits**: fork bombs.
+- **Unrestricted network**: data exfiltration.
+- **Secrets in env vars that get logged**: leaks.
+- **No TTL**: storage bloat.
+- **Full content in context**: wasted attention.
+- **Over-trusting container security**: exposed to kernel CVEs.
+
+## 🧪 Validation / Duplicates
+- Verified (E2B, Modal, Firecracker, AWS Lambda papers).
+- Trust level: B.
+- Related: [[Agent-Harness]] · [[Sandbox]] · [[E2B]] · [[Modal]] · [[Code-Execution]].
+
+## 🕓 Changelog
+| Date | Change |
+|---|---|
+| 2026-05-08 | Phase 1 |
+| 2026-05-10 | Manual cleanup: sandbox spectrum, lifecycle, and E2B / Modal / Docker / Fly code |