| id | title | category | status | source_trust_level | verification_status | created_at | updated_at | tags | tech_stack | applied_in | aliases |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ai-agent-sandbox-e2b | Agent Sandbox — E2B / Daytona / code execution | Coding | draft | B | conceptual | 2026-05-09 | 2026-05-09 | | | | |
Agent Sandbox
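Letting an LLM execute arbitrary code is dangerous. Sandboxes such as E2B, Daytona, and Modal give that code an isolated environment. They are a core building block of AI agents (Devin, Cursor).
📖 Core concepts
- VM / container isolation.
- Languages and package installs are allowed.
- File system access.
- Time / resource limits.
💻 Code patterns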
E2B (sandbox-as-a-service)
import { Sandbox } from '@e2b/code-interpreter';
const sandbox = await Sandbox.create();
const result = await sandbox.runCode(`
import pandas as pd
df = pd.read_csv('data.csv')
print(df.head())
`);
console.log(result.text);
console.log(result.error);
console.log(result.results); // [{type: 'text', text: '...'}]
await sandbox.kill(); // current SDK releases use kill(); older ones exposed close()
→ Python / Node code runs in a cloud sandbox.
File upload
await sandbox.files.write('data.csv', csvContent);
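// Or upload a local file: read it on the host, then write it into the sandbox
// (needs: import { readFile } from 'node:fs/promises')
await sandbox.files.write('/sandbox/data.csv', await readFile('./local.csv', 'utf8'));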
// Read
const content = await sandbox.files.read('/sandbox/output.txt');
Package install
await sandbox.commands.run('pip install numpy scikit-learn'); // shell command, not Python code
const r = await sandbox.runCode(`
from sklearn.cluster import KMeans
# ...
`);
Long-running task
const sandbox = await Sandbox.create({
timeoutMs: 30 * 60 * 1000, // 30 min
});
await sandbox.runCode(longWork);
Stream output
const result = await sandbox.runCode('import time\nfor i in range(100): print(i); time.sleep(0.1)', {
onStdout: (data) => console.log(data),
onStderr: (data) => console.error(data),
});
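Streamed output can grow without bound, so it is usually capped before being handed back to the LLM. A minimal sketch of such a cap follows. `createOutputCollector` is a hypothetical helper, and it assumes each chunk arrives as a plain string (a real SDK callback may pass a message object instead).

```typescript
// Hypothetical helper (not part of any SDK): keeps at most `maxChars` of streamed output.
interface OutputCollector {
  onChunk: (chunk: string) => void;
  text: () => string;
  truncated: () => boolean;
}

function createOutputCollector(maxChars: number): OutputCollector {
  let buffer = '';
  let dropped = false;
  return {
    onChunk(chunk: string): void {
      const room = maxChars - buffer.length;
      if (chunk.length <= room) {
        buffer += chunk;
      } else {
        // Keep only what still fits and remember that output was cut.
        buffer += chunk.slice(0, Math.max(room, 0));
        dropped = true;
      }
    },
    text: () => buffer,
    truncated: () => dropped,
  };
}

// Usage: feed chunks as they arrive, then read the capped text.
const collector = createOutputCollector(10);
['0\n', '1\n', '2\n', '3\n', '4\n', '5\n'].forEach((c) => collector.onChunk(c));
console.log(JSON.stringify(collector.text()), collector.truncated());
```

The `truncated()` flag lets the caller tell the LLM that output was cut rather than silently dropping it.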
Daytona (alternative)
import { Daytona } from '@daytonaio/sdk';
const daytona = new Daytona({ apiKey: '...' });
const workspace = await daytona.create({
image: 'python:3.12',
});
await workspace.process.codeRun('print("hello")');
Modal (Python)
import modal
app = modal.App('agent-sandbox')  # modal.Stub was renamed to modal.App
@app.function()
def run_code(code: str):
    import subprocess
    return subprocess.run(['python', '-c', code], capture_output=True)
with app.run():
    r = run_code.remote('print("hi")')
→ Python-native; GPUs available.
Docker (self-host)
import Docker from 'dockerode';
const docker = new Docker();
const container = await docker.createContainer({
  Image: 'python:3.12',
  Cmd: ['python', '-c', code],
  HostConfig: {
    Memory: 512 * 1024 * 1024, // 512 MiB
    CpuQuota: 50000, // 50% of one core at the default 100000 µs period
    NetworkMode: 'none', // isolation: no network
  },
});
await container.start();
await container.wait(); // wait for exit so the logs are complete
const log = await container.logs({ stdout: true, stderr: true });
await container.remove();
→ Self-hosted; setup is complex.
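The limits in the Docker example are easier to reason about when derived from human-readable inputs. The sketch below is one way to do that. `buildHostConfig` and `SandboxLimits` are hypothetical names, and it assumes Docker's default CFS period of 100000 µs, so that `CpuQuota` is simply a fraction of that period.

```typescript
// Docker's default CFS period when CpuPeriod is not set (microseconds).
const DEFAULT_CPU_PERIOD_US = 100000;

// Hypothetical input shape for this sketch.
interface SandboxLimits {
  memoryMb: number; // hard memory cap in MiB
  cpuFraction: number; // share of one CPU core, e.g. 0.5 = 50%
  allowNetwork: boolean;
}

interface HostConfig {
  Memory: number;
  CpuPeriod: number;
  CpuQuota: number;
  NetworkMode: string;
}

function buildHostConfig(limits: SandboxLimits): HostConfig {
  if (limits.memoryMb <= 0 || limits.cpuFraction <= 0) {
    throw new RangeError('limits must be positive');
  }
  return {
    Memory: limits.memoryMb * 1024 * 1024, // bytes
    CpuPeriod: DEFAULT_CPU_PERIOD_US,
    CpuQuota: Math.round(limits.cpuFraction * DEFAULT_CPU_PERIOD_US),
    NetworkMode: limits.allowNetwork ? 'bridge' : 'none',
  };
}

const hostConfig = buildHostConfig({ memoryMb: 512, cpuFraction: 0.5, allowNetwork: false });
console.log(hostConfig);
```

The resulting object has the same field names as the `HostConfig` passed to `createContainer` above.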
Firecracker (microVM)
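Used by AWS Lambda.
- Boots faster than a conventional VM (about 125 ms).
- Each execution environment gets its own fresh microVM.
- Strong isolation.
→ E2B / Daytona operate their own isolation layer.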
Use case
- Code interpreter (ChatGPT)
- Data analysis (pandas, plot)
- Research agent (web scrape + process)
- Devin / Cursor (development environment)
- Math / exact computation (where LLMs are weak)
- Image / video (FFmpeg, Sharp)
- Custom training (small model)
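LLM tool integration
// The LLM calls the sandbox as a tool
const tools = [{
  name: 'run_python',
  description: 'Run Python code in sandbox',
  input_schema: { type: 'object', properties: { code: { type: 'string' } }, required: ['code'] },
}];
async function execute(toolUse) {
  if (toolUse.name === 'run_python') {
    const r = await sandbox.runCode(toolUse.input.code);
    // Surface execution errors to the LLM instead of returning undefined
    return r.error ? `${r.error.name}: ${r.error.value}` : r.text;
  }
}
→ The LLM asks to "compute X" → the sandbox runs it → the result goes back to the LLM.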
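The tool loop above can be made a little more defensive. The sketch below validates the tool input and reports failures as an error result. `CodeRunner`, `ToolResult`, and `fakeRunner` are hypothetical stand-ins so the example runs offline; a real sandbox client would replace `fakeRunner`.

```typescript
// Assumed result shape, mirroring the text/error fields used earlier in this note.
interface RunResult { text?: string; error?: { name: string; value: string } }
interface CodeRunner { runCode(code: string): Promise<RunResult> }
interface ToolUse { name: string; input: { code?: unknown } }
interface ToolResult { content: string; isError: boolean }

async function executeToolUse(runner: CodeRunner, toolUse: ToolUse): Promise<ToolResult> {
  if (toolUse.name !== 'run_python') {
    return { content: `Unknown tool: ${toolUse.name}`, isError: true };
  }
  const code = toolUse.input.code;
  if (typeof code !== 'string' || code.trim() === '') {
    return { content: 'Missing "code" string in tool input', isError: true };
  }
  const r = await runner.runCode(code);
  if (r.error) {
    return { content: `${r.error.name}: ${r.error.value}`, isError: true };
  }
  return { content: r.text ?? '', isError: false };
}

// Stand-in for a real sandbox: fails on "1/0", otherwise returns a fixed value.
const fakeRunner: CodeRunner = {
  async runCode(code: string): Promise<RunResult> {
    return code.includes('1/0')
      ? { error: { name: 'ZeroDivisionError', value: 'division by zero' } }
      : { text: '42' };
  },
};

executeToolUse(fakeRunner, { name: 'run_python', input: { code: 'print(6*7)' } })
  .then((res) => console.log(res));
```

Returning the error text lets the LLM see what went wrong and retry with corrected code.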
Persistence
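E2B: every sandbox is ephemeral (its state is lost once it is closed).
Persistence: optional.
→ Each user can have their own sandbox.
// 'persistent' is a placeholder template name; the template is passed as the first argument
const sandbox = await Sandbox.create('persistent', { metadata: { userId: 'alice' } });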
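One way to give each user their own sandbox is a small registry keyed by user ID. This is a sketch under assumptions: `SandboxRegistry` and `UserSandbox` are hypothetical, and the factory is synchronous here for brevity (real sandbox creation would be async).

```typescript
// Hypothetical handle type; a real one would wrap the provider's sandbox object.
interface UserSandbox { id: string; userId: string }
type SandboxFactory = (userId: string) => UserSandbox;

class SandboxRegistry {
  private readonly byUser = new Map<string, UserSandbox>();
  private readonly create: SandboxFactory;

  constructor(create: SandboxFactory) {
    this.create = create;
  }

  // Returns the user's existing sandbox, or creates one on first use.
  getOrCreate(userId: string): UserSandbox {
    const existing = this.byUser.get(userId);
    if (existing) return existing;
    const created = this.create(userId);
    this.byUser.set(userId, created);
    return created;
  }

  // Forgets the mapping; the caller is responsible for killing the sandbox itself.
  release(userId: string): boolean {
    return this.byUser.delete(userId);
  }
}

let createdCount = 0;
const registry = new SandboxRegistry((userId) => ({ id: `sbx-${++createdCount}`, userId }));
const aliceFirst = registry.getOrCreate('alice');
const aliceAgain = registry.getOrCreate('alice');
const bob = registry.getOrCreate('bob');
console.log(aliceFirst.id === aliceAgain.id, aliceFirst.id !== bob.id);
```

Keying strictly by user ID is what keeps one user's persistent state from being handed to another.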
Cost
E2B: $ / sandbox-second.
Daytona: similar.
Modal: $ / GPU-second.
Self-host: you pay for the VMs.
→ With pay-per-use, idle time costs nothing.
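The pay-per-use point can be made concrete with simple arithmetic. The rate below is a made-up placeholder, not a real price from any provider, and `estimateCostUsd` is a hypothetical helper.

```typescript
// Cost = billed seconds x price per second.
function estimateCostUsd(billedSeconds: number, usdPerSecond: number): number {
  if (billedSeconds < 0 || usdPerSecond < 0) {
    throw new RangeError('inputs must be non-negative');
  }
  return billedSeconds * usdPerSecond;
}

// Placeholder rate for illustration only; check the provider's pricing page.
const HYPOTHETICAL_USD_PER_SECOND = 0.00005;

// 200 runs of 30 seconds each, billed only while running.
const payPerUse = estimateCostUsd(200 * 30, HYPOTHETICAL_USD_PER_SECOND);
// The same rate applied to a sandbox left running for 30 days.
const alwaysOn = estimateCostUsd(30 * 24 * 60 * 60, HYPOTHETICAL_USD_PER_SECOND);

console.log(payPerUse.toFixed(2), alwaysOn.toFixed(2));
```

The gap between the two figures is also why long-running sandboxes are the usual source of cost surprises.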
Security
- Block network access by default.
- Isolate the file system per sandbox.
- Set CPU / memory limits.
- Set a time limit.
- Isolate sandboxes per user.
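Network access (optional)
// Sketch only: the `network` flag is illustrative, not a confirmed SDK option; check how your provider toggles egress
await sandbox.runCode('curl https://example.com', { language: 'bash', network: true });
→ Keep the default at blocked; allowlist specific hosts where needed.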
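An allowlist check can be sketched at the application level, for example before an agent is allowed to fetch a URL. `isHostAllowed` is a hypothetical helper. It complements network-level enforcement by the sandbox provider and does not replace it.

```typescript
// Returns true only if the URL's host equals an allowlisted host or is a subdomain of one.
function isHostAllowed(rawUrl: string, allowlist: readonly string[]): boolean {
  let host: string;
  try {
    host = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return false; // unparsable input is rejected
  }
  return allowlist.some((entry) => {
    const allowed = entry.toLowerCase();
    return host === allowed || host.endsWith(`.${allowed}`);
  });
}

const allowedHosts = ['example.com', 'pypi.org'];
console.log(
  isHostAllowed('https://example.com/data.csv', allowedHosts),
  isHostAllowed('https://evilexample.com/', allowedHosts),
);
```

Matching on the leading dot prevents look-alike domains such as `evilexample.com` from passing as `example.com`.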
File output (download)
const code = `
import pandas as pd
df = pd.read_csv('input.csv')
df.to_csv('output.csv')
`;
await sandbox.files.write('input.csv', csv);
await sandbox.runCode(code);
const result = await sandbox.files.read('output.csv');
Plot / chart
const r = await sandbox.runCode(`
import matplotlib.pyplot as plt
plt.plot([1, 2, 3])
plt.savefig('/tmp/chart.png')
`);
// Read back from the same path the plot was saved to
const img = await sandbox.files.read('/tmp/chart.png', { format: 'bytes' });
→ Chart generation for the LLM.
Dev environment
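// Devin / Cursor style ('node-vscode' is a placeholder template name)
const env = await Sandbox.create('node-vscode', {
  envs: { GITHUB_TOKEN: '...' },
});
// Shell commands go through commands.run; runCode executes Python by default
await env.commands.run('git clone https://github.com/repo');
await env.commands.run('npm install');
await env.commands.run('npm test');
→ The LLM drives the dev workflow.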
LangChain / LlamaIndex integration
from langchain_experimental.tools import PythonAstREPLTool
# Local: dangerous.
from e2b_code_interpreter import CodeInterpreter
sandbox = CodeInterpreter()
→ E2B is LangChain-friendly.
Pitfalls
- Unrestricted network access can be abused (DDoS, data leaks).
- Sensitive files (e.g. /etc/passwd): isolate the file system.
- No resource limits: one job can take down the cluster.
- Persistent state can leak to other users.
- Cost blow-up (long-running sandboxes).
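Several of these pitfalls come down to a call that never returns. A generic client-side timeout can be sketched as below. `withTimeout` and `SandboxTimeoutError` are hypothetical names. The wrapper only stops waiting; the sandbox itself still has to be killed separately.

```typescript
class SandboxTimeoutError extends Error {
  constructor(ms: number) {
    super(`Sandbox call exceeded ${ms} ms`);
    this.name = 'SandboxTimeoutError';
  }
}

// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new SandboxTimeoutError(ms)), ms);
  });
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Usage: a fast call passes through unchanged.
withTimeout(Promise.resolve('done'), 1000).then((value) => console.log(value));
```

In practice this is combined with the provider's own sandbox timeout, so that a hung process is also stopped on the server side.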
Eval
Ask the LLM to "Compute fibonacci(20)":
- Without a sandbox: it relies on its own reasoning (can be wrong).
- With a sandbox: it executes code (exact).
→ Math and other exact tasks are where a sandbox adds the most value.
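For reference, this is the kind of snippet an agent would send to the sandbox for that prompt, written here in TypeScript. It assumes the common convention fibonacci(0) = 0, fibonacci(1) = 1.

```typescript
// Iterative Fibonacci with fibonacci(0) = 0 and fibonacci(1) = 1.
function fibonacci(n: number): number {
  if (!Number.isInteger(n) || n < 0) {
    throw new RangeError('n must be a non-negative integer');
  }
  let a = 0;
  let b = 1;
  for (let i = 0; i < n; i++) {
    [a, b] = [b, a + b];
  }
  return a;
}

console.log(fibonacci(20)); // 6765
```

Executing the code gives a deterministic answer, which is the point of routing exact tasks through a sandbox.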
Production
Cursor: a large sandbox cluster.
Devin: each task gets its own VM.
ChatGPT Code Interpreter: OpenAI infra (Kubernetes).
Claude Computer Use: Anthropic / VM.
Custom template
# Shell: build the E2B template
e2b template build -n my-sandbox -d Dockerfile
// TypeScript: start a sandbox from that template
const sandbox = await Sandbox.create('my-sandbox');
→ Pre-built env (libs, tools).
Multi-language
await sandbox.runCode('console.log("hi")', { language: 'js' });
await sandbox.runCode('print("hi")', { language: 'python' });
await sandbox.runCode('echo "hi"', { language: 'bash' });
Browser inside sandbox
// E2B + Playwright
await sandbox.runCode(`
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto('https://...')
    page.screenshot(path='ss.png')
`);
→ Browser agent (see AI_Browser_Agent_Patterns above).
MCP server + sandbox
A server for Anthropic's MCP calls the sandbox.
- LLM → MCP → E2B → result.
- Native integration.
Lightweight alternatives
- vm2 (Node; deprecated and unsafe).
- isolated-vm (Node, native).
- WebContainer (browser-side).
- StackBlitz WebContainer (Node in browser).
→ For production, use E2B / Daytona / Modal.
🤔 Decision criteria
| Task | Recommendation |
|---|---|
| Code interpreter | E2B / Daytona |
| Heavy compute / GPU | Modal |
| Small / in-browser | WebContainer |
| Self-host | Docker + limits |
| Dev environment | E2B / Daytona / Cursor cloud |
| Quick eval | Modal Python |
| Multi-language | E2B (built-in) |
| Persistent | Persistent template |
❌ Anti-patterns
- Local exec (eval, exec): very dangerous.
- Network access on by default: data leaks.
- No resource limit: takes down the cluster.
- Persistent + cross-user: data leak.
- No timeout: hangs.
- Sensitive env vars in the sandbox: leaks.
- Not giving the LLM a sandbox: it hallucinates results.
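The "sensitive env" anti-pattern is usually avoided by passing the sandbox an explicit subset of variables instead of the whole host environment. A minimal sketch follows. `pickSandboxEnv` is a hypothetical helper and the variable values are placeholders.

```typescript
// Copies only explicitly allowed keys; everything else stays on the host.
function pickSandboxEnv(
  source: Record<string, string | undefined>,
  allowedKeys: readonly string[],
): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const key of allowedKeys) {
    const value = source[key];
    if (value !== undefined) picked[key] = value;
  }
  return picked;
}

// Placeholder host environment for illustration.
const hostEnv = { PATH: '/usr/bin', GITHUB_TOKEN: 'placeholder-secret', LANG: 'C.UTF-8' };
const sandboxEnv = pickSandboxEnv(hostEnv, ['PATH', 'LANG']);
console.log(sandboxEnv);
```

An allowlist fails closed: a newly added secret on the host is not forwarded unless someone explicitly lists it.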
🤖 Hints for LLM use
- E2B is the most popular agent sandbox.
- Modal is GPU- and Python-friendly.
- Docker self-hosting is cost-effective but complex.
- Network / file / resource isolation is mandatory.