---
id: ai-code-agent-patterns
title: Code Agent — Devin / Cursor / Claude Code
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [ai, agent, code, vibe-coding]
tech_stack: { language: "TS / Python", applicable_to: ["AI"] }
applied_in: []
aliases: [Devin, Cursor, Claude Code, code agent, autonomous coding, SWE-bench, Aider]
---

# Code Agent

> AI agents that write and modify code. **Cursor (IDE), Claude Code (CLI), Devin (autonomous), Aider (open source)**. SWE-bench is the standard eval.

## 📖 Core Concepts
- File read / edit / run tools.
- Plan + execute + verify.
- Sandbox (E2B / Daytona).
- Tests are the ground truth.

## 💻 Code Patterns

### Aider (open source CLI)
```bash
pip install aider-chat
cd my-repo
aider --model claude-opus-4-7

# Typed at the aider chat prompt (not a shell command):
> add a /health endpoint to server.ts
# → aider edits the file and commits automatically.
```

→ Git-aware: every change becomes a commit.

### Cursor (IDE)
```
- Composer (multi-file).
- Cmd-K (inline).
- Tab autocomplete.
- @file / @docs / @web reference.
```

→ VS Code fork.

### Claude Code (CLI)
```bash
claude
# → terminal-based agent.
```

→ Anthropic's own CLI agent.

### Tool composition
```
- read_file
- write_file
- edit_file (precise diff)
- bash (run command)
- search (grep / file)
- web (fetch)
```
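A minimal sketch of how such a tool set might be wired together. The `TOOLS` registry, the `tool` decorator, and `dispatch` are hypothetical names for illustration, not any product's API. Each tool is a plain function, and errors are returned as text so the model can react to them.

```python
import subprocess
from pathlib import Path
from typing import Callable

# Hypothetical registry: tool name -> plain Python function.
TOOLS: dict[str, Callable[..., str]] = {}

def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a function as a tool under its own name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def read_file(path: str) -> str:
    return Path(path).read_text(encoding="utf-8")

@tool
def write_file(path: str, content: str) -> str:
    Path(path).write_text(content, encoding="utf-8")
    return f"wrote {len(content)} chars to {path}"

@tool
def bash(command: str, timeout: int = 30) -> str:
    # Runs through the shell; a real agent would do this inside a sandbox.
    done = subprocess.run(command, shell=True, capture_output=True,
                          text=True, timeout=timeout)
    return done.stdout + done.stderr

def dispatch(call: dict) -> str:
    """Execute one tool call of the form {"name": ..., "arguments": {...}}."""
    fn = TOOLS.get(call["name"])
    if fn is None:
        return f"error: unknown tool {call['name']!r}"
    try:
        return fn(**call["arguments"])
    except Exception as exc:  # hand the error back to the model instead of crashing
        return f"error: {exc}"
```

An `edit_file` or `search` tool would be registered the same way.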

### Plan → Execute
```
1. User: "Add OAuth login".
2. Plan: 5 step (read existing auth, add provider, route, ...).
3. Execute: for each step → read / edit / test.
4. Verify: tests pass?
5. Commit.
```
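The loop above can be sketched roughly as follows. `Step`, `Result`, and `execute_plan` are illustrative names, and the `verify` and `commit` callables stand in for "run the tests" and "git commit"; this is one possible shape under those assumptions, not how any particular agent implements it.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Step:
    description: str
    run: Callable[[], None]  # performs the read / edit work for this step

@dataclass
class Result:
    completed: list[str] = field(default_factory=list)
    ok: bool = False

def execute_plan(steps: list[Step],
                 verify: Callable[[], bool],
                 commit: Callable[[], None]) -> Result:
    """Run steps in order, verify after each one, commit only if all verify."""
    result = Result()
    for step in steps:
        step.run()
        if not verify():   # e.g. run the test suite
            return result  # stop early so the caller can re-plan
        result.completed.append(step.description)
    commit()
    result.ok = True
    return result
```

Verifying after every step, rather than once at the end, keeps a failure close to the edit that caused it.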

### Test-driven
```
1. Read existing test.
2. Add new test (RED).
3. Implement (GREEN).
4. Refactor.
5. Verify (re-run).
```

→ TDD with AI.
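One way to enforce the RED-then-GREEN order is to reject a new test that already passes before any implementation exists. The helper below is a hypothetical in-process sketch (`red_green` is not a real library function); a real agent would run the project's test command instead of calling a Python function.

```python
from typing import Callable

def red_green(test: Callable[[], None], implement: Callable[[], None]) -> str:
    """Enforce RED before GREEN: the new test must fail first, then pass."""
    def passes() -> bool:
        try:
            test()
            return True
        except AssertionError:
            return False

    if passes():
        return "rejected: test already passes, so it proves nothing"
    implement()  # the agent writes the code here
    return "green" if passes() else "still red: iterate"
```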

### File reading strategy
```
- Direct path: fast.
- Search (grep): fuzzy.
- Symbol search: function / class.
- Recent changes: git log.

→ Context is limited, so retrieve precisely.
```
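Two of these strategies, grep-style search and symbol search, might look like this for a Python codebase. The helper names are illustrative, and the symbol lookup uses the standard `ast` module, which only handles Python source; other languages would need their own parser.

```python
import ast
import re
from pathlib import Path
from typing import Optional

def grep(root: str, pattern: str, glob: str = "*.py") -> list[tuple[str, int, str]]:
    """Return (path, line number, stripped line) for every line matching the regex."""
    rx = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob(glob)):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for no, line in enumerate(text.splitlines(), 1):
            if rx.search(line):
                hits.append((str(path), no, line.strip()))
    return hits

def find_symbol(source: str, name: str) -> Optional[tuple[int, int]]:
    """Return the (start, end) line span of a function or class definition."""
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, kinds) and node.name == name:
            return node.lineno, node.end_lineno
    return None
```

Returning a line span lets the agent load only the definition it needs instead of the whole file.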

### Edit precision
```
Two strategies:
1. Full file rewrite: wasteful for large files.
2. Diff / patch: precise and cheap.

→ Aider and Claude Code mainly use diff-style edits.
Cursor mixes both.
```
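A diff-style edit is often expressed as "replace this exact text with that text". A minimal sketch of applying one safely, assuming a hypothetical `apply_edit` helper: it refuses to act when the old text is missing (stale view of the file) or ambiguous (matches more than once).

```python
def apply_edit(text: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of `old`; refuse ambiguous or stale edits."""
    if not old:
        raise ValueError("old text must not be empty")
    count = text.count(old)
    if count == 0:
        raise ValueError("old text not found: the file may have changed")
    if count > 1:
        raise ValueError(f"old text matches {count} places: add more context")
    return text.replace(old, new, 1)
```

Failing loudly here is the point: a silent edit in the wrong place is harder to detect than a rejected one.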

### Sandbox execution
```
- Test run (E2B / Modal).
- Build verify.
- Lint / type check.

→ Real verification.
```
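As a local stand-in for a hosted sandbox, the sketch below runs a command in a throwaway directory with a timeout. It is an assumption-laden illustration: `run_isolated` is a hypothetical helper, and a temp directory plus a timeout gives no real isolation (no network, filesystem, or resource limits), which is what services like E2B add.

```python
import subprocess
import tempfile
from pathlib import Path

def run_isolated(files: dict[str, str], argv: list[str],
                 timeout: float = 10.0) -> tuple[int, str]:
    """Write files into a fresh temp dir, run argv there, return (exit code, output)."""
    with tempfile.TemporaryDirectory() as workdir:
        for name, content in files.items():
            target = Path(workdir, name)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content, encoding="utf-8")
        try:
            done = subprocess.run(argv, cwd=workdir, capture_output=True,
                                  text=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            return 124, "timeout"  # 124 mirrors the coreutils `timeout` convention
        return done.returncode, done.stdout + done.stderr
```

The same function covers test runs, builds, and lint or type checks by changing `argv`.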

### SWE-bench
```
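- 2,294 real GitHub issues from 12 Python repositories.
- Task: given the repo and the issue text, produce a patch that fixes the issue.
- Pass = the tests added for the fix pass and the previously passing tests still pass.

→ Approximate state of the art (varies by subset, e.g. Lite or Verified):
- 2023: 2-3% pass.
- 2024: 30-50% pass.
- 2026: 60%+ pass.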
```

→ Scores have risen every year.
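The pass criterion can be written down directly. SWE-bench labels the two test groups `FAIL_TO_PASS` (tests that the fix should make pass) and `PASS_TO_PASS` (tests that must keep passing); the function names below are illustrative and this is not the official harness.

```python
def is_resolved(results: dict[str, bool],
                fail_to_pass: list[str],
                pass_to_pass: list[str]) -> bool:
    """`results` maps test id -> passed, measured after applying the agent's patch."""
    fixed = all(results.get(t, False) for t in fail_to_pass)
    unbroken = all(results.get(t, False) for t in pass_to_pass)
    return fixed and unbroken

def resolve_rate(outcomes: list[bool]) -> float:
    """Percentage of task instances resolved."""
    return 100.0 * sum(outcomes) / len(outcomes) if outcomes else 0.0
```

A test missing from `results` counts as a failure, so a patch that breaks test collection does not pass by accident.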

### Devin (autonomous)
```
- Large tasks take hours.
- Browser + code + planning.
- Human review is required.
```

→ "AI software engineer".

### Limitations
```
- Large codebases: the agent loses track in long context.
- Very new or niche technologies.
- Mathematical / algorithmic problems.
- Multi-step refactors.
- Production debugging (real-time).
```

→ Humans are still needed.

### Best practices (for users)
```
- Small task: agent.
- Big task: human plan + agent execute.
- Critical / security: human review.
- Tests are the baseline.
- Commit to git often (so you can roll back).
```

### Prompt guide
```
✓ "Fix the bug at users.ts:42 (the null check is missing for an empty list)".
✗ "Make better".

✓ "Add /health endpoint returning {status: 'ok'}".
✗ "Add monitoring".

→ Be specific and unambiguous.
```

### Multi-file edit
```
"Refactor all UserService.* call to NewUserService".

→ The agent:
1. Searches for every caller.
2. Edits each file.
3. Runs the tests.
4. Iterates until they pass.
```
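Steps 1 and 2 might be sketched as a whole-word rename across files. `rename_symbol` is a hypothetical helper, and a regex rename is only an approximation: it cannot tell a real reference from a same-named string or comment, so the test run in step 3 remains the actual check.

```python
import re
from pathlib import Path

def rename_symbol(root: str, old: str, new: str, glob: str = "*.ts") -> list[str]:
    """Rewrite whole-word occurrences of `old` in matching files; return changed paths."""
    rx = re.compile(rf"\b{re.escape(old)}\b")
    changed = []
    for path in sorted(Path(root).rglob(glob)):
        text = path.read_text(encoding="utf-8")
        # A function replacement avoids backslash escapes in `new` being interpreted.
        updated, n = rx.subn(lambda m: new, text)
        if n:
            path.write_text(updated, encoding="utf-8")
            changed.append(str(path))
    return changed
```

The word-boundary match leaves `NewUserService` untouched, so running the rename twice does not produce `NewNewUserService`.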

### Code review by AI
```
PR review:
- Read diff.
- Comment specific.
- Suggest improvement.
- Find bug.

→ CodeRabbit / Greptile / Sourcery.
```

### Production agent
```
- Agentic IDE (Cursor, Windsurf, Zed AI).
- CLI (Claude Code, Aider).
- Autonomous (Devin, by Cognition).
- PR review (CodeRabbit).
- Inline (Copilot, Tabnine).
```

### MCP (Model Context Protocol)
```ts
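// An open protocol introduced by Anthropic.
// MCP servers expose tools and resources (files, git, command execution).
// Editors and agents connect as clients and speak the same protocol.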
```

→ Cursor, Claude Desktop, and Cline support it natively.

→ [[AI_MCP_Server_Building]].
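MCP messages are JSON-RPC 2.0, and a client invokes a server tool with the `tools/call` method. The sketch below only builds the request payload; the `tools_call` helper and the `read_file` tool name in the usage line are assumptions for illustration, and transport, initialization, and response handling are left out.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, unique per process

def tools_call(name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request for MCP's `tools/call` method."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })
```

For example, `tools_call("read_file", {"path": "README.md"})` would ask a server to run a tool it registered under the (hypothetical) name `read_file`.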

### Eval
```
Internal:
- Unit test pass rate.
- Refactor preservation.
- Code style consistency.

Public:
- SWE-bench, HumanEval, BigCodeBench.
```

### Multi-agent (subagent)
```
Main agent:
- Plan + coordinate.

Sub-agent:
- Search file.
- Run test.
- Refactor specific.

→ Claude Code supports this pattern with subagents.
```

### Memory
```
- Conversation context.
- File contents.
- Git history.
- Test results.

→ The memory budget is finite: compress or summarize older context.
```
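A simple form of compression is to keep the newest messages that fit the budget and collapse everything older into one marker line. This sketch assumes roughly 4 characters per token, which is a crude heuristic and not a real tokenizer, and it does not count the marker line against the budget.

```python
def estimate_tokens(text: str) -> int:
    # Crude assumption: about 4 characters per token.
    return max(1, len(text) // 4)

def fit_to_budget(messages: list[str], budget: int) -> list[str]:
    """Keep the newest messages that fit; replace the rest with one marker line."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()
    dropped = len(messages) - len(kept)
    marker = [f"[{dropped} earlier message(s) omitted]"] if dropped else []
    return marker + kept
```

A real agent would likely replace the marker with a model-written summary of the dropped messages.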

### Pitfalls
```
- The agent gets stuck in an infinite loop.
- Cost blow-up (large tasks burn money every hour).
- No tests = silent breakage.
- No git = no rollback.
- No human review = production bugs.
```
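The first two pitfalls can be bounded with a guard that is checked on every agent step. `RunGuard` and its default limits are hypothetical; the repeat check is a deliberately naive loop detector that only catches the same action issued several times in a row.

```python
class BudgetExceeded(RuntimeError):
    pass

class RunGuard:
    """Stop an agent run on too many steps, too much spend, or a repeated action."""

    def __init__(self, max_steps: int = 50, max_cost_usd: float = 5.0,
                 max_repeats: int = 3):
        self.max_steps = max_steps
        self.max_cost_usd = max_cost_usd
        self.max_repeats = max_repeats
        self.steps = 0
        self.cost_usd = 0.0
        self.last_action = None
        self.repeats = 0

    def check(self, action: str, step_cost_usd: float) -> None:
        """Record one step; raise BudgetExceeded if any limit is crossed."""
        self.steps += 1
        self.cost_usd += step_cost_usd
        self.repeats = self.repeats + 1 if action == self.last_action else 1
        self.last_action = action
        if self.steps > self.max_steps:
            raise BudgetExceeded("step limit reached")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded("cost limit reached")
        if self.repeats >= self.max_repeats:
            raise BudgetExceeded(f"same action repeated {self.repeats} times: likely a loop")
```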

### Workflow integration
```
1. GitHub issue → agent.
2. PR open → agent review.
3. Test fail → agent fix.
4. Human approve → merge.

→ AI + human loop.
```

### Privacy
```
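- Code is sent to the model provider's API servers.
- Local models (e.g. via Ollama) keep code on your machine.
- Code-specialized fine-tunes (e.g. Code Llama) can be self-hosted.

→ Sensitive code: use a local model or a private cloud deployment.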
```

### Cost
```
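Rough figures; pricing changes often, so check each vendor's current plans.
Claude Code, 1 day of use: $5-50 (token usage).
Cursor Pro: $20 / month.
Devin: $500 / month.
Aider + your own API key: $10-100 / month.

→ The productivity gain has to justify the cost.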
```
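Token-based cost is simple arithmetic: tokens times the per-token price, summed over input and output. The function below is a sketch, and the prices in the usage note are placeholders chosen for illustration, not any vendor's actual rates.

```python
def session_cost_usd(input_tokens: int, output_tokens: int,
                     usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Cost of one session given prices per million tokens."""
    total = input_tokens * usd_per_m_input + output_tokens * usd_per_m_output
    return total / 1_000_000
```

With hypothetical prices of $3 per million input tokens and $15 per million output tokens, a day that consumes 2,000,000 input and 200,000 output tokens costs 6 + 3 = $9, which falls inside the $5-50 range quoted above.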

### Future
```
- Increasingly autonomous.
- Test-first AI.
- Long-context (1M+ token).
- Multi-modal (UI screenshot → code).
- Fine-tuned per-codebase.
```

## 🤔 Decision Criteria
| Task | Recommendation |
|---|---|
| Quick edit | Cursor / Copilot |
| Multi-file | Cursor Composer / Claude Code |
| Autonomous | Devin (human review required) |
| Open / CLI | Aider |
| PR review | CodeRabbit / Greptile |
| Test write | Any agent + test |
| Refactor | Aider / Cursor |

## ❌ Anti-patterns
- **Agent only, no review**: production bugs.
- **No tests**: silent breakage.
- **No git commits**: no rollback.
- **Vague prompts**: bad output.
- **Long-running without supervision**: runaway cost / loops.
- **Sensitive code sent to a public model**: leaks.

## 🤖 LLM Usage Hints
- Cursor and Claude Code are the modern defaults.
- Tests are the ground truth.
- Aider is the git-aware open-source option.
- MCP is the standard protocol.

## 🔗 Related Documents
- [[AI_Tool_Composition_Deep]]
- [[AI_Multi_Agent_Coordination]]
- [[AI_Agent_Sandbox_E2B]]