---
id: ai-code-agent-patterns
title: Code Agent — Devin / Cursor / Claude Code
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [ai, agent, code, vibe-coding]
tech_stack: { language: "TS / Python", applicable_to: ["AI"] }
applied_in: []
aliases: [Devin, Cursor, Claude Code, code agent, autonomous coding, SWE-bench, Aider]
---

# Code Agent

> AI that writes and modifies code. **Cursor (IDE), Claude Code (CLI), Devin (autonomous), Aider (open source)**. SWE-bench is the standard eval.

## 📖 Core concepts

- Tools for reading files, editing files, and running commands.
- Plan, execute, verify.
- Sandboxed execution (E2B / Daytona).
- Tests are the ground truth.

## 💻 Code patterns

### Aider (open source CLI)
```bash
pip install aider-chat
cd my-repo
aider --model claude-opus-4-7

# At the aider prompt, type a request:
#   > add a /health endpoint to server.ts
# → aider edits the files and commits automatically.
```

→ Git-aware: every change becomes a commit.

### Cursor (IDE)
```
- Composer (multi-file edits).
- Cmd-K (inline edit).
- Tab autocomplete.
- @file / @docs / @web references.
```

→ A VS Code fork.

### Claude Code (CLI)
```bash
claude
# → starts a terminal-based agent.
```

→ Anthropic's first-party CLI.

### Tool composition
```
- read_file
- write_file
- edit_file (precise diff)
- bash (run command)
- search (grep / file)
- web (fetch)
```

### Plan → Execute
```
1. User: "Add OAuth login".
2. Plan: 5 steps (read existing auth, add provider, route, ...).
3. Execute: for each step, read / edit / test.
4. Verify: do the tests pass?
5. Commit.
```

### Test-driven
```
1. Read the existing tests.
2. Add a new test (RED).
3. Implement (GREEN).
4. Refactor.
5. Verify (re-run).
```

→ TDD with AI.

### File reading strategy
```
- Direct path: fast.
- Search (grep): fuzzy.
- Symbol search: function / class.
- Recent changes: git log.

→ Context is limited, so retrieve precisely.
```

### Edit precision
```
Two strategies:
1. Full file rewrite: wasteful for large files.
2. Diff / patch: precise and cheap.

→ Aider and Claude Code use diffs. Cursor mixes both.
```

### Sandbox execution
```
- Test runs (E2B / Modal).
- Build verification.
- Lint / type check.
→ Real verification.
```

### SWE-bench
```
- 2,294 real GitHub issues.
- Task: "fix this issue", given the repo and the issue text.
- Pass = the new tests and the existing tests both pass.

→ Approximate reported pass rates (setups and subsets differ by year):
- 2023: 2-3%.
- 2024: 30-50%.
- 2026: 60%+.
```

→ Rising every year.

### Devin (autonomous)
```
- Large tasks take hours.
- Browser + code + planning.
- Human review is required.
```

→ Marketed as an "AI software engineer".

### Limitations
```
- Large codebases: the agent loses track in long context.
- Very new or niche technologies.
- Mathematical or algorithm-heavy work.
- Multi-step refactors.
- Production debugging (real-time).
```

→ Humans are still needed.

### Best practices (for users)
```
- Small task: hand it to the agent.
- Big task: human plans, agent executes.
- Critical / security-sensitive: human review.
- Tests are the baseline.
- Commit to git often (for rollback).
```

### Prompt guide
```
✓ "Fix the bug at users.ts:42; the null check is missing for an empty list".
✗ "Make better".
✓ "Add /health endpoint returning {status: 'ok'}".
✗ "Add monitoring".

→ Be specific and unambiguous.
```

### Multi-file edit
```
"Refactor all UserService.* calls to NewUserService".

→ The agent:
1. Searches for every caller.
2. Edits each file.
3. Runs the tests.
4. Iterates.
```

### Code review by AI
```
PR review:
- Read the diff.
- Leave specific comments.
- Suggest improvements.
- Find bugs.

→ CodeRabbit / Greptile / Sourcery.
```

### Production agents
```
- Agentic IDE (Cursor, Windsurf, Zed AI).
- CLI (Claude Code, Aider).
- Autonomous (Devin, by Cognition).
- PR review (CodeRabbit).
- Inline completion (Copilot, Tabnine).
```

### MCP (Model Context Protocol)
```ts
// An open standard introduced by Anthropic.
// MCP servers expose capabilities such as file access, git, and command execution.
// The editor or agent connects as a client and speaks the same protocol.
```

→ Cursor / Claude Desktop / Cline support it natively.
→ [[AI_MCP_Server_Building]].

### Eval
```
Internal:
- Unit test pass rate.
- Behavior preserved across refactors.
- Consistent code style.

Public:
- SWE-bench, HumanEval, BigCodeBench.
```

### Multi-agent (subagent)
```
Main agent:
- Plans and coordinates.

Sub-agents:
- Search files.
- Run tests.
- Perform a specific refactor.

→ See Anthropic's Claude Code subagents / multi-agent setups.
```

### Memory
```
- Conversation context.
- File contents.
- Git history.
- Test results.
→ Memory is a budget. Compress.
```

### Pitfalls
```
- The agent gets stuck in an infinite loop.
- Cost blows up (a large task costs money every hour it runs).
- No tests = silent breakage.
- No git = no rollback.
- No human review = production bugs.
```

### Workflow integration
```
1. GitHub issue → agent.
2. PR opened → agent review.
3. Test fails → agent fix.
4. Human approves → merge.

→ AI + human in the loop.
```

### Privacy
```
- Code is sent to the model provider's API servers.
- Local models (Ollama) keep code private.
- Code-specialized open models (Code Llama).

→ Sensitive code: use a local model or a private cloud.
```

### Cost
```
1 day of Claude Code: $5-50 (tokens).
Cursor Pro: $20 / month.
Devin: $500 / month.
Aider + own API key: $10-100 / month.

→ Approximate figures; pricing changes. The productivity gain is what justifies the cost.
```

### Future
```
- Increasingly autonomous.
- Test-first AI.
- Long context (1M+ tokens).
- Multi-modal (UI screenshot → code).
- Fine-tuned per codebase.
```

## 🤔 Decision criteria

| Task | Recommendation |
|---|---|
| Quick edit | Cursor / Copilot |
| Multi-file | Cursor Composer / Claude Code |
| Autonomous | Devin (review required) |
| Open source / CLI | Aider |
| PR review | CodeRabbit / Greptile |
| Test writing | Any agent + tests |
| Refactor | Aider / Cursor |

## ❌ Anti-patterns

- **Agent only, no review**: production bugs.
- **No tests**: silent breakage.
- **No git commits**: no rollback.
- **Vague prompts**: bad output.
- **Long-running without supervision**: runaway cost / loops.
- **Sensitive code sent to a public model**: leaks.

## 🤖 LLM usage hints

- Cursor / Claude Code are the modern defaults.
- Tests = ground truth.
- Aider is the git-aware open-source option.
- MCP is the standard protocol.

## 🔗 Related documents

- [[AI_Tool_Composition_Deep]]
- [[AI_Multi_Agent_Coordination]]
- [[AI_Agent_Sandbox_E2B]]
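
## 🧪 Example sketch: bounded execute → verify loop

The plan → execute → verify pattern and the infinite-loop and cost pitfalls noted earlier can be illustrated with a minimal sketch. It is not taken from any of the tools named in this note. `run_agent_loop`, `LoopResult`, `propose_edit`, `run_tests`, and the cent-based budget are all hypothetical names chosen for illustration. In a real setup, `propose_edit` would presumably wrap a model call plus a file edit, and `run_tests` would shell out to the test command inside a sandbox.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class LoopResult:
    success: bool
    iterations: int
    spent_cents: int
    stop_reason: str  # "tests_passed", "budget_exceeded", or "max_iterations"
    log: List[str] = field(default_factory=list)


def run_agent_loop(
    propose_edit: Callable[[int, str], int],       # hypothetical hook: applies one edit, returns its cost in cents
    run_tests: Callable[[], Tuple[bool, str]],     # hypothetical hook: returns (passed, test output)
    max_iterations: int = 5,
    budget_cents: int = 100,
) -> LoopResult:
    """Run edit/test steps until tests pass, the budget is spent, or the step cap is hit."""
    spent = 0
    log: List[str] = []
    last_failure = ""
    for step in range(1, max_iterations + 1):
        # Execute: one edit per step; the hook reports what the step cost.
        spent += propose_edit(step, last_failure)
        # Verify: the test suite, not the model, decides whether the task is done.
        passed, output = run_tests()
        log.append(f"step {step}: {'pass' if passed else 'fail'} (spent {spent}c)")
        if passed:
            return LoopResult(True, step, spent, "tests_passed", log)
        if spent >= budget_cents:
            # Guard against runaway cost.
            return LoopResult(False, step, spent, "budget_exceeded", log)
        # Feed the failure back into the next edit attempt.
        last_failure = output
    # Guard against infinite loops: give up and hand back to a human.
    return LoopResult(False, max_iterations, spent, "max_iterations", log)


if __name__ == "__main__":
    # Fake hooks standing in for a model call and a sandboxed test run.
    state = {"edits": 0}

    def fake_edit(step: int, last_failure: str) -> int:
        state["edits"] += 1
        return 10  # pretend each step costs 10 cents

    def fake_tests() -> Tuple[bool, str]:
        ok = state["edits"] >= 3  # pretend the third edit fixes the bug
        return ok, "" if ok else "AssertionError: /health returned 500"

    result = run_agent_loop(fake_edit, fake_tests)
    print(result.stop_reason, result.iterations, result.spent_cents)
    # → tests_passed 3 30
```

Both caps stop the loop with an explicit `stop_reason`, so a supervising human (or an outer workflow) can tell a passing run from one that gave up, rather than discovering a long-running session after the bill arrives.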