2nd/10_Wiki/Topics/Coding/AI_Code_Agent_Patterns.md
2026-05-10 22:08:15 +09:00

id: ai-code-agent-patterns
title: Code Agent — Devin / Cursor / Claude Code
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: ai, agent, code, vibe-coding
tech_stack: language = TS / Python, applicable_to = AI
applied_in:
aliases: Devin, Cursor, Claude Code, code agent, autonomous coding, SWE-bench, Aider

Code Agent

AI writes and modifies code. Cursor (IDE), Claude Code (CLI), Devin (autonomous), Aider (open source). SWE-bench is the evaluation benchmark.

📖 Core concepts

  • File read / edit / run tools.
  • Plan + execute + verify.
  • Sandbox (E2B / Daytona).
  • Tests are the ground truth.

💻 Code patterns

Aider (open source CLI)

pip install aider-chat
cd my-repo
aider --model claude-opus-4-7

# Command (typed at the aider prompt)
> add a /health endpoint to server.ts
# → automatic edit + commit.

→ Git-aware. Every change becomes a commit.

Cursor (IDE)

- Composer (multi-file).
- Cmd-K (inline).
- Tab autocomplete.
- @file / @docs / @web reference.

→ VS Code fork.

Claude Code (CLI)

claude
# → terminal-based agent.

→ Anthropic's own native agent.

Tool composition

- read_file
- write_file
- edit_file (precise diff)
- bash (run command)
- search (grep / file)
- web (fetch)
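
The tool list above can be wired up as a small registry that maps tool names to functions. This is a minimal sketch, not the implementation of any particular agent: the `dispatch` function and the `{"name": ..., "arguments": ...}` call shape are assumptions chosen for illustration, and only three of the tools are shown.

```python
import subprocess
from pathlib import Path
from typing import Callable, Dict


def read_file(path: str) -> str:
    """Return the full text of a file."""
    return Path(path).read_text(encoding="utf-8")


def write_file(path: str, content: str) -> str:
    """Overwrite a file and report what was written."""
    Path(path).write_text(content, encoding="utf-8")
    return f"wrote {len(content)} chars to {path}"


def bash(command: str, timeout: int = 30) -> str:
    """Run a shell command and return combined stdout and stderr."""
    proc = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return proc.stdout + proc.stderr


# Hypothetical registry: tool name -> callable.
TOOLS: Dict[str, Callable[..., str]] = {
    "read_file": read_file,
    "write_file": write_file,
    "bash": bash,
}


def dispatch(call: dict) -> str:
    """Execute one tool call; errors come back as text so the model can react."""
    name = call["name"]
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    try:
        return TOOLS[name](**call["arguments"])
    except Exception as exc:  # surface the failure instead of crashing the loop
        return f"error: {exc}"
```

Returning errors as strings keeps the agent loop alive: a failed tool call becomes just another observation for the model.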

Plan → Execute

1. User: "Add OAuth login".
2. Plan: 5 step (read existing auth, add provider, route, ...).
3. Execute: for each step, read / edit / test.
4. Verify: tests pass?
5. Commit.
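
The plan / execute / verify / commit sequence above can be sketched as a loop. Everything here is an assumption for illustration: `Step`, `run_plan`, and the retry policy are hypothetical names, and in a real agent `action` would call the model and its tools while `verify` would run the test suite.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    description: str
    action: Callable[[], None]  # performs the read / edit work for this step


def run_plan(steps: List[Step], verify: Callable[[], bool], max_retries: int = 2) -> List[str]:
    """Execute steps in order; each step must pass verify() before the next one starts."""
    log: List[str] = []
    for step in steps:
        for attempt in range(1, max_retries + 2):
            step.action()
            if verify():
                log.append(f"ok: {step.description} (attempt {attempt})")
                break
        else:
            # No attempt passed verification: stop and do not commit.
            log.append(f"failed: {step.description}")
            return log
    log.append("commit")
    return log
```

Stopping at the first unverifiable step keeps a broken intermediate state from being committed.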

Test-driven

1. Read existing test.
2. Add new test (RED).
3. Implement (GREEN).
4. Refactor.
5. Verify (re-run).

→ TDD with AI.
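
A sketch of the red / green gate from the steps above, assuming the new test and the implementation step are passed in as callables (`tdd_cycle`, `run_new_test`, and `implement` are hypothetical names). The point is that the agent confirms the new test fails before it writes any implementation.

```python
from typing import Callable


def tdd_cycle(run_new_test: Callable[[], bool], implement: Callable[[], None]) -> str:
    """Enforce RED before GREEN: the new test must fail first, then pass."""
    if run_new_test():
        # A test that passes before the change proves nothing about the change.
        return "invalid: new test already passes"
    implement()
    if not run_new_test():
        return "red: still failing after implementation"
    return "green"


# In-memory illustration: the "codebase" is a dict of functions.
impl = {}


def new_test() -> bool:
    fn = impl.get("slugify")
    return fn is not None and fn("Hello World") == "hello-world"


def implement() -> None:
    impl["slugify"] = lambda s: s.lower().replace(" ", "-")
```

Calling `tdd_cycle(new_test, implement)` on a fresh `impl` goes red, implements, and ends green.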

File reading strategy

- Direct path: fast.
- Search (grep): fuzzy.
- Symbol search: function / class.
- Recent change: git log.

→ Context is limited, so retrieve precisely.
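
A minimal sketch of the grep and symbol-search strategies with a hit cap, so that retrieval cannot flood the context. The function names, the `*.py` default, and the regex used for symbol lookup are assumptions; real agents typically rely on ripgrep or a language server instead.

```python
import re
from pathlib import Path
from typing import List, Tuple

Hit = Tuple[str, int, str]  # (file path, 1-based line number, stripped line)


def grep(root: str, pattern: str, glob: str = "*.py", max_hits: int = 20) -> List[Hit]:
    """Regex search over files under root, stopping once max_hits is reached."""
    regex = re.compile(pattern)
    hits: List[Hit] = []
    for path in sorted(Path(root).rglob(glob)):
        try:
            lines = path.read_text(encoding="utf-8").splitlines()
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries, directories, unreadable files
        for lineno, line in enumerate(lines, start=1):
            if regex.search(line):
                hits.append((str(path), lineno, line.strip()))
                if len(hits) >= max_hits:
                    return hits
    return hits


def find_symbol(root: str, name: str) -> List[Hit]:
    """Locate Python function or class definitions by name."""
    return grep(root, rf"^\s*(def|class)\s+{re.escape(name)}\b")
```

Returning line numbers lets the agent read only a window around each hit rather than whole files.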

Edit precision

Two strategies:
1. Full file rewrite: wasteful for large files.
2. Diff / patch: precise and cheap.

→ Aider / Claude Code use diffs.
Cursor uses a mix.
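
A sketch of the diff-style strategy as an exact search-and-replace edit. This is not the actual edit format of Aider, Claude Code, or Cursor; `apply_edit` is a hypothetical helper that shows why a precise edit is cheap (only the changed span is sent) and how an ambiguous match can be rejected.

```python
def apply_edit(source: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of old with new, or refuse."""
    count = source.count(old)
    if count == 0:
        raise ValueError("old text not found; re-read the file before editing")
    if count > 1:
        raise ValueError(f"old text is ambiguous ({count} matches); include more context")
    return source.replace(old, new, 1)


before = "def total(xs):\n    return sum(xs)\n"
after = apply_edit(before, "return sum(xs)", "return sum(xs) if xs else 0")
```

Refusing on zero or multiple matches forces the agent to re-read the file or widen the snippet instead of silently editing the wrong place.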

Sandbox execution

- Test run (E2B / Modal).
- Build verify.
- Lint / type check.

→ Real verification.
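
A local stand-in for sandboxed verification, assuming a temporary directory plus a subprocess timeout is acceptable isolation for illustration. It is not a real sandbox and does not use the E2B or Modal APIs; `run_isolated` is a hypothetical helper.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Dict, List


def run_isolated(files: Dict[str, str], command: List[str], timeout: float = 30.0) -> dict:
    """Write files into a throwaway directory, run a command there, report pass/fail."""
    with tempfile.TemporaryDirectory() as workdir:
        for rel_path, content in files.items():
            target = Path(workdir) / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content, encoding="utf-8")
        try:
            proc = subprocess.run(
                command, cwd=workdir, capture_output=True, text=True, timeout=timeout
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "output": "timeout"}
        return {"ok": proc.returncode == 0, "output": proc.stdout + proc.stderr}


if __name__ == "__main__":
    result = run_isolated({"check.py": "assert 1 + 1 == 2\n"}, [sys.executable, "check.py"])
    print(result["ok"])
```

The same shape (files in, command, exit code and output back) applies to build, lint, and type-check steps.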

SWE-bench

- 2,294 real GitHub issues.
- Task: "Fix this issue", given the repo and the issue text.
- Pass = the new tests and the existing tests both pass.

→ State of the art (approximate):
- 2023: 2-3% pass.
- 2024: 30-50% pass.
- 2026: 60%+ pass.

→ Rising every year.
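
The pass criterion can be written down directly. The sketch assumes test results arrive as a name-to-bool mapping; `is_resolved` is a hypothetical helper, with parameter names that mirror the FAIL_TO_PASS and PASS_TO_PASS test lists in the SWE-bench dataset.

```python
from typing import Dict, List


def is_resolved(results: Dict[str, bool], fail_to_pass: List[str], pass_to_pass: List[str]) -> bool:
    """An instance counts as resolved only if every listed test passes after the patch.

    fail_to_pass: tests that failed before the fix and must pass now (the new behavior).
    pass_to_pass: tests that passed before and must keep passing (no regression).
    """
    required = fail_to_pass + pass_to_pass
    # A test missing from the results counts as a failure.
    return all(results.get(test) is True for test in required)


results = {"test_empty_list": True, "test_existing_login": True}
resolved = is_resolved(results, ["test_empty_list"], ["test_existing_login"])
```

A patch that fixes the issue but breaks an existing test is scored as a failure, the same as a patch that fixes nothing.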

Devin (autonomous)

- Large tasks take hours.
- Browser + code + plan.
- Human review is required.

→ "AI software engineer".

Limitations

- Long context (large codebase): the agent loses track.
- Very new / niche technologies.
- Mathematical / algorithmic work.
- Multi-step refactors.
- Production debugging (real-time).

→ Humans are still needed.

Best practice (for users)

- Small task: agent.
- Big task: human plans, agent executes.
- Critical / security: human review.
- Tests are the baseline.
- Commit to Git often (for rollback).

Prompt guide

✓ "Fix the bug in users.ts:42 (the null check is missing for an empty list)".
✗ "Make better".

✓ "Add /health endpoint returning {status: 'ok'}".
✗ "Add monitoring".

→ Be specific and clear.

Multi-file edit

"Refactor all UserService.* calls to NewUserService".

→ The agent:
1. Searches for every caller.
2. Edits each file.
3. Runs the tests.
4. Iterates.
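
Steps 1 and 2 can be sketched as a whole-word rename across files, with a dry run that reports match counts before anything is written. `rename_symbol` is a hypothetical helper. A plain regex rename knows nothing about scope or imports, which is why the test run in step 3 still matters.

```python
import re
from pathlib import Path
from typing import Dict


def rename_symbol(root: str, old: str, new: str, glob: str = "*.ts", dry_run: bool = True) -> Dict[str, int]:
    """Replace whole-word occurrences of old with new; return {file: replacements}."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    changed: Dict[str, int] = {}
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8")
        # A callable replacement avoids backslash interpretation in `new`.
        new_text, count = pattern.subn(lambda _match: new, text)
        if count:
            changed[str(path)] = count
            if not dry_run:
                path.write_text(new_text, encoding="utf-8")
    return changed
```

The word boundary means `UserService` does not match inside `NewUserService`, so running the rename twice is safe.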

Code review by AI

PR review:
- Read diff.
- Comment specific.
- Suggest improvement.
- Find bug.

→ CodeRabbit / Greptile / Sourcery.

Production agent

- Agentic IDE (Cursor, Windsurf, Zed AI).
- CLI (Claude Code, Aider).
- Autonomous (Devin, by Cognition).
- PR review (CodeRabbit).
- Inline (Copilot, Tabnine).

MCP (Model Context Protocol)

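// A standard introduced by Anthropic.
// Servers expose capabilities (file, git, run) to the editor or agent host.
// Agents talk to every server over the same protocol.

→ Cursor / Claude Desktop / Cline support it natively.

See AI_MCP_Server_Building.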
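
MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with a tool name and arguments. The sketch below only builds the request payload. The tool name `read_file` and its `path` argument are hypothetical examples of what a server might expose, and transport (stdio or HTTP) and initialization are left out.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def tool_call_request(name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request asking an MCP server to run one tool."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    )


payload = tool_call_request("read_file", {"path": "server.ts"})
```

Because every server speaks the same envelope, an agent can add a new capability by connecting to another server rather than by adding client code.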

Eval

Internal:
- Unit test pass rate.
- Refactor preservation.
- Code style consistency.

Public:
- SWE-bench, HumanEval, BigCodeBench.

Multi-agent (subagent)

Main agent:
- Plan + coordinate.

Sub-agent:
- Search file.
- Run test.
- Refactor specific.

→ As in Anthropic's Claude Code / multi-agent setups.

Memory

- Conversation context.
- File contents.
- Git history.
- Test results.

→ Memory budget. Compress.
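
One simple way to stay inside a memory budget is to keep the most recent messages and drop the oldest. The sketch makes two loud assumptions: tokens are estimated as roughly four characters each (a crude heuristic, not a real tokenizer), and dropped history is replaced by a marker rather than a model-written summary.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Crude estimate: about 4 characters per token (assumption, not a tokenizer)."""
    return max(1, len(text) // 4)


def fit_to_budget(messages: List[str], budget: int) -> List[str]:
    """Keep the newest messages whose estimated size fits the budget."""
    kept: List[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    dropped = len(messages) - len(kept)
    kept.reverse()  # restore chronological order
    if dropped:
        # The marker's own size is not counted against the budget in this sketch.
        kept.insert(0, f"[{dropped} earlier messages omitted]")
    return kept
```

A fuller version would summarize the dropped messages and pin items such as the task statement and the latest test results.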

Pitfalls

- The agent gets stuck in an infinite loop.
- Cost explosion (large tasks bill by the hour).
- No tests = silent breakage.
- No Git = no rollback.
- No human review = production bugs.
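
The first two pitfalls can be bounded with a guard that the agent loop calls once per action. `RunGuard`, its default limits, and the idea of passing a per-step cost are all assumptions for illustration; real tools expose their own limits and usage reporting.

```python
from collections import Counter


class GuardTripped(RuntimeError):
    """Raised when a run exceeds one of its limits."""


class RunGuard:
    def __init__(self, max_steps: int = 50, max_cost_usd: float = 5.0, max_repeats: int = 3):
        self.max_steps = max_steps
        self.max_cost_usd = max_cost_usd
        self.max_repeats = max_repeats
        self.steps = 0
        self.cost_usd = 0.0
        self.seen: Counter = Counter()

    def record(self, action: str, cost_usd: float = 0.0) -> None:
        """Register one agent action; raise as soon as any limit is exceeded."""
        self.steps += 1
        self.cost_usd += cost_usd
        self.seen[action] += 1
        if self.steps > self.max_steps:
            raise GuardTripped(f"step limit exceeded ({self.max_steps})")
        if self.cost_usd > self.max_cost_usd:
            raise GuardTripped(f"cost limit exceeded (${self.max_cost_usd:.2f})")
        if self.seen[action] > self.max_repeats:
            # The same action repeated verbatim usually means the agent is looping.
            raise GuardTripped(f"action repeated too often: {action!r}")
```

Typical use is `guard.record(description_of_action, cost)` at the top of each loop iteration, with the exception handed back to a human.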

Workflow integration

1. GitHub issue → agent.
2. PR open → agent review.
3. Test fail → agent fix.
4. Human approve → merge.

→ AI + human loop.

Privacy

- Code is sent to the model provider's API servers.
- Local model (Ollama) = privacy.
- Code-only fine-tune (CodeLlama).

→ Sensitive code = local / private cloud.

Cost

1 day Claude Code: $5-50 (tokens).
Cursor Pro: $20 / month.
Devin: $500 / month.
Aider + own API: $10-100 / month.

→ The productivity gain justifies the cost.
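
Token-billed tools such as the Claude Code and Aider lines above come down to simple arithmetic over input and output tokens. The rates in the example are placeholders chosen for round numbers, not real prices, and `session_cost_usd` is a hypothetical helper.

```python
def session_cost_usd(
    input_tokens: int,
    output_tokens: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> float:
    """Cost of one session from token counts and per-million-token rates."""
    total = input_tokens * usd_per_million_input + output_tokens * usd_per_million_output
    return total / 1_000_000


# Placeholder rates (NOT real prices): $3 per million input, $15 per million output.
example = session_cost_usd(2_000_000, 200_000, 3.0, 15.0)  # 9.0
```

Agents re-read files and history on every turn, so input tokens usually dominate the bill.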

Future

- Increasingly autonomous.
- Test-first AI.
- Long-context (1M+ token).
- Multi-modal (UI screenshot → code).
- Fine-tuned per-codebase.

🤔 Decision criteria

| Task | Recommendation |
| --- | --- |
| Quick edit | Cursor / Copilot |
| Multi-file | Cursor Composer / Claude Code |
| Autonomous | Devin (review required) |
| Open / CLI | Aider |
| PR review | CodeRabbit / Greptile |
| Test writing | Any agent + tests |
| Refactor | Aider / Cursor |

Anti-patterns

  • Agent only, no review: production bugs.
  • No tests: silent breakage.
  • No Git commits: no rollback.
  • Vague prompt: bad output.
  • Long-running without supervision: runaway cost / loops.
  • Sensitive code sent to a public model: leak.

🤖 Hints for LLM use

  • Cursor / Claude Code are the modern defaults.
  • Tests = ground truth.
  • Aider is the Git-aware open-source option.
  • MCP is the standard protocol.

🔗 Related documents