"매 LLM 의 first-pass mechanical review + human 의 second-pass design judgment 의 layered combine". 매 2026 standard PR workflow 의 Claude Code Review / GitHub Copilot review / CodeRabbit 의 의 자동 lint-style + security + style critique 의 emit, 매 human reviewer 의 의 architecture / business / API design 의 final call 의 reserve — 매 cycle time 의 50%+ 의 reduce 한 이후 quality 의 maintain.
Core idea
Division of labor
AI strengths (first pass): lint, type checks, test coverage gaps, security CVE patterns, naming consistency, missing docstrings, obvious bugs, regex bugs, off-by-one errors, null dereference candidates.
Human strengths (second pass): architecture fit, business logic correctness, API design taste, performance trade-offs, team conventions, mentoring, ambiguity resolution.
Overlap zone: both passes should catch security-critical changes and public API changes.
Workflow stages
Stage 1 (PR open): CI lint runs and the AI bot emits inline comments.
Stage 2 (author response): the author accepts or rejects each AI suggestion, or marks it "see human review".
Stage 3 (human review): the human skips the AI noise and focuses on design and intent.
Stage 4 (merge gate): an unresolved critical AI finding blocks the merge; non-critical findings only warn.
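The Stage 4 rule can be sketched as a pure function over the AI findings. This is a minimal illustration under stated assumptions, not the logic of any particular tool: the severity levels follow this note's rubric, while the GateFinding shape, the resolved flag, and the mergeGate name are hypothetical.

```typescript
// Hypothetical shapes for illustration. `resolved` is an assumed flag that is
// set once the author or a human reviewer has addressed the finding.
type Severity = "critical" | "major" | "minor" | "nit";

interface GateFinding {
  severity: Severity;
  message: string;
  resolved: boolean;
}

interface GateResult {
  decision: "block" | "warn" | "pass";
  reasons: string[];
}

// Block on any unresolved critical finding; warn on other unresolved findings.
function mergeGate(findings: GateFinding[]): GateResult {
  const open = findings.filter((f) => !f.resolved);
  const critical = open.filter((f) => f.severity === "critical");
  if (critical.length > 0) {
    return { decision: "block", reasons: critical.map((f) => f.message) };
  }
  if (open.length > 0) {
    return { decision: "warn", reasons: open.map((f) => f.message) };
  }
  return { decision: "pass", reasons: [] };
}

// Sample data: the critical finding is already resolved, the minor one is not.
const result = mergeGate([
  { severity: "critical", message: "Missing authz check", resolved: true },
  { severity: "minor", message: "Unclear variable name", resolved: false },
]);
console.log(result.decision); // "warn"
```

Keeping the gate free of I/O makes it easy to unit-test separately from whatever CI step reads the findings and sets the PR status.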
Applications
Claude Code /review slash command for reviewing a PR diff.
GitHub Actions + Anthropic API for automated PR comments.
CodeRabbit / Greptile for contextual review.
Local pre-commit AI lint (claude-code, cursor).
Security-focused AI scanners (Snyk + LLM, Semgrep + LLM).
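// scripts/ai-review.ts
// Sends the PR diff to Claude and posts the returned findings as PR comments.
// extractJson and postPRComments are helpers defined elsewhere.
// REVIEW_RUBRIC (next section) must be declared before this call runs.
import Anthropic from '@anthropic-ai/sdk';
import { execSync } from 'child_process';

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
const diff = execSync('git diff origin/main...HEAD').toString();

const res = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  system: [{
    type: 'text',
    text: REVIEW_RUBRIC,
    cache_control: { type: 'ephemeral' }, // prompt cache: the rubric is reused across PRs
  }],
  messages: [{
    role: 'user',
    content: `Review this diff. Output JSON array of findings.\n\n${diff}`,
  }],
});

// Content blocks are a union type, so narrow to a text block before reading .text.
const first = res.content[0];
if (first.type !== 'text') throw new Error('Expected a text block');
const findings = JSON.parse(extractJson(first.text));
postPRComments(findings);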
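The script calls extractJson without defining it. One possible shape for that helper is sketched below, assuming the model may wrap the JSON array in surrounding prose. The strategy used here (take the span from the first "[" to the last "]") is an assumption for illustration, not something the source specifies.

```typescript
// Hypothetical helper: pull the JSON array out of a model reply that may
// include surrounding prose.
function extractJson(reply: string): string {
  const start = reply.indexOf("[");
  const end = reply.lastIndexOf("]");
  if (start === -1 || end <= start) {
    throw new Error("No JSON array found in model reply");
  }
  const candidate = reply.slice(start, end + 1);
  JSON.parse(candidate); // throws if the span is not valid JSON
  return candidate;
}

// Sample reply with prose before and after the array.
const reply =
  'Here are the findings:\n[{"file":"a.ts","line":3,"severity":"nit"}]\nDone.';
const findings = JSON.parse(extractJson(reply)) as Array<{ file: string }>;
console.log(findings[0].file); // "a.ts"
```

This bracket-scanning approach fails when the surrounding prose itself contains square brackets. In that case the inner JSON.parse throws, which at least surfaces the problem instead of posting malformed comments.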
Review rubric (cached system prompt)
const REVIEW_RUBRIC = `
You are a senior code reviewer. For each issue output:
{
  "file": string,
  "line": number,
  "severity": "critical" | "major" | "minor" | "nit",
  "category": "security" | "bug" | "perf" | "style" | "test",
  "message": string,
  "suggestion": string // optional patch
}
Critical (block merge):
- SQL injection, XSS, path traversal, secret leak
- Null deref on user-reachable path
- Missing auth/authz check
Skip (do NOT comment):
- Style preferences without lint rule
- Architectural opinions (human's job)
- Speculative perf without measurement
`;
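Because the rubric asks the model for a specific JSON shape, it may help to validate each finding before posting it. The sketch below mirrors the rubric's fields in a TypeScript type guard. The names Finding and isFinding are hypothetical, and the strictness (for example, requiring an integer line) is an assumption.

```typescript
type Severity = "critical" | "major" | "minor" | "nit";
type Category = "security" | "bug" | "perf" | "style" | "test";

interface Finding {
  file: string;
  line: number;
  severity: Severity;
  category: Category;
  message: string;
  suggestion?: string; // optional patch
}

const SEVERITIES: readonly string[] = ["critical", "major", "minor", "nit"];
const CATEGORIES: readonly string[] = ["security", "bug", "perf", "style", "test"];

// Hypothetical type guard: true only when the value matches the rubric's shape.
function isFinding(value: unknown): value is Finding {
  if (typeof value !== "object" || value === null) return false;
  const { file, line, severity, category, message, suggestion } =
    value as Record<string, unknown>;
  return (
    typeof file === "string" &&
    typeof line === "number" &&
    Number.isInteger(line) &&
    typeof severity === "string" &&
    SEVERITIES.includes(severity) &&
    typeof category === "string" &&
    CATEGORIES.includes(category) &&
    typeof message === "string" &&
    (suggestion === undefined || typeof suggestion === "string")
  );
}

// Sample data: the second entry has a string line number and an unknown severity.
const raw: unknown[] = JSON.parse(
  '[{"file":"api.ts","line":12,"severity":"critical","category":"security","message":"SQL injection"},' +
    '{"file":"api.ts","line":"12","severity":"urgent","category":"security","message":"bad shape"}]',
);
const valid = raw.filter(isFinding);
console.log(valid.length); // 1
```

Dropping malformed entries rather than failing the whole run is a design choice; a stricter pipeline could treat any invalid finding as a signal that the prompt needs adjusting.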
# require-human-on-critical.yml
- name: Check AI critical findings
  run: |
    # Count findings whose severity is "critical"
    CRITICAL=$(jq '[.[] | select(.severity=="critical")] | length' findings.json)
    if [ "$CRITICAL" -gt 0 ]; then
      # Label the PR and fail the step so the merge is blocked
      gh pr edit "$PR" --add-label "needs-human-review"
      exit 1
    fi
Local pre-commit hook
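#!/bin/bash
# .git/hooks/pre-commit: quick local AI check of the staged diff
DIFF=$(git diff --cached)
[ -z "$DIFF" ] && exit 0

# $'\n\n' inserts real newlines; "\n" inside double quotes would stay literal.
claude --print --model claude-haiku-4-5 \
  "Review this staged diff for obvious bugs. Reply DONE if clean, else list issues:"$'\n\n'"$DIFF" \
  | tee /tmp/precommit-review.txt

# The hook exits with grep's status, so any reply not starting with DONE blocks the commit.
grep -qi "^DONE" /tmp/precommit-review.txt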
Reviewer dashboard (signal vs noise)
// Track how often humans accept AI suggestions; this feeds the rubric tuning loop.
interface ReviewMetric {
  prNumber: number;
  aiFindings: number;
  humanAccepted: number;   // resolved as "good catch"
  humanDismissed: number;  // marked "noise"
  humanMissed: number;     // human found, AI didn't
}
// acceptRate < 30%: the rubric is too noisy; tighten it.
// humanMissed > 0: the rubric is too narrow; broaden it.
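The two tuning rules in the comments can be turned into a small aggregate check. The sketch below assumes acceptRate means accepted findings divided by total AI findings across the sampled PRs, and treats a sample with no findings as not noisy. Both are assumptions, as are the tuneRubric name and the sample numbers.

```typescript
interface ReviewMetric {
  prNumber: number;
  aiFindings: number;
  humanAccepted: number;
  humanDismissed: number;
  humanMissed: number;
}

type Tuning = "tighten" | "broaden";

// Aggregate over PRs. acceptRate = accepted / total AI findings (assumed definition).
function tuneRubric(metrics: ReviewMetric[]): { acceptRate: number; actions: Tuning[] } {
  const total = metrics.reduce((n, m) => n + m.aiFindings, 0);
  const accepted = metrics.reduce((n, m) => n + m.humanAccepted, 0);
  const missed = metrics.reduce((n, m) => n + m.humanMissed, 0);
  // With no findings there is nothing to call noisy, so default to 1 (assumption).
  const acceptRate = total === 0 ? 1 : accepted / total;
  const actions: Tuning[] = [];
  if (acceptRate < 0.3) actions.push("tighten"); // rubric too noisy
  if (missed > 0) actions.push("broaden"); // rubric too narrow
  return { acceptRate, actions };
}

// Made-up sample data: 5 of 20 findings accepted, 1 issue missed by the AI.
const report = tuneRubric([
  { prNumber: 101, aiFindings: 8, humanAccepted: 2, humanDismissed: 6, humanMissed: 0 },
  { prNumber: 102, aiFindings: 12, humanAccepted: 3, humanDismissed: 9, humanMissed: 1 },
]);
console.log(report.acceptRate); // 0.25
console.log(report.actions.join(", ")); // "tighten, broaden"
```

Both actions can fire at once: a rubric can be noisy in one category while missing issues in another, so the two signals are reported independently.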
Decision criteria
| Situation | Approach |
|---|---|
| Small team / OSS | Claude Code Action (zero-config) |
| Enterprise / private | Self-hosted Anthropic API + GH Actions |
| Latency-critical | Pre-commit Haiku quick-check |
| Security-heavy | Semgrep + LLM context layer |
| Design-heavy review | Skip AI, pure human |
Default: Claude Code Action on the PR, plus a human reviewer for design.
When to use: mechanical passes (lint, security, naming, docstrings, test gaps); automatic PR description generation; commit message rewrites.
When not to use: architecture decisions; API design; business logic correctness; team mentoring. These remain the human's final call.
❌ Anti-patterns
Rubber stamp: blindly accepting AI suggestions, so false positives ship.
AI noise flood: commenting on every nit, which causes reviewer fatigue.
Bypass human: treating a green AI check as approval to merge, which leads to design rot.
No prompt cache: re-sending a large rubric on every PR; cost 10×.
Public diff leak: sending private code to an unconfigured third-party AI, which violates the secrets policy.
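One way to guard against the noise-flood anti-pattern is to filter findings before posting them. The sketch below drops nits and caps the number of comments per PR, most severe first. The selectComments name and the default cap of 10 are assumptions chosen for illustration.

```typescript
type Severity = "critical" | "major" | "minor" | "nit";

interface Finding {
  file: string;
  line: number;
  severity: Severity;
  message: string;
}

// Lower rank means more important.
const RANK: Record<Severity, number> = { critical: 0, major: 1, minor: 2, nit: 3 };

// Hypothetical filter: drop nits, then keep the most severe findings up to a cap.
function selectComments(findings: Finding[], maxComments = 10): Finding[] {
  return findings
    .filter((f) => f.severity !== "nit") // filter returns a copy, so sort is safe
    .sort((a, b) => RANK[a.severity] - RANK[b.severity])
    .slice(0, maxComments);
}

// Sample data: the nit is dropped and the critical finding is ordered first.
const posted = selectComments(
  [
    { file: "a.ts", line: 1, severity: "nit", message: "Prefer const" },
    { file: "a.ts", line: 9, severity: "minor", message: "Missing docstring" },
    { file: "b.ts", line: 4, severity: "critical", message: "Secret in source" },
  ],
  2,
);
console.log(posted.map((f) => f.severity).join(", ")); // "critical, minor"
```

A cap this simple could hide critical findings if there are more of them than the limit, so a real pipeline would likely exempt critical findings from the cap.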
🧪 Verification / duplicates
Verified (Anthropic Claude Code Action docs; GitHub blog "Copilot for PRs" 2025; CodeRabbit case studies).
Confidence: A.
🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: added Claude Code Action, cached rubric, and severity gating patterns |