---
id: wiki-2026-0508-predictive-refactoring
title: Predictive Refactoring
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [AI-predicted refactoring, refactoring suggestion, code smell prediction]
duplicate_of: none
source_trust_level: A
confidence_score: 0.85
verification_status: applied
tags: [refactoring, ai, code-quality, llm, static-analysis]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: typescript
  framework: claude-code-sdk
---

# Predictive Refactoring

## In one line
> **"Combine an LLM, static analysis, and git history to predict 'this will need fixing soon', then propose the fix as a PR."** Reactive code review becomes proactive. A core 2026 use case for Claude Opus 4.7 / GPT-5 paired with codebase RAG.

## Core

### What counts as predictive refactoring
- **Reactive**: an IDE quick-fix applied after the code is written.
- **Predictive**: analyze commit patterns, churn, and complexity drift to predict future hotspots, then open a PR ahead of time.
- **Continuous**: a background agent analyzes every push to main and produces a weekly digest.

### Signals
1. **Code churn**: how often the same file is committed to.
2. **Complexity drift**: cyclomatic / cognitive complexity trending toward a threshold.
3. **Test coverage erosion**: falling coverage in a module.
4. **Code-smell ML model**: trained detectors for long method, large class, and feature envy.
5. **Issue / bug correlation**: files where bugs keep occurring.
6. **AST embedding similarity drift**: a module drifting away from the patterns used by the other modules in the codebase.

### Combining ML and the LLM
- **Static feature extraction** (tree-sitter AST) → vector.
- **History features** (git log) → churn time series.
- **LLM**: generates candidate refactorings, assesses risk, and writes the explanation.
- **Validation**: test run, type-check, and mutation testing.

### Trust gate
- Never auto-merge an LLM suggestion.
- Small / mechanical (rename, extract): high-confidence auto-PR.
- Architectural (split module, change interface): review request in RFC form.
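
The "complexity drift" signal can be made concrete with a simple trend estimate. The sketch below is one possible approach, not a prescribed method: it fits an ordinary least-squares line to a file's complexity history and extrapolates how long until a threshold is crossed. The `Sample` type and the names `slope` and `unitsUntilThreshold` are hypothetical, and the time unit (commits, days, weeks) is whatever the caller uses for `t`.

```typescript
// One complexity measurement for a single file at time t
// (hypothetical shape; t can be a commit index, day number, etc.).
type Sample = { t: number; complexity: number };

// Ordinary least-squares slope of complexity over time.
function slope(samples: Sample[]): number {
  const n = samples.length;
  if (n < 2) return 0;
  const meanT = samples.reduce((s, p) => s + p.t, 0) / n;
  const meanC = samples.reduce((s, p) => s + p.complexity, 0) / n;
  let num = 0;
  let den = 0;
  for (const p of samples) {
    num += (p.t - meanT) * (p.complexity - meanC);
    den += (p.t - meanT) * (p.t - meanT);
  }
  return den === 0 ? 0 : num / den;
}

// Estimated time units until the threshold is crossed.
// Returns 0 when the latest sample is already at or over the threshold,
// and Infinity when there is no data or the trend is flat or falling.
function unitsUntilThreshold(samples: Sample[], threshold: number): number {
  const last = samples[samples.length - 1];
  if (!last) return Infinity;
  if (last.complexity >= threshold) return 0;
  const k = slope(samples);
  return k > 0 ? (threshold - last.complexity) / k : Infinity;
}
```

A file whose estimate falls inside the planning horizon (say, the next few weeks) becomes a candidate for a pre-emptive PR; a linear fit is a deliberately crude choice and will misjudge files whose complexity changes in bursts.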
## 💻 Patterns

### Hotspot detection (churn × complexity)
```ts
type FileMetric = { path: string; churn30d: number; cyclomatic: number; bugs90d: number };
type NumericKey = Exclude<keyof FileMetric, "path">;

function rankHotspots(metrics: FileMetric[]) {
  // Largest value of a metric across all files. Falls back to 1 so the
  // normalization below never divides by zero (all-zero or empty input).
  const max = (k: NumericKey) => Math.max(0, ...metrics.map(m => m[k])) || 1;
  const Mc = max("churn30d"), Mx = max("cyclomatic"), Mb = max("bugs90d");

  return metrics
    .map(m => ({
      ...m,
      // Weighted sum of normalized signals: churn 40%, complexity 40%, bugs 20%.
      score:
        (m.churn30d / Mc) * 0.4 +
        (m.cyclomatic / Mx) * 0.4 +
        (m.bugs90d / Mb) * 0.2,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 20); // top 20 hotspots
}
```

### LLM refactoring suggestion (Anthropic TypeScript SDK)
```ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

async function suggestRefactor(filePath: string, source: string) {
  const res = await client.messages.create({
    model: "claude-opus-4-7",
    max_tokens: 4096,
    system: [{
      type: "text",
      text: "You are a senior staff engineer. Reject unsafe or large refactors. Propose small, testable steps.",
      cache_control: { type: "ephemeral" },
    }],
    messages: [{
      role: "user",
      content:
        `File: ${filePath}\n\n\`\`\`\n${source}\n\`\`\`\n\n` +
        "Return: 1) the single most urgent refactoring, 2) a unified diff, " +
        "3) a risk assessment, 4) suggested tests to add.",
    }],
  });
  // parseStructured is a project-local helper (not part of the SDK) that
  // turns the text response into a Suggestion object.
  return parseStructured(res);
}
```

### Auto-PR generation (mechanical refactor)
```ts
// git, ci, github, and prTemplate are assumed project-local wrappers;
// Suggestion is the structured output of the suggestion step.
async function autoRefactorPR(repo: string, suggestion: Suggestion) {
  // Only low-risk, mechanical changes qualify for an automatic PR.
  if (suggestion.risk !== "low" || suggestion.kind !== "mechanical") return;

  const branch = `refactor/predict-${suggestion.id}`;
  await git.createBranch(repo, branch);
  await git.applyDiff(repo, branch, suggestion.diff);

  const tests = await ci.run(repo, branch);
  if (!tests.green) return; // abort: never open a PR on a red build

  await github.openPR(repo, {
    base: "main",
    head: branch,
    title: `refactor: ${suggestion.title}`,
    body: prTemplate(suggestion),
    labels: ["predictive-refactor", "auto"],
    reviewers: [suggestion.codeOwner],
  });
}
```

### Risk classifier
```ts
function classifyRisk(s: Suggestion): "low" | "med" | "high" {
  if (s.linesChanged > 200 || s.filesTouched > 5) return "high";
  if (s.touchesPublicAPI || s.touchesMigration) return "high";
  if (s.changesBehavior) return "med";
  return "low"; // rename, extract local fn, dead code, type tightening
}
```

### Continuous monitor (GitHub Action)
```yaml
name: Predictive Refactoring
on:
  schedule: [{ cron: "0 4 * * 1" }] # every Monday at 04:00 UTC
  workflow_dispatch: {}
permissions:
  contents: write        # needed to push the digest branch
  pull-requests: write   # needed to open the PR
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 } # full history for churn analysis
      # Placeholder AST dump; adapt paths and output format to your grammar setup.
      - run: npx tree-sitter parse . > ast.json
      - run: node scripts/hotspots.js > hotspots.json
      # Input names are an assumption; verify them against the README of the
      # action version you pin.
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: "Read hotspots.json and write a refactoring digest to digest.md."
      - uses: peter-evans/create-pull-request@v6
        with:
          title: "weekly: predictive refactoring digest"
          body-path: digest.md
```

### Mutation-test gated merge
```bash
# Check that the refactor preserves behavior
npx stryker run --reporters dashboard,json
# Allow the merge only when the mutation score is >= the pre-refactor score
```

## Decision criteria
| Situation | Approach |
|---|---|
| Mechanical (rename, extract) | LLM auto-PR + CI green |
| Behavior must be preserved | Mutation testing gate |
| API change | RFC + human review |
| Cross-module | RFC + ADR + staged plan |
| Legacy module with no tests | Test scaffold first, then refactor |
| No hotspots found | Lower the churn/complexity thresholds |

**Default**: weekly digest + low-risk auto-PR + high-risk RFC.

## 🔗 Graph
- Parent: [[Refactoring_Best_Practices|Refactoring]]
- Variant: [[Automated Refactoring Tools]]
- Application: [[Claude Code]]
- Adjacent: [[Code Smells]] · [[Code Churn]] · [[Mutation Testing]] · [[Static Analysis]]

## 🤖 LLM usage
**When**: codebase >50k LoC, ongoing maintenance, hotspots that follow recognizable patterns.
**When not**: prototypes, throwaway code, very small codebases; the overhead outweighs the benefit.

## ❌ Anti-patterns
- **Auto-merging LLM diffs**: hallucinations and behavior breaks. → Always require CI plus PR review.
- **Refactoring without tests**: no defense against mutations or regressions.
- **Big-bang refactor**: ignores small steps.
- **Chasing hotspots by score alone**: high churn can simply reflect healthy domain change in that file.
- **No code-owner routing**: PRs flood reviewers.
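
The rule from the mutation-test gate (merge only when the mutation score has not dropped) reduces to a small comparison. The following is a minimal sketch under stated assumptions: `MutationCounts`, `mutationScore`, and `allowMerge` are hypothetical names, the counts are assumed to have been extracted already from a mutation report for the pre- and post-refactor runs, and the score is taken as detected mutants (killed plus timed out) over all valid mutants.

```typescript
// Mutant counts for one mutation-testing run (hypothetical shape; how these
// are read from the report is left to the caller).
type MutationCounts = {
  killed: number;
  timeout: number;
  survived: number;
  noCoverage: number;
};

// Percentage of valid mutants that the test suite detected.
// Returns 0 when there are no valid mutants at all.
function mutationScore(c: MutationCounts): number {
  const detected = c.killed + c.timeout;
  const total = detected + c.survived + c.noCoverage;
  return total === 0 ? 0 : (detected / total) * 100;
}

// Gate: allow the merge only if the post-refactor score is at least the
// pre-refactor score, minus an optional tolerance in percentage points.
function allowMerge(
  before: MutationCounts,
  after: MutationCounts,
  tolerance = 0,
): boolean {
  return mutationScore(after) + tolerance >= mutationScore(before);
}
```

A small non-zero `tolerance` may be worth allowing in practice, since a refactor that changes the number of generated mutants can shift the score slightly without any loss of test strength.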
## 🧪 Verification / Duplicates
- Verified (Microsoft Research churn-bug correlation, "Your Code as a Crime Scene" by Adam Tornhill, Anthropic Claude Code SDK docs, Stryker mutation testing).
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: hotspot signals + LLM-assisted PR pipeline |