2nd/10_Wiki/Topics/Architecture/Predictive_Refactoring.md

---
id: wiki-2026-0508-predictive-refactoring
title: Predictive Refactoring
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [AI-predicted refactoring, refactoring suggestion, code smell prediction]
duplicate_of: none
source_trust_level: A
confidence_score: 0.85
verification_status: applied
tags: [refactoring, ai, code-quality, llm, static-analysis]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: typescript
  framework: claude-code-sdk
---

# Predictive Refactoring

## In one line
> **"Combine an LLM, static analysis, and git history to predict 'this will need fixing soon' and propose the fix as a PR."** Code review shifts from reactive to proactive. A core use case for 2026-era Claude Opus 4.7 / GPT-5 paired with codebase RAG.

## Core concepts

### What counts as predictive refactoring
- **Reactive**: IDE quick-fixes applied after the code is written.
- **Predictive**: analyze commit patterns, churn, and complexity drift to predict future hotspots, then open a PR ahead of time.
- **Continuous**: a background agent analyzes every push to main and rolls the results into a weekly digest.

### Signals
1. **Code churn**: how often the same file is committed to.
2. **Complexity drift**: cyclomatic / cognitive complexity trending toward a threshold.
3. **Test coverage erosion**: a module's coverage declining over time.
4. **Code-smell ML model**: trained detectors for long method, large class, and feature envy.
5. **Issue / bug correlation**: files where bugs occur frequently.
6. **AST embedding similarity drift**: a module drifting away from the patterns of other modules in the codebase.
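
The complexity-drift signal (item 2) can be made concrete by fitting a straight line through recent complexity samples and projecting when it would reach a threshold. A minimal sketch, assuming evenly spaced samples (for example, one per week) and a team-chosen `threshold`; `complexityDrift` and its field names are hypothetical and not taken from any library.

```ts
// Hypothetical helper: least-squares slope over evenly spaced complexity samples.
type Drift = { slope: number; stepsToThreshold: number | null };

function complexityDrift(samples: number[], threshold: number): Drift {
  const n = samples.length;
  if (n < 2) return { slope: 0, stepsToThreshold: null };

  // x values are the sample indices 0..n-1.
  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((a, b) => a + b, 0) / n;

  let num = 0;
  let den = 0;
  samples.forEach((y, x) => {
    num += (x - meanX) * (y - meanY);
    den += (x - meanX) ** 2;
  });
  const slope = num / den;

  const latest = samples[n - 1];
  // Already at or over the threshold: zero steps remaining.
  if (latest >= threshold) return { slope, stepsToThreshold: 0 };
  // Flat or improving trend: no projected crossing.
  if (slope <= 0) return { slope, stepsToThreshold: null };
  // Project forward from the latest sample at the fitted rate.
  return { slope, stepsToThreshold: Math.ceil((threshold - latest) / slope) };
}
```

A file whose complexity rises by about two points per sample and sits four points below the threshold would be flagged as roughly two samples away from crossing it. A linear fit is the simplest possible trend model; noisy histories may need smoothing first.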

### Combining ML and LLMs
- **Static feature extraction** (tree-sitter AST) → feature vector.
- **History feature** (git log) → churn time series.
- **LLM**: generates candidate refactorings, assesses risk, and writes the explanation.
- **Validation**: test run + type-check + mutation testing.
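
As one illustration of the history feature, per-file churn can be counted from the text that `git log --since="30 days ago" --name-only --pretty=format:` prints (an empty format string leaves only the changed paths, with blank lines between commits). The sketch below is a pure function over that text; `churnFromGitLog` is a hypothetical name, and running git itself is left to the caller.

```ts
// Counts how many commits touched each file, given the output of:
//   git log --since="30 days ago" --name-only --pretty=format:
// Each commit lists a changed path at most once, so occurrences equal commits.
function churnFromGitLog(logOutput: string): Map<string, number> {
  const churn = new Map<string, number>();
  for (const line of logOutput.split("\n")) {
    const path = line.trim();
    if (path === "") continue; // blank lines separate commits
    churn.set(path, (churn.get(path) ?? 0) + 1);
  }
  return churn;
}
```

Running the same count over successive time windows would give the churn time series mentioned above.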

### Trust gate
- Never auto-merge an LLM suggestion.
- Small / mechanical (rename, extract): high-confidence auto-PR.
- Architectural (split module, change interface): review request in RFC form.
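
The gate above can be sketched as a small routing function. This is an illustrative assumption, not a prescribed policy: the `Kind`, `Risk`, and `Route` types and the intermediate `"human-review"` route for mechanical changes that are not low risk are hypothetical additions. No route merges anything automatically.

```ts
type Kind = "mechanical" | "architectural";
type Risk = "low" | "med" | "high";
type Route = "auto-pr" | "human-review" | "rfc";

// Decides how a suggestion is surfaced; every route still ends in human review.
function routeSuggestion(kind: Kind, risk: Risk): Route {
  // Architectural or high-risk changes always go through an RFC.
  if (kind === "architectural" || risk === "high") return "rfc";
  // Mechanical and low risk: open a PR automatically.
  if (risk === "low") return "auto-pr";
  // Mechanical but not low risk (assumed middle path): ask a human first.
  return "human-review";
}
```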

## 💻 Patterns

### Hotspot detection (churn × complexity)
```ts
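// Per-file inputs: 30-day churn, cyclomatic complexity, and 90-day bug count.
type FileMetric = { path: string; churn30d: number; cyclomatic: number; bugs90d: number };
type NumericKey = "churn30d" | "cyclomatic" | "bugs90d";

function rankHotspots(metrics: FileMetric[]) {
  // Largest value per metric, used to normalize to 0..1; falls back to 1 to avoid dividing by zero.
  const max = (k: NumericKey) => Math.max(...metrics.map(m => m[k])) || 1;
  const Mc = max("churn30d"), Mx = max("cyclomatic"), Mb = max("bugs90d");
  return metrics
    .map(m => ({
      ...m,
      // Weighted sum: churn 40%, complexity 40%, bug history 20%.
      score: (m.churn30d / Mc) * 0.4 +
             (m.cyclomatic / Mx) * 0.4 +
             (m.bugs90d / Mb) * 0.2,
    }))
    .sort((a, b) => b.score - a.score) // highest score first
    .slice(0, 20);                      // keep the top 20 files
}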
```

### LLM refactoring suggestion (Anthropic TypeScript SDK)
```ts
import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment by default.
const client = new Anthropic();

async function suggestRefactor(filePath: string, source: string) {
  const res = await client.messages.create({
    model: "claude-opus-4-7",
    max_tokens: 4096,
    system: [{
      type: "text",
      text: "You are a senior staff engineer. Reject unsafe or large refactors. Propose small, testable steps.",
      cache_control: { type: "ephemeral" }, // marks the static system prompt as cacheable
    }],
    messages: [{
      role: "user",
      content: `File: ${filePath}\n\n\`\`\`\n${source}\n\`\`\`\n\n` +
               "Return: 1) the single most urgent refactoring, 2) a unified diff, 3) a risk assessment, 4) suggested tests to add.",
    }],
  });
  // parseStructured is a project-specific helper (not part of the SDK)
  // that turns the text response into a structured suggestion.
  return parseStructured(res);
}
```
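
`parseStructured` is not defined in this note. One possible shape for it, as a hedged sketch: it concatenates the text blocks of the response and pulls out the first fenced block labelled `diff` or `patch`. This assumes the prompt asks the model to fence its diff that way; `MessageLike`, `ParsedSuggestion`, and `FENCE` are hypothetical names, and mapping the result onto a full `Suggestion` is left out.

```ts
// Minimal shape of the parts of a Messages API response this sketch reads.
type ContentBlockLike = { type: string; text?: string };
type MessageLike = { content: ContentBlockLike[] };

type ParsedSuggestion = { raw: string; diff: string | null };

// Three backticks, built at runtime so this snippet can sit inside a Markdown fence.
const FENCE = "`".repeat(3);
const DIFF_BLOCK = new RegExp(FENCE + "(?:diff|patch)\\n([\\s\\S]*?)" + FENCE);

function parseStructured(res: MessageLike): ParsedSuggestion {
  // Concatenate all text blocks; other block types are ignored.
  const raw = res.content
    .filter(b => b.type === "text" && typeof b.text === "string")
    .map(b => b.text as string)
    .join("\n");

  // Take the first fenced diff/patch block, if the model produced one.
  const match = raw.match(DIFF_BLOCK);
  return { raw, diff: match ? match[1].trimEnd() : null };
}
```

A `null` diff is a useful signal in itself: the suggestion can be dropped or routed to a human rather than applied.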

### Auto-PR generation (mechanical refactor)
```ts
// git, ci, github, prTemplate, and Suggestion are project-level helpers assumed to exist elsewhere.
async function autoRefactorPR(repo: string, suggestion: Suggestion) {
  // Trust gate: only low-risk, mechanical changes get an automatic PR.
  if (suggestion.risk !== "low" || suggestion.kind !== "mechanical") return;

  const branch = `refactor/predict-${suggestion.id}`;
  await git.createBranch(repo, branch);
  await git.applyDiff(repo, branch, suggestion.diff);

  const tests = await ci.run(repo, branch);
  if (!tests.green) return; // abort: never open a PR on a red build

  // The PR still goes to a human reviewer; nothing is merged automatically.
  await github.openPR(repo, {
    base: "main", head: branch,
    title: `refactor: ${suggestion.title}`,
    body: prTemplate(suggestion),
    labels: ["predictive-refactor", "auto"],
    reviewers: [suggestion.codeOwner],
  });
}
```

### Risk classifier
```ts
function classifyRisk(s: Suggestion): "low" | "med" | "high" {
  if (s.linesChanged > 200 || s.filesTouched > 5) return "high";
  if (s.touchesPublicAPI || s.touchesMigration) return "high";
  if (s.changesBehavior) return "med";
  return "low"; // e.g. rename, extract local function, dead-code removal, type tightening
}
```

### Continuous monitor (GitHub Action)
```yaml
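name: Predictive Refactoring
on:
  schedule: [{ cron: "0 4 * * 1" }]   # every Monday at 04:00 UTC
  workflow_dispatch: {}
permissions:
  contents: write          # needed by create-pull-request to push the branch
  pull-requests: write
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }       # full history is needed for churn metrics
      # Placeholder for the AST feature-extraction step; adapt to your parser setup.
      - run: npx tree-sitter parse . > ast.json
      - run: node scripts/hotspots.js > hotspots.json
      - uses: anthropics/claude-code-action@v1
        with:
          # Input names assumed from the v1 action; verify against its README.
          prompt: "Read hotspots.json and write a refactoring digest to digest.md."
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
      - uses: peter-evans/create-pull-request@v6
        with:
          title: "weekly: predictive refactoring digest"
          body-path: digest.md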
```

### Mutation-test gated merge
```bash
# Check that the refactor preserves behavior
npx stryker run --reporters dashboard,json
# Allow the merge only if the mutation score is >= the pre-refactor score
```
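
The comparison in the last comment can be sketched as follows. The score formula (detected mutants over valid mutants, where detected = killed + timeout and undetected = survived + no coverage) follows Stryker's documented metric; `MutantCounts`, `mutationScore`, and `allowMerge` are hypothetical names, and how the counts are read from the JSON report is left to the caller.

```ts
// Counts per mutant status, summarized from a mutation-testing report.
type MutantCounts = { killed: number; timeout: number; survived: number; noCoverage: number };

// Score = detected / (detected + undetected), as a percentage.
function mutationScore(c: MutantCounts): number {
  const detected = c.killed + c.timeout;
  const undetected = c.survived + c.noCoverage;
  const valid = detected + undetected;
  // No valid mutants: treat as 0 here so an empty report never passes by accident.
  return valid === 0 ? 0 : (detected / valid) * 100;
}

// Gate: the refactored branch must not score lower than the baseline.
function allowMerge(before: MutantCounts, after: MutantCounts): boolean {
  return mutationScore(after) >= mutationScore(before);
}
```

Comparing against the pre-refactor baseline rather than a fixed threshold keeps the gate usable on modules whose test suites start out weak.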

## Decision criteria
| Situation | Approach |
|---|---|
| Mechanical (rename, extract) | LLM auto-PR + CI green |
| Behavior must be preserved | Mutation testing gate |
| API change | RFC + human review |
| Cross-module | RFC + ADR + staged plan |
| Legacy module without tests | Scaffold tests first, then refactor |
| No hotspots found | Lower the churn/complexity thresholds |

**Default**: weekly digest + auto-PR for low-risk changes + RFC for high-risk changes.

## 🔗 Graph
- Parent: [[Refactoring_Best_Practices|Refactoring]]
- Variant: [[Automated Refactoring Tools]]
- Application: [[Claude Code]]
- Adjacent: [[Code Smells]] · [[Code Churn]] · [[Mutation Testing]] · [[Static Analysis]]

## 🤖 When to use an LLM
**When**: codebase > 50k LoC, ongoing maintenance, hotspots that follow recognizable patterns.
**When not**: prototypes, throwaway code, very small codebases. The overhead outweighs the benefit.

## ❌ Anti-patterns
- **Auto-merging an LLM diff**: risks hallucinated changes and behavior breaks. → Always require CI and PR review.
- **Refactoring without tests**: no defense against regressions, and nothing for mutation testing to measure.
- **Big-bang refactor**: ignores the small-steps rule.
- **Chasing hotspots by score alone**: high churn may simply reflect healthy domain change in that file.
- **No code-owner routing**: reviewers get flooded with PRs.
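
One way to avoid the PR flood in the last item is to cap how many suggestions reach each code owner per cycle, keeping the highest-scoring ones. A minimal sketch; `Pending`, `batchByOwner`, and the `maxPerOwner` cap are hypothetical and not part of any tool mentioned in this note.

```ts
type Pending = { id: string; codeOwner: string; score: number };

// Groups suggestions by code owner and keeps at most maxPerOwner per owner,
// preferring higher hotspot scores. The input array is not mutated.
function batchByOwner(pending: Pending[], maxPerOwner: number): Map<string, Pending[]> {
  const byOwner = new Map<string, Pending[]>();
  const sorted = [...pending].sort((a, b) => b.score - a.score);
  for (const s of sorted) {
    const queue = byOwner.get(s.codeOwner) ?? [];
    if (queue.length < maxPerOwner) {
      queue.push(s);
      byOwner.set(s.codeOwner, queue);
    }
  }
  return byOwner;
}
```

Suggestions that miss the cap can be carried into the next weekly digest instead of being opened as PRs.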

## 🧪 Verification / Duplicates
- Verified (Microsoft Research churn-bug correlation, "Your Code as a Crime Scene" by Adam Tornhill, Anthropic Claude Code SDK docs, Stryker mutation testing).
- Trust level A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — hotspot signals + LLM-assisted PR pipeline |