id: wiki-2026-0508-llm-based-code-analysis
title: LLM-based Code Analysis
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: AI Code Review, LLM Code Review, AI-augmented Static Analysis
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: llm, code-review, static-analysis, ai-tooling, devx
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: any; framework: claude/gpt/cursor/cody

LLM-based Code Analysis

One-liner

"An LLM reads intent." The AST captures syntax; the LLM captures semantics and naming. A real review needs both layers combined.

Key points

Two layers

  • Deterministic (AST/SAST): ESLint, Semgrep, CodeQL for taint, null, and type checks
  • Probabilistic (LLM): Claude/GPT for naming, design, "why does this function exist?", architectural smells
  • The two are complementary. LLM-only review floods you with false positives; AST-only review cannot see intent
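
One way to combine the two layers is to treat deterministic findings as authoritative and let LLM findings fill in only where the static tools were silent. The sketch below assumes a hypothetical `Finding` record and `merge_findings` helper; neither comes from a real tool, and real SAST output would need to be mapped into this shape first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical record shared by both layers."""
    file: str
    line: int
    message: str
    source: str  # "sast" or "llm"


def merge_findings(sast, llm):
    """Keep every deterministic finding; add an LLM finding only when
    SAST did not already flag the same file and line."""
    flagged = {(f.file, f.line) for f in sast}
    extra = [f for f in llm if (f.file, f.line) not in flagged]
    return sorted(list(sast) + extra, key=lambda f: (f.file, f.line))
```

Deduplicating on file and line is a deliberately crude choice; it keeps reviewers from seeing the same issue twice, at the cost of occasionally hiding a distinct LLM remark on an already-flagged line.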

Applications

  1. PR review bot: diff → LLM → comment
  2. Refactor suggestions: e.g. "this function should be split"
  3. Semantic code search: Sourcegraph Cody, natural-language queries such as "where is auth validated"
  4. Doc generation: function → automatic docstring
  5. Bug hunt: "does this code have a race condition?"
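
For the doc generation case, a cheap deterministic pass can decide which functions are worth sending to the model at all. A minimal sketch, assuming a hypothetical helper name `functions_missing_docstrings`; it uses only the standard `ast` module and leaves the actual LLM call out.

```python
import ast


def functions_missing_docstrings(src):
    """Return the names of functions in src that have no docstring."""
    tree = ast.parse(src)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and ast.get_docstring(node) is None
    ]
```

Only the functions returned here would need a docstring prompt, which keeps the request count proportional to the undocumented code rather than to the whole file.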

💻 Patterns

Pattern 1: PR review with Claude

# Python script run by a workflow such as .github/workflows/claude-review.yml
import os

import anthropic
from github import Github


def review_pr(pr_number: int) -> None:
    """Post a single LLM-written review comment on the given pull request."""
    gh = Github(os.environ["GH_TOKEN"])
    pr = gh.get_repo(os.environ["REPO"]).get_pull(pr_number)
    # Files without a textual patch (e.g. binaries) are skipped.
    diff_text = "\n".join(
        f"{f.filename}\n{f.patch}" for f in pr.get_files() if f.patch
    )

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    msg = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=2000,
        system="You are a senior reviewer. Comment only on real issues. Skip nits.",
        messages=[{"role": "user", "content": f"Review this diff:\n{diff_text}"}],
    )
    pr.create_issue_comment(msg.content[0].text)
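
Sending every patch in one prompt works for small PRs but runs into context limits on larger ones. A rough sketch of batching patches under a character budget follows; `batch_patches` and the `max_chars` default are hypothetical, and a character count is only a stand-in for a real token count.

```python
def batch_patches(patches, max_chars=12000):
    """Group (filename, patch) pairs into prompt-sized text batches.

    A single entry larger than max_chars still gets its own batch,
    so callers may want to truncate or summarize such files separately.
    """
    batches, current, size = [], [], 0
    for filename, patch in patches:
        entry = f"{filename}\n{patch}"
        if current and size + len(entry) > max_chars:
            batches.append("\n".join(current))
            current, size = [], 0
        current.append(entry)
        size += len(entry)
    if current:
        batches.append("\n".join(current))
    return batches
```

Each batch could then be sent as its own review request, with the replies concatenated into one PR comment.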

Pattern 2: AST + LLM hybrid

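import ast


def find_long_functions(src, max_lines=50):
    """Return function nodes that span more than max_lines lines."""
    tree = ast.parse(src)
    return [n for n in ast.walk(tree)
            if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
            and (n.end_lineno - n.lineno) > max_lines]


# The AST narrows the candidates; the LLM then analyzes intent.
# ask_llm is a placeholder for your own client call (not defined here).
with open("app.py") as fh:
    src = fh.read()  # keep the source: get_source_segment needs it
for fn in find_long_functions(src):
    snippet = ast.get_source_segment(src, fn)
    ask_llm(f"Why is this function long? Should it be split?\n{snippet}")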

Pattern 3: Cursor / Continue inline review

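// .cursor/rules (illustrative schema only; these keys are not a documented Cursor format, whose rule files are written as prose instructions)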
{
  "review": {
    "trigger": "on_save",
    "prompt": "Flag: missing null check, magic number, leaky abstraction. Be terse."
  }
}

Pattern 4: Semantic code search (Cody CLI)

# CLI
cody chat "find where the user session is validated"
# → ranks files by semantic match, not grep

Pattern 5: Cost guard for LLM review

# Large PRs go file by file; small ones go in a single request.
def chunk_strategy(diff_lines):
    if diff_lines < 200:
        return "single"
    if diff_lines < 1000:
        return "per_file"
    return "summary_only"  # very large PRs get a high-level summary only
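
The strategy above needs a line count as input. One possible way to get it from unified-diff text is sketched below; `count_changed_lines` is a hypothetical helper, and it assumes file headers either appear before the first `@@` hunk or are preceded by a `diff --git` line, which holds for `git diff` output.

```python
def count_changed_lines(patch):
    """Count added and removed lines inside the hunks of a unified diff."""
    count = 0
    in_hunk = False
    for line in patch.splitlines():
        if line.startswith("diff --git"):
            in_hunk = False  # a new file header begins
        elif line.startswith("@@"):
            in_hunk = True  # hunk body follows
        elif in_hunk and line.startswith(("+", "-")):
            count += 1
    return count
```

Counting only inside hunks avoids treating the `---` and `+++` header lines as changes.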

Pattern 6: Prompt for naming smell

You are reviewing variable/function names. Flag ONLY:
- Unclear (data, info, tmp, x)
- Lying (getUser that mutates)
- Inconsistent with rest of codebase
Output JSON: [{file, line, suggestion}]
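
Because the prompt asks for JSON, the reply can be validated before anything is posted to the PR. A minimal sketch, assuming the model returns a bare JSON list with the three keys named in the prompt; `parse_naming_findings` is a hypothetical helper, and replies wrapped in extra prose would be rejected rather than repaired.

```python
import json

REQUIRED_KEYS = {"file", "line", "suggestion"}


def parse_naming_findings(raw):
    """Return the well-formed findings from a model reply.

    Returns [] when the reply is not valid JSON or not a list;
    list items missing any required key are dropped.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    return [
        item for item in data
        if isinstance(item, dict) and REQUIRED_KEYS.issubset(item)
    ]
```

Dropping malformed items silently is a design choice for a review bot; a stricter pipeline might log them or retry the request instead.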

Pattern 7: Reject auto-merge if LLM finds blocker

- name: LLM gate
  run: python review.py --severity-threshold blocker
  # exit 1 if any "blocker" found
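
The gating logic inside a script like `review.py` could look roughly like this. The severity scale, the `gate_exit_code` name, and the finding shape (a dict with a `severity` key) are all assumptions for illustration; the original does not show the script.

```python
# Hypothetical severity scale, lowest to highest.
SEVERITY_ORDER = ["nit", "minor", "major", "blocker"]


def gate_exit_code(findings, threshold="blocker"):
    """Return 1 if any finding is at or above the threshold, else 0."""
    limit = SEVERITY_ORDER.index(threshold)
    for finding in findings:
        if SEVERITY_ORDER.index(finding["severity"]) >= limit:
            return 1
    return 0
```

A script would pass the result to `sys.exit`, so the CI step fails and blocks auto-merge only when the threshold is reached.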

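Decision criteria

| Situation | Approach |
|---|---|
| Type / null / taint detection | AST/SAST (deterministic) |
| Design / naming / intent | LLM |
| Both needed | Hybrid (AST picks candidates → LLM analyzes) |
| Large PR (>1k lines) | Summary only; per-file review makes cost explode |
| Security critical | CodeQL primary, LLM secondary |

Default: Semgrep + Claude review bot; only blockers block the PR.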

🔗 Graph

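🤖 LLM usage

When: intent/design review, naming, refactor suggestions, natural-language code search. When not: security-critical checks (CodeQL/Semgrep first), deterministic verification (type checker), hot-path latency.

Anti-patterns

  • Trusting LLM output 100% → flood of false positives, reviewer fatigue
  • LLM only, no AST → cost explosion, missed deterministic checks
  • Commenting on every nit → lower signal-to-noise ratio
  • Whole diff in one prompt → context limit, cost
  • Sending code from a public repo to an LLM with unredacted secrets still in it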
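
To guard against the last item, diff text can be scrubbed before it leaves the CI runner. The sketch below is only a starting point: the two regexes (quoted values assigned to key-like names, and the AWS access key ID prefix format) are illustrative assumptions, and a dedicated secret scanner would catch far more.

```python
import re

# Quoted values assigned to names containing api_key, secret, token, or password.
ASSIGNMENT = re.compile(
    r"(?i)((?:api[_-]?key|secret|token|password)\s*[:=]\s*)(['\"])[^'\"]+\2"
)
# AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def redact_secrets(text):
    """Replace likely secret values with a [REDACTED] placeholder."""
    text = ASSIGNMENT.sub(r"\1\2[REDACTED]\2", text)
    return AWS_KEY_ID.sub("[REDACTED]", text)
```

Running this on the diff before building the prompt keeps the variable names visible for review while hiding the values.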

🧪 Verification / Duplicates

  • Verified (Anthropic Claude API, Cursor docs, Sourcegraph Cody, Semgrep). Trust level A.

🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: hybrid AST+LLM, PR review bot patterns |