id: wiki-2026-0508-ai-생성-코드-검증-ai-code-assurance
title: AI Code Assurance (AI 생성 코드 검증)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: AI Code Assurance, AI 생성 코드 검증, generated code review, vibe coding QA
duplicate_of: none
source_trust_level: B
confidence_score: 0.85
verification_status: conceptual
tags: ai-code-quality, sast, code-review, generated-code, devsecops, copilot-review, hallucination-detection
raw_sources:
last_reinforced: 2026-05-09
github_commit: pending
inferred_by: Claude Opus 4.7 (manual cleanup 2026-05-09)

AI Code Assurance (AI 생성 코드 검증, AI-generated code verification)

📌 One-line insight

AI-generated code brings inconsistent quality, hallucinated APIs, and its own classes of vulnerability. Verify every PR with a hybrid of SAST, LLM-as-judge, and human review. In vibe coding, trust is not verification.

📖 Core

Risks of AI-generated code

1. Inconsistent style

  • Output differs from prompt to prompt.
  • Codebase conventions are ignored.
  • Patterns get mixed within one codebase.

2. Hallucinated API

  • Functions that do not exist.
  • Deprecated APIs.
  • Wrong package versions.

3. Security vulnerability

  • Known CWE / OWASP patterns.
  • Outdated security practices.
  • Prompt-injection-prone patterns reproduced.

4. Subtle bug

  • Off-by-one errors.
  • Race conditions.
  • Missed null checks.

5. Over-engineered

  • Unnecessary abstraction.
  • Boilerplate.

6. Under-tested

  • Happy path only.
  • Missed edge cases.
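
The "wrong package version" risk under Hallucinated API can be checked mechanically. The following is a minimal sketch, assuming the expected versions are available as a plain dict of exact pins; the helper names `installed_version` and `check_pins` are illustrative and not part of any tool named in this note.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins: dict[str, str]) -> list[str]:
    """Compare expected exact versions (hypothetical input) with the environment."""
    problems = []
    for package, expected in pins.items():
        actual = installed_version(package)
        if actual is None:
            problems.append(f"{package}: not installed")
        elif actual != expected:
            problems.append(f"{package}: expected {expected}, found {actual}")
    return problems
```

Running `check_pins` on the versions that generated code claims to target surfaces mismatches before they turn into runtime errors.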

Verification layers

Layer 1: Compile / type check

  • Strict mode in TypeScript / Rust / Go.
  • Catches hallucinated symbols at build time.

Layer 2: Lint

  • Enforces style.
  • ESLint / clippy / Pylint.

Layer 3: SAST

  • Security patterns.
  • Snyk / Semgrep / Sonar.

Layer 4: Test

  • Unit / integration.
  • Coverage of the generated code.

Layer 5: AI review (CodeRabbit)

  • First pass on every PR.
  • Detects hallucinations.

Layer 6: Human review

  • Logic / architecture.
  • Critical paths.

Layer 7: Production monitoring

  • Error rate.
  • Anomalies.

→ Each layer catches a different defect class.
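
The automated layers can be chained so that cheap checks run first and later layers are skipped once one fails. This is only a sketch: the `run_layers` helper and the lambda stand-ins are assumptions for illustration, and a real pipeline would shell out to the actual type checker, linter, and SAST tool.

```python
from dataclasses import dataclass
from typing import Callable

Check = Callable[[], tuple[bool, str]]  # returns (passed, detail)


@dataclass
class LayerResult:
    layer: str
    passed: bool
    detail: str = ""


def run_layers(layers: list[tuple[str, Check]],
               stop_on_failure: bool = True) -> list[LayerResult]:
    """Run checks in the given order; optionally stop at the first failure."""
    results = []
    for name, check in layers:
        passed, detail = check()
        results.append(LayerResult(name, passed, detail))
        if not passed and stop_on_failure:
            break
    return results


# Stand-in checks (hypothetical results, not real tool output).
layers = [
    ("type-check", lambda: (True, "")),
    ("lint", lambda: (True, "")),
    ("sast", lambda: (False, "possible SQL injection in query builder")),
    ("test", lambda: (True, "")),
]
results = run_layers(layers)
```

With `stop_on_failure=False` every layer runs, which is useful when the goal is a full report instead of a fast rejection.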

Quality gate

Pre-commit

  • Type check + lint + format.
  • Runs locally on each developer's machine.

CI / PR

  • Test pass.
  • SAST clean.
  • AI review approved.
  • Coverage threshold.

Pre-deploy

  • Integration test.
  • Performance regression.
  • Security scan.

Post-deploy

  • Alerts / SLOs.
  • Rollback plan.
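
The CI / PR gate above reduces to a small pure function. A minimal sketch, assuming the four inputs have already been collected from the CI run; the `PrChecks` fields and the 80% default are illustrative choices.

```python
from dataclasses import dataclass


@dataclass
class PrChecks:
    tests_passed: bool
    sast_findings: int
    ai_review_approved: bool
    coverage_pct: float


def pr_gate(checks: PrChecks, min_coverage: float = 80.0) -> list[str]:
    """Return blocking reasons; an empty list means the gate passes."""
    reasons = []
    if not checks.tests_passed:
        reasons.append("tests failing")
    if checks.sast_findings > 0:
        reasons.append(f"{checks.sast_findings} SAST finding(s)")
    if not checks.ai_review_approved:
        reasons.append("AI review not approved")
    if checks.coverage_pct < min_coverage:
        reasons.append(f"coverage {checks.coverage_pct:.1f}% < {min_coverage:.1f}%")
    return reasons
```

Returning the reasons instead of a bare boolean lets the CI job print exactly why a PR was blocked.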

Specific checks

Hallucination detection

  • Every import actually exists.
  • Every function signature is real.
  • Cross-reference against the documentation.
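import ast
import importlib.util

def check_imports(code: str) -> list[str]:
    """Return imported module names in `code` whose top-level package is not installed."""
    missing = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]  # `from x import y`; relative imports are skipped
        else:
            continue
        for name in names:
            # find_spec locates the top-level package without executing it.
            if importlib.util.find_spec(name.split(".")[0]) is None:
                missing.append(name)
    return missing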
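
The second item in the list, checking that a function signature is real, can be sketched with `inspect.signature`. Note that this version imports the module, so it is only suitable for dependencies that are already installed and trusted; `check_call` and its parameters are hypothetical names.

```python
import importlib
import inspect
from typing import Optional


def check_call(module_name: str, func_name: str, n_positional: int,
               keywords: tuple[str, ...] = ()) -> Optional[str]:
    """Return a problem description, or None if the call shape looks valid."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return f"{module_name}: module not found"
    func = getattr(module, func_name, None)
    if func is None or not callable(func):
        return f"{module_name}.{func_name}: no such callable"
    try:
        signature = inspect.signature(func)
    except (TypeError, ValueError):
        return None  # some C builtins expose no signature; cannot verify
    try:
        # bind() raises TypeError when the argument shape does not fit.
        signature.bind(*[None] * n_positional, **{k: None for k in keywords})
    except TypeError as exc:
        return f"{module_name}.{func_name}: {exc}"
    return None
```

For example, `check_call("json", "loads", 1)` passes, while a call to a non-existent `json.load_string` is reported.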

Security pattern

  • SQL injection (string concat).
  • XSS (HTML construction).
  • Hardcoded secret.
  • Unsafe deserialize.
  • Prompt injection (LLM call concatenation).
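
Two of these patterns (SQL built by string concatenation and hardcoded secrets) are simple enough to illustrate with Python's `ast` module. This is a deliberately narrow sketch, not a substitute for Snyk or Semgrep; the `scan` function and the secret-name regex are assumptions.

```python
import ast
import re

SECRET_NAME = re.compile(r"(password|secret|token|api_key)", re.IGNORECASE)


def scan(code: str) -> list[str]:
    """Flag dynamic SQL passed to .execute() and string literals bound to secret-like names."""
    findings = []
    for node in ast.walk(ast.parse(code)):
        # SQL built with +, %, or an f-string as the first argument of .execute(...)
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "execute"
                and node.args
                and isinstance(node.args[0], (ast.BinOp, ast.JoinedStr))):
            findings.append((node.lineno, "SQL built from dynamic string"))
        # Non-empty string literal assigned to a secret-looking variable name
        if (isinstance(node, ast.Assign)
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)
                and node.value.value):
            for target in node.targets:
                if isinstance(target, ast.Name) and SECRET_NAME.search(target.id):
                    findings.append((node.lineno, f"hardcoded secret in {target.id}"))
    return [f"line {line}: {message}" for line, message in sorted(findings)]


sample = '''API_KEY = "sk-test-123"
def find(cur, name):
    cur.execute("SELECT * FROM users WHERE name = '" + name + "'")
    cur.execute("SELECT * FROM users WHERE name = %s", (name,))
'''
report = scan(sample)
```

The parameterized query on the last line of the sample is not flagged, because its first argument is a plain string literal.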

Test coverage

  • Required coverage threshold (80%+ for new code).
  • Tests for generated code are often generated too, so review them as well.
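
A threshold "for new code" means measuring coverage only over the lines a PR changed. A minimal sketch, assuming the changed lines (from the diff) and the covered lines (from the coverage report) have already been extracted into per-file sets; `new_code_coverage` is an illustrative name.

```python
def new_code_coverage(changed: dict[str, set[int]],
                      covered: dict[str, set[int]]) -> float:
    """Percentage of changed lines that were executed by tests."""
    total = sum(len(lines) for lines in changed.values())
    if total == 0:
        return 100.0  # nothing changed, so nothing to cover
    hit = sum(len(lines & covered.get(path, set()))
              for path, lines in changed.items())
    return 100.0 * hit / total


changed = {"a.py": {1, 2, 3, 4}, "b.py": {10}}
covered = {"a.py": {1, 2, 3}}
pct = new_code_coverage(changed, covered)
```

Here three of the five changed lines are covered, so the PR would fall short of an 80% threshold even if overall project coverage is high.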

Organizational patterns

Explicit "AI-generated" labeling

  • Disclose in the PR description.
  • Tag the commit message.
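
A commit-message tag can be enforced by a hook or CI step. The sketch below assumes a hypothetical `AI-Assisted:` trailer in the final paragraph of the message; the trailer name is an illustration, not an established convention from this note.

```python
import re

TRAILER = re.compile(r"^(?P<key>[A-Za-z][A-Za-z-]*):\s*(?P<value>.+)$")


def ai_disclosure(commit_message: str) -> list[str]:
    """Return the values of AI-Assisted trailers (hypothetical key) in the last paragraph."""
    paragraphs = commit_message.strip().split("\n\n")
    if len(paragraphs) < 2:
        return []  # subject line only, so there is no trailer block
    tools = []
    for line in paragraphs[-1].splitlines():
        match = TRAILER.match(line.strip())
        if match and match["key"].lower() == "ai-assisted":
            tools.append(match["value"].strip())
    return tools
```

An empty result on a PR that was declared AI-generated is a signal to ask the author to amend the commit.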

Stricter review

  • Senior review for every AI-generated PR.
  • Deep verification of the logic.

Snippet attribution

  • License / source of Copilot suggestions.
  • Track the origin of the code.

Prompts as code

  • Commit prompts to git.
  • Reproducibility.

Metrics (DORA-like)

  • Share of PRs that are AI-generated (%).
  • Acceptance rate of AI suggestions.
  • AI-introduced bugs that escape to production.
  • Reviewer time-to-review.
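
These four metrics can be computed from per-PR records. A minimal sketch, assuming each PR has already been labeled and its counts collected; the `PullRequest` fields are hypothetical and would come from whatever tracking the team uses.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    ai_generated: bool
    suggestions_shown: int
    suggestions_accepted: int
    escaped_bugs: int        # bugs traced back to this PR after release
    review_minutes: float


def summarize(prs: list[PullRequest]) -> dict[str, float]:
    """Aggregate the four metrics; empty inputs yield zeros instead of dividing by zero."""
    ai_prs = [pr for pr in prs if pr.ai_generated]
    shown = sum(pr.suggestions_shown for pr in prs)
    accepted = sum(pr.suggestions_accepted for pr in prs)
    return {
        "ai_pr_pct": 100.0 * len(ai_prs) / len(prs) if prs else 0.0,
        "accept_rate_pct": 100.0 * accepted / shown if shown else 0.0,
        "ai_escaped_bugs": float(sum(pr.escaped_bugs for pr in ai_prs)),
        "mean_review_minutes": sum(pr.review_minutes for pr in prs) / len(prs) if prs else 0.0,
    }


stats = summarize([
    PullRequest(True, 10, 4, 1, 30.0),
    PullRequest(False, 0, 0, 0, 10.0),
])
```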

💻 Code

CI workflow (GitHub Actions)

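# .github/workflows/ai-code-check.yml
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      # Install dependencies so the npm scripts below can run
      - run: npm ci

      # Layer 1: type
      - run: npm run typecheck

      # Layer 2: lint
      - run: npm run lint

      # Layer 3: security (Snyk needs an API token stored as a repository secret)
      - uses: snyk/actions/setup@master
      - run: snyk code test
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}

      # Layer 4: test
      - run: npm test -- --coverage
      - uses: codecov/codecov-action@v3

      # Layer 5: AI review (CodeRabbit auto-runs)

      # Quality gate: fail below 80% line coverage.
      # Assumes the test runner writes coverage/coverage-summary.json
      # (Istanbul "json-summary" reporter); jq compares floats, unlike [[ -lt ]].
      - run: jq -e '.total.lines.pct >= 80' coverage/coverage-summary.json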

Hallucination check (TS / npm)

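import * as fs from 'fs';
import * as path from 'path';
import * as ts from 'typescript';

function checkImports(filePath: string): string[] {
  // Parsing is enough to list import declarations; no type checking is needed.
  const sourceFile = ts.createSourceFile(
    filePath, fs.readFileSync(filePath, 'utf8'), ts.ScriptTarget.Latest, true);
  const issues: string[] = [];

  ts.forEachChild(sourceFile, (node) => {
    if (ts.isImportDeclaration(node) && ts.isStringLiteral(node.moduleSpecifier)) {
      const moduleName = node.moduleSpecifier.text;
      // Relative imports of .ts files are not resolvable by Node; the type check covers them.
      if (moduleName.startsWith('.')) return;
      try {
        // Resolve packages starting from the importing file's directory, as Node would.
        require.resolve(moduleName, { paths: [path.dirname(path.resolve(filePath))] });
      } catch {
        issues.push(`Hallucinated: ${moduleName}`);
      }
    }
  });

  return issues;
}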

LLM-as-judge (verify generated code)

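import json

def verify_generated(code: str, intent: str, judge_llm) -> dict:
    """Ask a judge model to review generated code and return its parsed JSON verdict.

    judge_llm is assumed to be any client exposing complete(prompt: str) -> str.
    """
    prompt = f"""
You are a code reviewer. Verify the AI-generated code.

Intent: {intent}

Code:

{code}


Check:
1. Does it match intent?
2. Any hallucinated API/import?
3. Security issues?
4. Edge cases missing?
5. Style consistent?

Output JSON: {{"matches_intent": bool, "issues": [{{...}}]}}
"""
    # json.loads raises ValueError if the judge replies with anything but JSON.
    return json.loads(judge_llm.complete(prompt))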
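
Judge models often wrap their JSON in extra prose, so the raw reply is worth validating before it gates a PR. The following is a sketch of one defensive approach; `parse_verdict` is a hypothetical helper, and the two required keys are taken from the output format in the prompt above.

```python
import json


def parse_verdict(raw: str) -> dict:
    """Extract and validate the judge's JSON verdict; raise ValueError if unusable."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in judge output")
    try:
        verdict = json.loads(raw[start:end + 1])
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON from judge: {exc}") from exc
    if not isinstance(verdict.get("matches_intent"), bool):
        raise ValueError("matches_intent must be a boolean")
    if not isinstance(verdict.get("issues"), list):
        raise ValueError("issues must be a list")
    return verdict
```

Treating an unparseable verdict as a failed review, and not as a pass, keeps a malformed reply from silently approving code.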

PR template for disclosure

## AI-Generated Code Disclosure

This PR includes AI-generated code from:
- [ ] Cursor
- [ ] Claude Code
- [ ] Copilot
- [ ] Other: ___

Tools used:
- Prompts available at: [link]

I have reviewed:
- [ ] Each generated section.
- [ ] Tests pass + coverage.
- [ ] No hallucinated APIs.
- [ ] Security implications.

🤔 Decision criteria

| AI-generated portion | Review level |
|---|---|
| < 20% | Standard |
| 20-50% | Enhanced (senior review) |
| > 50% | Strict (multiple reviewers) |
| Critical path | Always strict |
| Generated test | Verify edge cases |

Default: type check + lint + SAST + test + AI review + human review. AI-heavy PRs get enhanced review.
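
The percentage thresholds above translate directly into a routing function. A minimal sketch, assuming the AI-generated share of a PR is already known as a percentage; how that share is measured is left open here, and `review_level` is an illustrative name.

```python
def review_level(ai_pct: float, critical_path: bool = False) -> str:
    """Map the AI-generated share of a PR to a review level."""
    if critical_path or ai_pct > 50:
        return "strict"      # multiple reviewers
    if ai_pct >= 20:
        return "enhanced"    # senior review
    return "standard"
```

The critical-path flag overrides the percentage, matching the "Always strict" row.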

🔗 Graph

🤖 LLM usage

When to use: teams adopting AI tools that need to hold a quality bar. When not to: individual hobby projects and throwaway scripts.

Anti-patterns

  • AI-generated code merged without review: production bugs.
  • No disclosure: hidden risk.
  • Trusting the AI's own tests: they share the same blind spots.
  • Shipping a hallucinated API: runtime errors.
  • Verifying with a single layer: whole defect classes are missed.

🧪 Verification / duplicates

🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-09 | Manual cleanup: 7 layers + code + decision criteria + disclosure |