---
id: wiki-2026-0508-ai-생성-코드-검증-ai-code-assurance
title: AI Code Assurance (AI 생성 코드 검증)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [AI Code Assurance, AI 생성 코드 검증, generated code review, vibe coding QA]
duplicate_of: none
source_trust_level: B
confidence_score: 0.85
verification_status: conceptual
tags: [ai-code-quality, sast, code-review, generated-code, devsecops, copilot-review, hallucination-detection]
raw_sources: []
last_reinforced: 2026-05-09
github_commit: pending
inferred_by: Claude Opus 4.7 (manual cleanup 2026-05-09)
---

# AI Code Assurance (AI 생성 코드 검증)

## 📌 One-line insight
> **AI-generated code has inconsistent quality, hallucinated APIs, and its own classes of vulnerability.** Put every PR through a hybrid of SAST, LLM-as-judge, and human review. **With vibe coding, trust ≠ verify.**

## 📖 Core concepts

### Risks of AI-generated code

#### 1. Inconsistent style
- Different output for each prompt.
- Ignores codebase conventions.
- Mixes patterns.

#### 2. Hallucinated API
- Non-existent functions.
- Deprecated APIs.
- Wrong package versions.

#### 3. Security vulnerability
- Known CWE / OWASP patterns.
- Outdated security practices.
- Reproduces prompt-injection-prone patterns.

#### 4. Subtle bug
- Off-by-one errors.
- Race conditions.
- Missed null checks.

#### 5. Over-engineered
- Unnecessary abstraction.
- Boilerplate.

#### 6. Under-tested
- Happy path only.
- Missed edge cases.

### Verification layers

#### Layer 1: Compile / type check
- Strict mode in TypeScript / Rust / Go.
- Catches many hallucinated symbols at build time.

#### Layer 2: Lint
- Enforces style.
- ESLint / clippy / Pylint.

#### Layer 3: SAST
- Security patterns.
- Snyk / Semgrep / Sonar.

#### Layer 4: Test
- Unit / integration.
- Coverage of the generated code.

#### Layer 5: AI review (CodeRabbit)
- First pass on every PR.
- Detects hallucinations.

#### Layer 6: Human review
- Logic / architecture.
- Critical paths.

#### Layer 7: Production monitoring
- Error rate.
- Anomalies.

→ Each layer catches a different defect class.
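
To illustrate the point that each layer reports a different kind of defect, here is a minimal sketch in Python. The two checks are deliberately trivial stand-ins for Layer 1 and Layer 2; the names `syntax_check`, `line_length_check`, `run_layers`, and `LAYERS` are made up for this example and do not come from any tool mentioned in this note.

```python
from typing import Callable

Check = Callable[[str], list[str]]

def syntax_check(source: str) -> list[str]:
    """Stand-in for Layer 1: does the code even compile?"""
    try:
        compile(source, "<generated>", "exec")
        return []
    except SyntaxError as exc:
        return [f"line {exc.lineno}: {exc.msg}"]

def line_length_check(source: str, limit: int = 100) -> list[str]:
    """Stand-in for Layer 2: one trivial style rule."""
    return [
        f"line {i}: longer than {limit} characters"
        for i, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]

def run_layers(source: str, layers: dict[str, Check]) -> dict[str, list[str]]:
    """Run every layer and keep findings grouped by the layer that raised them."""
    return {name: check(source) for name, check in layers.items()}

# Hypothetical layer registry; real layers would wrap a type checker, linter, SAST tool, etc.
LAYERS: dict[str, Check] = {"compile": syntax_check, "lint": line_length_check}
```

Running all layers instead of stopping at the first failure keeps the per-layer findings visible, which is what makes the "different defect class" claim checkable on real PRs.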

### Quality gate

#### Pre-commit
- Type check + lint + format.
- Runs locally on each developer's machine.

#### CI / PR
- Test pass.
- SAST clean.
- AI review approved.
- Coverage threshold.

#### Pre-deploy
- Integration test.
- Performance regression.
- Security scan.

#### Post-deploy
- Alerts / SLOs.
- Rollback plan.
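
The CI / PR gate above can be sketched as a single function that returns the reasons a PR is blocked. This is an illustrative Python sketch only: `PrChecks` and `gate_failures` are hypothetical names, and the 80% default is an assumption for illustration.

```python
from dataclasses import dataclass

@dataclass
class PrChecks:
    tests_passed: bool
    sast_findings: int        # number of open SAST findings
    ai_review_approved: bool
    coverage_pct: float

def gate_failures(checks: PrChecks, min_coverage: float = 80.0) -> list[str]:
    """Return every reason the PR fails the gate; an empty list means it passes."""
    failures = []
    if not checks.tests_passed:
        failures.append("tests failed")
    if checks.sast_findings > 0:
        failures.append(f"{checks.sast_findings} SAST finding(s)")
    if not checks.ai_review_approved:
        failures.append("AI review not approved")
    if checks.coverage_pct < min_coverage:
        failures.append(f"coverage {checks.coverage_pct:.1f}% < {min_coverage:.1f}%")
    return failures
```

Returning all failures at once, instead of the first one, saves the author a round trip per failing condition.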

### Specific checks

#### Hallucination detection
- Every import actually exists.
- Every function signature is real.
- Cross-reference against the documentation.

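```python
import ast
import importlib.util

def check_imports(code: str) -> list[str]:
    """Return imported modules that cannot be found in the current environment."""
    missing = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]  # absolute "from x import y"; relative imports are skipped
        else:
            continue
        for name in names:
            # find_spec locates the top-level package without executing it.
            # A real but uninstalled dependency is reported the same way as a hallucinated one.
            if importlib.util.find_spec(name.split(".")[0]) is None:
                missing.append(name)
    return missing
```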
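
The import check only covers module existence. For the "function signature is real" item, a rough sketch using the standard `inspect` module is shown below. `check_call` is a hypothetical helper; note that it imports the module (which runs its top-level code), so it should only be pointed at trusted, installed packages.

```python
import importlib
import inspect

def check_call(module_name: str, func_name: str, kwargs: list[str]) -> list[str]:
    """Report a missing module, a missing function, or unknown keyword arguments."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return [f"module not found: {module_name}"]
    func = getattr(module, func_name, None)
    if func is None:
        return [f"{module_name}.{func_name} does not exist"]
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        return []  # signature not introspectable (some C builtins); nothing to report
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []  # function accepts **kwargs, so any keyword is syntactically valid
    return [
        f"{module_name}.{func_name} has no parameter '{name}'"
        for name in kwargs
        if name not in params
    ]
```

For example, `check_call("shutil", "copy", ["overwrite"])` flags `overwrite`, a plausible-sounding keyword that `shutil.copy` does not accept.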

#### Security pattern
- SQL injection (string concat).
- XSS (HTML construction).
- Hardcoded secret.
- Unsafe deserialize.
- Prompt injection (LLM call concatenation).
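
Two of the patterns above (SQL built by string concatenation and hardcoded secrets) can be approximated with Python's `ast` module. This is a toy sketch, not a substitute for a SAST tool such as the ones listed in Layer 3: the `scan` function and the `SECRET_NAME` heuristic are assumptions for illustration, and both rules produce false positives and false negatives.

```python
import ast
import re

# Heuristic: variable names that look like credentials (illustrative only).
SECRET_NAME = re.compile(r"(password|secret|token|api_key)", re.IGNORECASE)

def scan(code: str) -> list[str]:
    findings = []
    for node in ast.walk(ast.parse(code)):
        # something.execute(<string built with + or an f-string>)
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "execute"
            and node.args
            and isinstance(node.args[0], (ast.BinOp, ast.JoinedStr))
        ):
            findings.append(f"line {node.lineno}: possible SQL injection")
        # NAME = "non-empty literal" where NAME looks like a credential
        if (
            isinstance(node, ast.Assign)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
            and node.value.value
        ):
            for target in node.targets:
                if isinstance(target, ast.Name) and SECRET_NAME.search(target.id):
                    findings.append(f"line {node.lineno}: possible hardcoded secret")
    return findings
```

A parameterized call such as `cur.execute("... WHERE id = %s", (uid,))` passes, because its first argument is a plain string constant.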

#### Test coverage
- Required coverage threshold (80%+ for new code).
- When the tests for generated code are themselves generated, review the tests too.
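
A threshold "for new code" means measuring only the lines a PR changed, not the whole repository. A minimal sketch of that calculation follows, assuming the changed lines (from the diff) and executed lines (from the coverage tool) have already been extracted into per-file sets; `new_code_coverage` is a hypothetical helper.

```python
def new_code_coverage(
    changed: dict[str, set[int]],
    executed: dict[str, set[int]],
) -> float:
    """Percentage of changed lines that were executed by the test suite."""
    total = sum(len(lines) for lines in changed.values())
    if total == 0:
        return 100.0  # nothing changed, nothing to cover
    covered = sum(
        len(lines & executed.get(path, set()))
        for path, lines in changed.items()
    )
    return 100.0 * covered / total
```

In practice the changed set should contain only executable lines; comments and blank lines would otherwise drag the figure down.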

### Organizational patterns

#### Mark AI-generated code explicitly
- Disclose in the PR description.
- Tag the commit message.
#### Stricter review
- Senior review for every AI-generated PR.
- Deep verification of the logic.

#### Snippet attribution
- License / source of Copilot suggestions.
- Track the origin of the code.

#### Prompts as code
- Commit prompts to git.
- Reproducibility.

### Metrics (DORA-like)
- Share of PRs containing AI-generated code (%).
- Acceptance rate of AI suggestions.
- AI-introduced bugs that escape to production.
- Reviewer time-to-review.
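
Three of these metrics can be computed from PR records alone; the suggestion acceptance rate needs editor telemetry and is left out. The sketch below assumes a hypothetical `PullRequest` record with the fields shown, which would have to be populated from the disclosure data and the incident tracker.

```python
from dataclasses import dataclass
from statistics import median

@dataclass
class PullRequest:
    ai_generated: bool
    escaped_bug: bool        # a bug from this PR reached production
    review_minutes: float

def ai_metrics(prs: list[PullRequest]) -> dict[str, float]:
    """Share of AI PRs, their production escape rate, and median review time."""
    ai = [pr for pr in prs if pr.ai_generated]
    if not ai:
        return {"ai_pr_share": 0.0, "ai_escape_rate": 0.0, "ai_median_review_minutes": 0.0}
    return {
        "ai_pr_share": len(ai) / len(prs),
        "ai_escape_rate": sum(pr.escaped_bug for pr in ai) / len(ai),
        "ai_median_review_minutes": median(pr.review_minutes for pr in ai),
    }
```

The median is used for review time because a few very long reviews would distort a mean.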

## 💻 Code

### CI workflow (GitHub Actions)
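```yaml
# .github/workflows/ai-code-check.yml
name: ai-code-check
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci

      # Layer 1: type check
      - run: npm run typecheck

      # Layer 2: lint
      - run: npm run lint

      # Layer 3: security (SAST); needs a SNYK_TOKEN repository secret
      - uses: snyk/actions/setup@master
      - run: snyk code test
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}

      # Layer 4: test
      - run: npm test -- --coverage
      - uses: codecov/codecov-action@v3

      # Layer 5: AI review (CodeRabbit runs on the PR on its own, outside this workflow)

      # Quality gate: fail below 80% line coverage.
      # Assumes the test runner writes an Istanbul json-summary report to
      # coverage/coverage-summary.json; adjust the path and key for your setup.
      # jq compares the value as a number, so fractional percentages work.
      - run: jq -e '.total.lines.pct >= 80' coverage/coverage-summary.json
```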

### Hallucination check (TS / npm)
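```ts
import * as fs from 'fs';
import * as ts from 'typescript';

// Flags bare module specifiers that cannot be resolved from the current project.
function checkImports(filePath: string): string[] {
  // Parsing a single file is enough here; no full Program is needed.
  const sourceFile = ts.createSourceFile(
    filePath,
    fs.readFileSync(filePath, 'utf8'),
    ts.ScriptTarget.Latest,
  );
  const issues: string[] = [];

  ts.forEachChild(sourceFile, (node) => {
    if (ts.isImportDeclaration(node) && ts.isStringLiteral(node.moduleSpecifier)) {
      const moduleName = node.moduleSpecifier.text;
      // Relative imports are the type checker's job (Layer 1); skip them here.
      if (moduleName.startsWith('.')) return;
      try {
        require.resolve(moduleName, { paths: [process.cwd()] });
      } catch {
        // Also fires for real packages that are not installed or are types-only.
        issues.push(`Unresolved (possibly hallucinated): ${moduleName}`);
      }
    }
  });

  return issues;
}
```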

### LLM-as-judge (verify generated code)
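```python
import json

def verify_generated(code: str, intent: str) -> dict:
    # judge_llm is a placeholder for whatever LLM client the team uses;
    # it is assumed to expose complete(prompt) -> str.
    # The code is wrapped in <code> tags so the prompt needs no nested fences.
    prompt = f"""
You are a code reviewer. Verify the AI-generated code.

Intent: {intent}

Code:
<code>
{code}
</code>

Check:
1. Does it match the intent?
2. Any hallucinated API/import?
3. Security issues?
4. Edge cases missing?
5. Style consistent?

Output JSON only: {{"matches_intent": bool, "issues": [{{...}}]}}
"""
    return json.loads(judge_llm.complete(prompt))
```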
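
Judge models do not always return clean JSON, so calling `json.loads` on the raw response can crash the pipeline or accept a malformed verdict. A defensive parsing step might look like the sketch below. `parse_verdict` and the `detail` key are assumptions for this example; the design choice is to fail closed, treating anything unparseable as a failed review.

```python
import json

def parse_verdict(raw: str) -> dict:
    """Parse a judge response; treat anything malformed as a failed review."""
    failed = {"matches_intent": False, "issues": [{"detail": "unparseable judge output"}]}
    # Tolerate prose around the JSON object by slicing from the first "{" to the last "}".
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return failed
    try:
        verdict = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return failed
    # Enforce the two fields the prompt asks for, with the expected types.
    if not isinstance(verdict.get("matches_intent"), bool):
        return failed
    if not isinstance(verdict.get("issues"), list):
        return failed
    return verdict
```

Failing closed means a flaky judge blocks a PR and hands it to a human instead of silently approving it.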

### PR template for disclosure
```markdown
## AI-Generated Code Disclosure

This PR includes AI-generated code from:
- [ ] Cursor
- [ ] Claude Code
- [ ] Copilot
- [ ] Other: ___

Prompts:
- Available at: [link]

I have reviewed:
- [ ] Each generated section.
- [ ] Tests pass + coverage.
- [ ] No hallucinated APIs.
- [ ] Security implications.
```

## 🤔 Decision criteria

| AI-generated portion | Review level |
|---|---|
| < 20% | Standard |
| 20-50% | Enhanced (senior review) |
| > 50% | Strict (multiple reviewers) |
| Critical path | Always strict |
| Generated test | Verify edge cases |

**Default**: type check + lint + SAST + test + AI review + human review. AI-heavy PRs get enhanced review.
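
The table can be encoded directly, for example to label PRs automatically. This is a sketch under the assumption that the AI-generated share of a PR is available as a fraction between 0 and 1 (from disclosure or tooling); `review_level` is a hypothetical helper, and the "Generated test" row is left out because it describes what to check, not a review level.

```python
def review_level(ai_fraction: float, critical_path: bool = False) -> str:
    """Map the AI-generated share of a PR to the review level in the table."""
    if not 0.0 <= ai_fraction <= 1.0:
        raise ValueError("ai_fraction must be between 0 and 1")
    if critical_path or ai_fraction > 0.5:
        return "strict"      # multiple reviewers
    if ai_fraction >= 0.2:
        return "enhanced"    # senior review
    return "standard"
```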

## 🔗 Graph
- Parent: [[AI_코드_리뷰]] · [[CI/CD Pipeline & IDE Security Integration|DevSecOps]] · [[Code-Quality]]
- Variants: [[SAST]] · [[LLM-as-Judge]]
- Applications: [[CodeRabbit]] · [[Snyk-Code]] · [[Sonar]]
- Adjacent: [[Code Agent — Devin / Cursor / Claude Code]]

## 🤖 LLM usage
**When**: a team is adopting AI coding tools and needs to keep quality under control.
**When not**: individual hobby projects or throwaway scripts.

## ❌ Anti-patterns
- **AI-generated + skipped review**: production bugs.
- **No disclosure**: hidden risk.
- **Trusting the AI's own tests**: they share the code's blind spots.
- **Shipping a hallucinated API**: runtime errors.
- **Verifying AI code with a single layer**: whole defect classes are missed.

## 🧪 Verification / duplicates
- Verified (concept).
- Trust level: B.
- Related: [[AI_코드_리뷰]], [[AI-Powered Code Analysis Tools]].

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-09 | Manual cleanup: 7 layers + code + decision criteria + disclosure |