---
id: wiki-2026-0508-ai-생성-코드-검증-ai-code-assurance
title: AI Code Assurance (AI 생성 코드 검증)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [AI Code Assurance, AI 생성 코드 검증, generated code review, vibe coding QA]
duplicate_of: none
source_trust_level: B
confidence_score: 0.85
verification_status: conceptual
tags: [ai-code-quality, sast, code-review, generated-code, devsecops, copilot-review, hallucination-detection]
raw_sources: []
last_reinforced: 2026-05-09
github_commit: pending
inferred_by: Claude Opus 4.7 (manual cleanup 2026-05-09)
---
# AI Code Assurance (AI-generated code verification)
## 📌 One-line insight
> **AI-generated code has inconsistent quality, hallucinated APIs, and its own distinctive vulnerabilities.** Every PR gets a hybrid of SAST, LLM-as-judge, and human review. **In vibe coding, trust is not verification.**
## 📖 Core
### Risks of AI-generated code
#### 1. Inconsistent style
- Different output for every prompt.
- Ignores codebase conventions.
- Mixes patterns.
#### 2. Hallucinated API
- Non-existent functions.
- Deprecated APIs.
- Wrong package versions.
#### 3. Security vulnerability
- Known CWE / OWASP patterns.
- Outdated security practices.
- Reproduces prompt-injection-prone patterns.
#### 4. Subtle bug
- Off-by-one errors.
- Race conditions.
- Missed null checks.
#### 5. Over-engineered
- Unnecessary abstraction.
- Boilerplate.
#### 6. Under-tested
- Happy path only.
- Missed edge cases.
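As a hypothetical illustration of how the "subtle bug" and "under-tested" risks combine, the sketch below shows an off-by-one pagination helper that a single happy-path test would accept. Both function names are made up for this example.

```python
def page_count_buggy(total_items: int, page_size: int) -> int:
    # Off-by-one: always adds a page, even when items divide evenly
    return total_items // page_size + 1


def page_count(total_items: int, page_size: int) -> int:
    # Ceiling division without floats
    return (total_items + page_size - 1) // page_size


# Happy path: both versions agree, so a single test would pass
assert page_count_buggy(25, 10) == page_count(25, 10) == 3
# Edge cases expose the bug
assert page_count(20, 10) == 2 and page_count_buggy(20, 10) == 3
assert page_count(0, 10) == 0 and page_count_buggy(0, 10) == 1
```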
### Verification layers
#### Layer 1: Compile / type check
- Strict compiler checks in TypeScript / Rust / Go.
- Catches hallucinated APIs.
#### Layer 2: Lint
- Enforces style.
- ESLint / clippy / Pylint.
#### Layer 3: SAST
- Security patterns.
- Snyk / Semgrep / Sonar.
#### Layer 4: Test
- Unit / integration.
- Coverage of generated code.
#### Layer 5: AI review (CodeRabbit)
- First pass on every PR.
- Detects hallucinations.
#### Layer 6: Human review
- Logic / architecture.
- Critical paths.
#### Layer 7: Production monitoring
- Error rate.
- Anomalies.
→ Each layer catches a different defect class.
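The layering idea can be sketched as an ordered list of checks, cheapest first, where a later layer only runs once the earlier ones are clean. This is a minimal sketch, not a real toolchain: `syntax_layer` and `style_layer` are toy stand-ins for Layers 1 and 2, and all names are assumptions made for the example.

```python
from typing import Callable

Check = Callable[[str], list[str]]


def syntax_layer(code: str) -> list[str]:
    # Stand-in for Layer 1: does the code even compile?
    try:
        compile(code, "<generated>", "exec")
        return []
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]


def style_layer(code: str) -> list[str]:
    # Stand-in for Layer 2: one toy lint rule (line length)
    return [
        f"line {number} exceeds 100 characters"
        for number, line in enumerate(code.splitlines(), start=1)
        if len(line) > 100
    ]


def run_layers(code: str, layers: list[tuple[str, Check]]) -> dict[str, list[str]]:
    """Run layers in order and stop at the first one that reports findings."""
    report: dict[str, list[str]] = {}
    for name, check in layers:
        report[name] = check(code)
        if report[name]:
            break  # later layers are skipped until this one is clean
    return report


LAYERS = [("compile", syntax_layer), ("lint", style_layer)]
```

Stopping at the first failing layer keeps feedback fast; a team that prefers a full report per PR could drop the `break`.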
### Quality gate
#### Pre-commit
- Type check + lint + format.
- Runs locally on each developer's machine.
#### CI / PR
- Test pass.
- SAST clean.
- AI review approved.
- Coverage threshold.
#### Pre-deploy
- Integration test.
- Performance regression.
- Security scan.
#### Post-deploy
- Alerts / SLOs.
- Rollback plan.
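The gates above can be expressed as data plus one small evaluator. A minimal sketch, assuming each check reports a boolean; the check names in `REQUIRED` are labels invented for this example, and a check that never reported counts as failed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    stage: str
    passed: bool
    missing: tuple[str, ...]


# Required checks per stage, taken from the lists above (names are illustrative)
REQUIRED = {
    "pre-commit": ("typecheck", "lint", "format"),
    "ci": ("tests", "sast", "ai_review", "coverage"),
    "pre-deploy": ("integration", "performance", "security_scan"),
}


def evaluate_gate(stage: str, results: dict[str, bool]) -> GateResult:
    """A check that is absent from `results` counts as failed (fail closed)."""
    missing = tuple(name for name in REQUIRED[stage] if not results.get(name, False))
    return GateResult(stage=stage, passed=not missing, missing=missing)
```

For example, `evaluate_gate("ci", {"tests": True, "sast": True, "ai_review": True})` fails the gate because the coverage check never reported.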
### Specific checks
#### Hallucination detection
- Every import actually exists.
- Every function signature is real.
- Cross-reference against documentation.
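```python
import ast
import importlib.util

def check_imports(code: str) -> list[str]:
    """Print and return imported modules that cannot be found in this environment."""
    missing = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        for name in names:
            # Look up the top-level package without executing it
            if importlib.util.find_spec(name.split(".")[0]) is None:
                print(f"Hallucinated import: {name}")
                missing.append(name)
    return missing
```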
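The second bullet, checking that a function signature is real, can be sketched with `inspect.signature`. This is a rough sketch under two assumptions: the module is safe to import (it is executed, so run this in a sandbox), and the call site has already been reduced to a module name, a function name, and its arguments. `call_is_plausible` is a hypothetical helper name.

```python
import importlib
import inspect


def call_is_plausible(module_name: str, func_name: str, *args, **kwargs) -> bool:
    """Return True if module.func exists and accepts the given arguments."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    func = getattr(module, func_name, None)
    if not callable(func):
        return False
    try:
        # bind() checks arity and keyword names without calling the function
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError:
        return False  # wrong arity or unknown keyword
    except ValueError:
        return True  # some C builtins expose no signature; cannot verify
    return True
```

Functions that accept `**kwargs` will pass any keyword, so this catches only part of the problem; the documentation cross-reference remains necessary.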
#### Security pattern
- SQL injection (string concat).
- XSS (HTML construction).
- Hardcoded secret.
- Unsafe deserialize.
- Prompt injection (LLM call concatenation).
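Two of these patterns, SQL built by string concatenation and hardcoded secrets, can be approximated with the standard `ast` module. The sketch below uses toy rules chosen for illustration (the `SECRET_NAME` regex and the `.execute()` heuristic are assumptions) and is not a substitute for a real SAST tool.

```python
import ast
import re

SECRET_NAME = re.compile(r"(password|secret|token|api_?key)", re.IGNORECASE)


def scan_security(code: str) -> list[str]:
    findings = []
    for node in ast.walk(ast.parse(code)):
        # SQL built by concatenation, % formatting, or an f-string and passed to .execute()
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "execute"
            and node.args
            and isinstance(node.args[0], (ast.BinOp, ast.JoinedStr))
        ):
            findings.append(f"line {node.lineno}: possible SQL injection")
        # Non-empty string literal assigned to a secret-looking variable name
        if (
            isinstance(node, ast.Assign)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
            and node.value.value
        ):
            for target in node.targets:
                if isinstance(target, ast.Name) and SECRET_NAME.search(target.id):
                    findings.append(f"line {node.lineno}: possible hardcoded secret")
    return findings
```

A parameterized query such as `cur.execute("... WHERE id = ?", (user_id,))` passes, because its first argument is a plain string constant.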
#### Test coverage
- Required coverage threshold (80%+ for new code).
- Tests for generated code are often generated too, so review them as well.
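The "80%+ for new code" rule applies to changed lines only, not to the whole file. A minimal sketch, assuming the changed line numbers come from the diff and the executed line numbers from a coverage report; how those sets are extracted is tool-specific and left out here.

```python
def new_code_coverage(changed_lines: set[int], executed_lines: set[int]) -> float:
    """Percentage of changed lines that the test run executed."""
    if not changed_lines:
        return 100.0  # nothing new to cover
    return 100.0 * len(changed_lines & executed_lines) / len(changed_lines)


def meets_threshold(
    changed_lines: set[int], executed_lines: set[int], threshold: float = 80.0
) -> bool:
    return new_code_coverage(changed_lines, executed_lines) >= threshold
```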
### Organizational patterns
#### Disclose AI-generated code
- Disclose in the PR description.
- Tag the commit message.
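One way to make the commit tag machine-readable is a Git trailer. The sketch below assumes a trailer named `AI-Assisted`, which is a hypothetical convention for this example and not a Git or GitHub standard.

```python
import re

# "AI-Assisted" is a hypothetical trailer name; pick whatever the team agrees on
TRAILER = re.compile(r"^AI-Assisted:[ \t]*(?P<tool>.+)$", re.IGNORECASE | re.MULTILINE)


def ai_tools_from_commit(message: str) -> list[str]:
    """Return the tools named in AI-Assisted trailers of a commit message."""
    return [match.group("tool").strip() for match in TRAILER.finditer(message)]
```

An empty result means the commit carries no disclosure, which a CI check could compare against the PR description.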
#### Stricter review
- Senior review for every AI-generated PR.
- Deep verification of logic.
#### Snippet attribution
- License / source of Copilot output.
- Track the origin of code.
#### Prompts as code
- Commit prompts to git.
- Reproducibility.
### Metrics (DORA-like)
- Share of PRs that are AI-generated.
- Accept rate of AI suggestions.
- AI-introduced bugs that escape to production.
- Reviewer time-to-review.
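These four metrics can be computed from per-PR records. A minimal sketch with an assumed record shape: the `PullRequest` fields are invented for this example, and a real pipeline would fill them from the Git host and the issue tracker.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    ai_generated: bool
    suggestions_shown: int = 0
    suggestions_accepted: int = 0
    escaped_bugs: int = 0  # bugs traced back to this PR after release
    review_hours: float = 0.0


def summarize(prs: list[PullRequest]) -> dict[str, float]:
    ai_prs = [pr for pr in prs if pr.ai_generated]
    shown = sum(pr.suggestions_shown for pr in prs)
    accepted = sum(pr.suggestions_accepted for pr in prs)
    escaped = sum(pr.escaped_bugs for pr in ai_prs)
    hours = sum(pr.review_hours for pr in prs)
    return {
        "ai_pr_share": len(ai_prs) / len(prs) if prs else 0.0,
        "suggestion_accept_rate": accepted / shown if shown else 0.0,
        "escaped_bugs_per_ai_pr": escaped / len(ai_prs) if ai_prs else 0.0,
        "mean_review_hours": hours / len(prs) if prs else 0.0,
    }
```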
## 💻 Code
### CI workflow (GitHub Actions)
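```yaml
# .github/workflows/ai-code-check.yml
name: ai-code-check
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      # Layer 1: type
      - run: npm run typecheck
      # Layer 2: lint
      - run: npm run lint
      # Layer 3: security (assumes a SNYK_TOKEN repository secret)
      - uses: snyk/actions/setup@master
      - run: snyk code test
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
      # Layer 4: test
      - run: npm test -- --coverage
      - uses: codecov/codecov-action@v3
      # Layer 5: AI review (CodeRabbit auto-runs)
      # Quality gate: assumes the test run writes coverage.json with a numeric .coverage field
      - run: |
          coverage=$(jq -r '.coverage' coverage.json)
          # Compare inside jq because bash's -lt only handles integers
          if jq -e '.coverage < 80' coverage.json > /dev/null; then
            echo "Coverage ${coverage}% is below 80%"
            exit 1
          fi
```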
### Hallucination check (TS / npm)
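```ts
import * as fs from 'fs';
import * as path from 'path';
import * as ts from 'typescript';

function checkImports(filePath: string): string[] {
  // Parse only; no type checking is needed to read import declarations
  const sourceFile = ts.createSourceFile(
    filePath,
    fs.readFileSync(filePath, 'utf8'),
    ts.ScriptTarget.Latest,
  );
  const issues: string[] = [];
  ts.forEachChild(sourceFile, (node) => {
    if (ts.isImportDeclaration(node) && ts.isStringLiteral(node.moduleSpecifier)) {
      const moduleName = node.moduleSpecifier.text;
      // Relative imports are left to the type checker; only package specifiers are resolved here
      if (moduleName.startsWith('.')) return;
      try {
        // Resolve from the file's own directory so nested node_modules are found
        require.resolve(moduleName, { paths: [path.dirname(path.resolve(filePath))] });
      } catch {
        issues.push(`Hallucinated: ${moduleName}`);
      }
    }
  });
  return issues;
}
```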
### LLM-as-judge (verify generated code)
```python
import json

FENCE = "`" * 3  # built at runtime so this snippet does not close its own Markdown fence

def verify_generated(code: str, intent: str) -> dict:
    prompt = f"""
You are a code reviewer. Verify the AI-generated code.
Intent: {intent}
Code:
{FENCE}
{code}
{FENCE}
Check:
1. Does it match intent?
2. Any hallucinated API/import?
3. Security issues?
4. Edge cases missing?
5. Style consistent?
Output JSON: {{"matches_intent": bool, "issues": [{{...}}]}}
"""
    # judge_llm is a placeholder for whatever LLM client the team uses
    return json.loads(judge_llm.complete(prompt))
```
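The judge returns free text, so `json.loads` can raise or yield the wrong shape. A minimal sketch of a defensive parser, assuming only the two top-level keys named in the prompt; `parse_verdict` is a hypothetical helper, and a `None` result is meant to be treated as "not approved".

```python
import json
from typing import Optional


def parse_verdict(raw: str) -> Optional[dict]:
    """Return the judge verdict, or None if the output is not usable."""
    try:
        verdict = json.loads(raw)
    except json.JSONDecodeError:
        return None  # the model wrapped the JSON in prose, or produced none
    if not isinstance(verdict, dict):
        return None
    if not isinstance(verdict.get("matches_intent"), bool):
        return None
    if not isinstance(verdict.get("issues"), list):
        return None
    return verdict
```

Failing closed here keeps a malformed judge response from being read as an approval.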
### PR template for disclosure
```markdown
## AI-Generated Code Disclosure
This PR includes AI-generated code from:
- [ ] Cursor
- [ ] Claude Code
- [ ] Copilot
- [ ] Other: ___
Tools used:
- Prompts available at: [link]
I have reviewed:
- [ ] Each generated section.
- [ ] Tests pass + coverage.
- [ ] No hallucinated APIs.
- [ ] Security implications.
```
## 🤔 Decision criteria
| AI-generated portion | Review level |
|---|---|
| < 20% | Standard |
| 20-50% | Enhanced (senior review) |
| > 50% | Strict (multiple reviewers) |
| Critical path | Always strict |
| Generated test | Verify edge cases |
**Default**: Type check + lint + SAST + test + AI review + human review. AI-heavy PRs get enhanced review.
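The table can be encoded as a small function so a bot can label PRs. A sketch under one assumption: the AI-generated share is already known as a fraction between 0.0 and 1.0, and how it is measured is left open. `review_level` is a hypothetical name.

```python
def review_level(ai_fraction: float, critical_path: bool = False) -> str:
    """Map the AI-generated share of a PR (0.0 to 1.0) to a review level."""
    if not 0.0 <= ai_fraction <= 1.0:
        raise ValueError("ai_fraction must be between 0.0 and 1.0")
    if critical_path or ai_fraction > 0.5:
        return "strict"  # multiple reviewers
    if ai_fraction >= 0.2:
        return "enhanced"  # senior review
    return "standard"
```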
## 🔗 Graph
- Parent: [[AI_코드_리뷰]] · [[CI/CD Pipeline & IDE Security Integration|DevSecOps]] · [[Code-Quality]]
- Variants: [[SAST]] · [[LLM-as-Judge]]
- Applications: [[CodeRabbit]] · [[Snyk-Code]] · [[Sonar]]
- Adjacent: [[Code Agent — Devin / Cursor / Claude Code]]
## 🤖 LLM usage
**When**: A team is adopting AI coding tools and needs to keep quality in check.
**When not**: Individual hobby projects. Throwaway scripts.
## ❌ Anti-patterns
- **AI-generated + skip review**: production bugs.
- **No disclosure**: hidden risk.
- **Trusting the AI's own tests**: same blind spots.
- **Shipping a hallucinated API**: runtime errors.
- **Single-layer verification of AI code**: misses whole defect classes.
## 🧪 Verification / duplicates
- Verified (concept).
- Trust level: B.
- Related: [[AI_코드_리뷰]], [[AI-Powered Code Analysis Tools]].
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-09 | Manual cleanup: 7 layers + code + decision criteria + disclosure |