---
id: wiki-2026-0508-ai-생성-코드-검증-ai-code-assurance
title: AI Code Assurance (AI 생성 코드 검증)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [AI Code Assurance, AI 생성 코드 검증, generated code review, vibe coding QA]
duplicate_of: none
source_trust_level: B
confidence_score: 0.85
verification_status: conceptual
tags: [ai-code-quality, sast, code-review, generated-code, devsecops, copilot-review, hallucination-detection]
raw_sources: []
last_reinforced: 2026-05-09
github_commit: pending
inferred_by: Claude Opus 4.7 (manual cleanup 2026-05-09)
---
# AI Code Assurance (AI-Generated Code Verification)
## 📌 One-line insight
> **AI-generated code has inconsistent quality, hallucinated APIs, and its own distinctive vulnerabilities.** Verify every PR with a hybrid of SAST, LLM-as-judge, and human review. **In vibe coding, trust ≠ verify.**
## 📖 Core
### Risks of AI-generated code
#### 1. Inconsistent style
- Output differs from prompt to prompt.
- Codebase conventions are ignored.
- Patterns get mixed.
#### 2. Hallucinated API
- Non-existent functions.
- Deprecated APIs.
- Wrong package versions.
#### 3. Security vulnerability
- CWE / OWASP patterns.
- Outdated security practices.
- Prompt-injection-prone patterns get reproduced.
#### 4. Subtle bug
- Off-by-one errors.
- Race conditions.
- Missing null checks.
#### 5. Over-engineered
- Unnecessary abstraction.
- Boilerplate.
#### 6. Under-tested
- Happy path only.
- Edge cases are missed.
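To make the "subtle bug" and "happy path only" risks concrete, here is a minimal sketch. Both `last_n_buggy` and `last_n_fixed` are hypothetical helpers invented for illustration; the point is only that a happy-path test passes on both, while an edge-case test separates them.

```python
def last_n_buggy(items: list, n: int) -> list:
    """Return the last n items (hypothetical generated code)."""
    # Subtle bug: for n == 0 the slice is items[-0:], which equals items[0:],
    # so the whole list comes back instead of an empty one.
    return items[-n:]


def last_n_fixed(items: list, n: int) -> list:
    """Return the last n items, treating n <= 0 as 'nothing'."""
    if n <= 0:
        return []
    return items[-n:]


# Happy path: both versions look correct.
assert last_n_buggy([1, 2, 3], 2) == [2, 3]
assert last_n_fixed([1, 2, 3], 2) == [2, 3]

# Edge case: only the fixed version behaves as intended.
assert last_n_buggy([1, 2, 3], 0) == [1, 2, 3]  # wrong result, silently
assert last_n_fixed([1, 2, 3], 0) == []
```

A test suite generated alongside the code tends to contain only the first pair of assertions, which is why edge cases deserve explicit review.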
### Verification layers
#### Layer 1: Compile / type check
- Strict compiler settings (TypeScript / Rust / Go).
- Catches hallucinated symbols.
#### Layer 2: Lint
- Enforces style.
- ESLint / clippy / Pylint.
#### Layer 3: SAST
- Security patterns.
- Snyk / Semgrep / Sonar.
#### Layer 4: Test
- Unit / integration.
- Coverage of the generated code.
#### Layer 5: AI review (CodeRabbit)
- First pass on every PR.
- Detects hallucinations.
#### Layer 6: Human review
- Logic / architecture.
- Critical paths.
#### Layer 7: Production monitoring
- Error rate.
- Anomalies.
→ Each layer catches a different defect class.
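The layering idea can be sketched as a small runner that executes every check and records which layer flagged what. This is an illustration only: `run_layers`, `LayerResult`, and the two stand-in checks are hypothetical names, and real layers would call out to a compiler, a linter, or a SAST tool rather than inspect strings.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A check takes source code and returns (passed, detail).
Check = Callable[[str], Tuple[bool, str]]


@dataclass
class LayerResult:
    layer: str
    passed: bool
    detail: str = ""


def run_layers(code: str, layers: List[Tuple[str, Check]]) -> List[LayerResult]:
    """Run every layer, without stopping at the first failure,
    because each layer targets a different defect class."""
    return [LayerResult(name, *check(code)) for name, check in layers]


def syntax_check(code: str) -> Tuple[bool, str]:
    """Stand-in for Layer 1: does the code compile at all?"""
    try:
        compile(code, "<generated>", "exec")
        return True, ""
    except SyntaxError as exc:
        return False, f"syntax error: {exc.msg}"


def no_eval_check(code: str) -> Tuple[bool, str]:
    """Crude stand-in for a Layer 3 SAST rule."""
    found = "eval(" in code
    return (not found, "eval() call found" if found else "")


results = run_layers(
    "x = eval('1 + 1')",
    [("compile", syntax_check), ("sast", no_eval_check)],
)
for r in results:
    print(r.layer, "ok" if r.passed else f"FAIL ({r.detail})")
```

The sample compiles cleanly yet fails the security rule, which is the case the layering is meant to cover.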
### Quality gate
#### Pre-commit
- Type check + lint + format.
- Runs locally on each developer's machine.
#### CI / PR
- Test pass.
- SAST clean.
- AI review approved.
- Coverage threshold.
#### Pre-deploy
- Integration test.
- Performance regression.
- Security scan.
#### Post-deploy
- Alerts / SLOs.
- Rollback plan.
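One way to encode the gates is a table of required checks per stage, evaluated against whatever results the pipeline has collected. The check names and the `GATES` / `gate_failures` identifiers below are assumptions made for this sketch; the post-deploy stage is left out because it is about alerting and rollback rather than a pass/fail decision.

```python
from typing import Dict, List

# Required checks per stage, mirroring the lists above (names are illustrative).
GATES: Dict[str, List[str]] = {
    "pre-commit": ["typecheck", "lint", "format"],
    "ci": ["tests", "sast", "ai_review", "coverage"],
    "pre-deploy": ["integration", "perf_regression", "security_scan"],
}


def gate_failures(stage: str, results: Dict[str, bool]) -> List[str]:
    """Return the required checks that failed or never reported.

    A missing result counts as a failure, so a skipped check
    cannot silently open the gate.
    """
    return [name for name in GATES[stage] if not results.get(name, False)]


ci_results = {"tests": True, "sast": True, "ai_review": True}
print(gate_failures("ci", ci_results))  # coverage never reported
```

Treating "not reported" the same as "failed" is a deliberate choice here: it keeps a misconfigured pipeline from passing by omission.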
### Specific checks
#### Hallucination detection
- Every import actually exists.
- Every function signature is real.
- Cross-reference against the documentation.
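```python
import ast
import importlib

def check_imports(code: str) -> list:
    # Collect module names that cannot be imported in the current environment.
    missing = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]  # absolute "from x import y"
        else:
            continue
        for name in names:
            try:
                importlib.import_module(name)
            except ImportError:
                print(f"Hallucinated import: {name}")
                missing.append(name)
    return missing
```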
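The "function signature is real" check can be sketched with the standard `inspect` module. `check_call` is a hypothetical helper, not part of any tool named in this note. It imports the target module, so it should only run against trusted or sandboxed dependencies.

```python
import importlib
import inspect
from typing import List


def check_call(module_name: str, func_name: str, keywords: List[str]) -> List[str]:
    """Report a missing module, a missing function, or unknown keyword arguments."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return [f"module not found: {module_name}"]
    if not hasattr(module, func_name):
        return [f"{module_name}.{func_name} does not exist"]
    try:
        params = inspect.signature(getattr(module, func_name)).parameters
    except (TypeError, ValueError):
        # Some C-implemented callables expose no signature; nothing to verify.
        return []
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        # The function accepts **kwargs, so any keyword is syntactically valid.
        return []
    return [
        f"{module_name}.{func_name} has no parameter '{kw}'"
        for kw in keywords
        if kw not in params
    ]


print(check_call("textwrap", "indent", ["prefix"]))  # real function, real keyword
print(check_call("textwrap", "indent", ["width"]))   # real function, invented keyword
print(check_call("textwrap", "reindent", []))        # invented function
```

Functions that take `**kwargs` cannot be checked this way, which is one reason the documentation cross-reference remains a separate step.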
#### Security pattern
- SQL injection (string concat).
- XSS (HTML construction).
- Hardcoded secret.
- Unsafe deserialization.
- Prompt injection (untrusted input concatenated into an LLM call).
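Two of these patterns are simple enough to sketch with the standard `ast` module. This is a toy illustration of what a SAST rule looks for, not a substitute for Snyk or Semgrep; the `scan` function and the `SECRET_NAME` heuristic are assumptions made for the example.

```python
import ast
import re
from typing import List

# Variable names that suggest a credential (heuristic, illustrative only).
SECRET_NAME = re.compile(r"(password|secret|api_key|token)", re.IGNORECASE)


def scan(code: str) -> List[str]:
    """Flag SQL built from dynamic strings and string literals assigned to secret-like names."""
    findings = []
    for node in ast.walk(ast.parse(code)):
        # .execute() whose first argument is a concatenation, % format, or f-string
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "execute"
            and node.args
            and isinstance(node.args[0], (ast.BinOp, ast.JoinedStr))
        ):
            findings.append(f"line {node.lineno}: SQL built from a dynamic string")
        # NAME = "literal" where NAME looks like a credential
        if (
            isinstance(node, ast.Assign)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
            and node.value.value
            and any(
                isinstance(t, ast.Name) and SECRET_NAME.search(t.id)
                for t in node.targets
            )
        ):
            findings.append(f"line {node.lineno}: possible hardcoded secret")
    return findings


sample = (
    'API_KEY = "sk-123"\n'
    'cur.execute("SELECT * FROM t WHERE id=" + uid)\n'
    'cur.execute("SELECT * FROM t WHERE id=?", (uid,))\n'
)
for finding in scan(sample):
    print(finding)
```

The parameterized query on the third line is not flagged, because its first argument is a plain string literal.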
#### Test coverage
- Required coverage threshold (80%+ for new code).
- Tests for generated code are often generated too, so review them as well.
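A threshold "for new code" is computed over the lines a PR changed, not over the whole file. A minimal sketch, assuming the changed and covered line numbers have already been extracted from the diff and the coverage report (the function names are hypothetical):

```python
from typing import Set


def new_code_coverage(changed_lines: Set[int], covered_lines: Set[int]) -> float:
    """Percentage of changed lines that were executed by the tests."""
    if not changed_lines:
        return 100.0  # nothing new to cover
    return 100.0 * len(changed_lines & covered_lines) / len(changed_lines)


def meets_threshold(changed_lines: Set[int], covered_lines: Set[int], minimum: float = 80.0) -> bool:
    return new_code_coverage(changed_lines, covered_lines) >= minimum


changed = {10, 11, 12, 13, 14}
covered = {1, 2, 10, 11, 12, 13}
print(new_code_coverage(changed, covered))  # 4 of the 5 changed lines are covered
```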
### Organizational patterns
#### Disclose AI-generated code
- Disclose in the PR description.
- Tag the commit message.
#### Stricter review
- Senior review for every AI-generated PR.
- Deep verification of the logic.
#### Snippet attribution
- License / source of Copilot suggestions.
- Track the origin of the code.
#### Prompts as code
- Commit prompts to git.
- Reproducibility.
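One possible shape for "prompts as code" is a small record that can be written to a file and committed next to the change it produced. The field names and the `prompt_record` helper are assumptions for this sketch; the content hash simply gives reviewers a stable ID to cite in a PR.

```python
import hashlib
import json


def prompt_record(prompt: str, model: str, params: dict) -> dict:
    """Build a committable record whose ID is derived from its content."""
    body = {"prompt": prompt, "model": model, "params": params}
    # sort_keys makes the serialization, and therefore the hash, deterministic.
    payload = json.dumps(body, sort_keys=True, ensure_ascii=False)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"id": digest[:12], **body}


record = prompt_record(
    "Write a CSV parser that handles quoted fields",
    "example-model",  # placeholder model name
    {"temperature": 0},
)
print(json.dumps(record, indent=2))
```

The same prompt, model, and parameters always produce the same ID, so a changed ID in a diff signals that the prompt itself changed.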
### Metrics (DORA-like)
- Share of PRs that are AI-generated (%).
- Acceptance rate of AI suggestions.
- AI-introduced bugs that escape to production.
- Reviewer time-to-review.
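These metrics can be computed from per-PR records, as in the sketch below. The `PullRequest` fields are hypothetical; in practice they would come from the disclosure tag, the IDE or assistant telemetry, and the incident tracker.

```python
from dataclasses import dataclass
from statistics import median
from typing import List


@dataclass
class PullRequest:
    ai_generated: bool
    suggestions_shown: int
    suggestions_accepted: int
    escaped_bugs: int        # production bugs traced back to this PR
    review_minutes: float


def summarize(prs: List[PullRequest]) -> dict:
    """Aggregate the four metrics listed above over a set of PRs."""
    ai_prs = [p for p in prs if p.ai_generated]
    shown = sum(p.suggestions_shown for p in prs)
    accepted = sum(p.suggestions_accepted for p in prs)
    return {
        "ai_pr_share": len(ai_prs) / len(prs) if prs else 0.0,
        "accept_rate": accepted / shown if shown else 0.0,
        "ai_escaped_bugs": sum(p.escaped_bugs for p in ai_prs),
        "median_review_minutes": median(p.review_minutes for p in prs) if prs else 0.0,
    }


history = [
    PullRequest(True, 10, 4, 1, 30),
    PullRequest(False, 0, 0, 0, 10),
    PullRequest(True, 10, 6, 0, 50),
    PullRequest(False, 0, 0, 0, 20),
]
print(summarize(history))
```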
## 💻 Code
### CI workflow (GitHub Actions)
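```yaml
# .github/workflows/ai-code-check.yml
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Install dependencies so the npm scripts below can run
      - run: npm ci
      # Layer 1: type
      - run: npm run typecheck
      # Layer 2: lint
      - run: npm run lint
      # Layer 3: security (Snyk needs an auth token; the secret name is an assumption)
      - uses: snyk/actions/setup@master
      - run: snyk code test
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
      # Layer 4: test
      - run: npm test -- --coverage
      - uses: codecov/codecov-action@v3
      # Layer 5: AI review (CodeRabbit auto-runs)
      # Quality gate: assumes coverage.json has a numeric top-level "coverage" field.
      # jq -e exits non-zero when the expression is false, and handles decimals.
      - run: jq -e '.coverage >= 80' coverage.json > /dev/null
```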
### Hallucination check (TS / npm)
```ts
import * as ts from 'typescript';

// Returns one message per imported package that cannot be resolved from the project root.
function checkImports(filePath: string): string[] {
  const program = ts.createProgram([filePath], {});
  const sourceFile = program.getSourceFile(filePath);
  const issues: string[] = [];
  if (!sourceFile) {
    return [`Cannot read: ${filePath}`];
  }
  ts.forEachChild(sourceFile, (node) => {
    if (ts.isImportDeclaration(node) && ts.isStringLiteral(node.moduleSpecifier)) {
      const moduleName = node.moduleSpecifier.text;
      // Relative imports point at project files, not packages; the type checker covers those.
      if (moduleName.startsWith('.')) return;
      try {
        require.resolve(moduleName, { paths: [process.cwd()] });
      } catch {
        issues.push(`Hallucinated: ${moduleName}`);
      }
    }
  });
  return issues;
}
```
### LLM-as-judge (verify generated code)
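```python
import json

# judge_llm is assumed to be an already-configured LLM client
# exposing complete(prompt: str) -> str; it is not defined here.
def verify_generated(code: str, intent: str) -> dict:
    # The inner fence uses ~~~ so it does not collide with Markdown code fences.
    prompt = f"""
You are a code reviewer. Verify the AI-generated code.
Intent: {intent}
Code:
~~~
{code}
~~~
Check:
1. Does it match intent?
2. Any hallucinated API/import?
3. Security issues?
4. Edge cases missing?
5. Style consistent?
Output JSON: {{"matches_intent": bool, "issues": [{{...}}]}}
"""
    return json.loads(judge_llm.complete(prompt))
```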
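A judge model does not always return well-formed JSON, so the result is worth validating before it feeds a quality gate. The sketch below fails closed: anything unparseable is treated as "does not match intent". `parse_verdict` and the `judge_error` issue type are hypothetical names; the two expected keys are the ones requested in the prompt above.

```python
import json


def _rejected(detail: str) -> dict:
    """Verdict used whenever the judge output cannot be trusted."""
    return {
        "matches_intent": False,
        "issues": [{"type": "judge_error", "detail": detail}],
    }


def parse_verdict(raw: str) -> dict:
    """Parse the judge's reply, failing closed on malformed output."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return _rejected("output is not valid JSON")
    if (
        not isinstance(data, dict)
        or not isinstance(data.get("matches_intent"), bool)
        or not isinstance(data.get("issues"), list)
    ):
        return _rejected("output does not match the expected schema")
    return data


print(parse_verdict('{"matches_intent": true, "issues": []}'))
print(parse_verdict("Sure! Here is my review of the code..."))
```

Failing closed means a flaky judge blocks the PR rather than approving it, which matches the idea that this layer is a first pass, not the final say.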
### PR template for disclosure
```markdown
## AI-Generated Code Disclosure
This PR includes AI-generated code from:
- [ ] Cursor
- [ ] Claude Code
- [ ] Copilot
- [ ] Other: ___
Tools used:
- Prompts available at: [link]
I have reviewed:
- [ ] Each generated section.
- [ ] Tests pass + coverage.
- [ ] No hallucinated APIs.
- [ ] Security implications.
```
## 🤔 Decision criteria
| AI-generated portion | Review level |
|---|---|
| < 20% | Standard |
| 20-50% | Enhanced (senior review) |
| > 50% | Strict (multiple reviewers) |
| Critical path | Always strict |
| Generated test | Verify edge cases |

**Default**: Type check + lint + SAST + test + AI review + human review. AI-heavy PRs get enhanced review.
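The table can be expressed as a small function. This is a sketch under the assumption that the AI-generated share of a PR is available as a fraction between 0 and 1 (for example from the disclosure template); `review_level` is a hypothetical name.

```python
def review_level(ai_fraction: float, critical_path: bool = False) -> str:
    """Map the AI-generated share of a PR to a review level.

    Boundaries follow the table: below 20% is standard, 20-50% is
    enhanced, above 50% is strict, and a critical path is always strict.
    """
    if critical_path or ai_fraction > 0.5:
        return "strict"
    if ai_fraction >= 0.2:
        return "enhanced"
    return "standard"


for share, critical in [(0.1, False), (0.35, False), (0.8, False), (0.05, True)]:
    print(f"{share:.0%} critical={critical} -> {review_level(share, critical)}")
```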
## 🔗 Graph
- Parent: [[AI_코드_리뷰]] · [[CI_CD 파이프라인 및 IDE 통합 보안|DevSecOps]] · [[Code-Quality]]
- Variants: [[SAST]] · [[LLM-as-Judge]]
- Applications: [[CodeRabbit]] · [[Snyk-Code]] · [[Sonar]]
- Adjacent: [[AI-Code-Agent-Patterns]]
## 🤖 LLM usage
**When**: A team is adopting AI coding tools and needs to keep quality under control.
**When not**: Individual hobby projects. Throwaway scripts.
## ❌ Anti-patterns
- **AI-generated + skip review**: production bug.
- **No disclosure**: hidden risk.
- **Trusting the AI's own tests**: same blind spots.
- **Shipping a hallucinated API**: runtime error.
- **Verifying AI code with a single layer**: whole defect classes are missed.
## 🧪 Verification / duplicates
- Verified (concept).
- Trust level B.
- Related: [[AI_코드_리뷰]], [[AI 기반 코드 분석 도구 (AI-Powered Code Analysis Tools)]].
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-09 | Manual cleanup: 7 layers + code + decision criteria + disclosure |