---
id: wiki-2026-0508-adversarial-code-stylometry
title: Adversarial Code Stylometry
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [P-Reinforce-AUTO-36585B, Code Authorship Obfuscation]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [security, ml, privacy, deanonymization]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: scikit-learn
---
# Adversarial Code Stylometry
## One-Line Summary
> **"Statistical fingerprints identify the author of source code, and attackers work to evade them."** Code stylometry combines AST features, n-grams, and lexical patterns to deanonymize authors, with reported accuracy around 95% on benchmark corpora; adversarial stylometry defeats it through transformation and obfuscation.
## Core Concepts
### Feature Families
- **Lexical**: identifier length, naming convention, comment density.
- **Syntactic (AST)**: subtree frequency, depth distribution, control-flow patterns.
- **Layout**: indentation, brace style, line length.
- **Semantic**: API choice, idiom preference (list comp vs loop).
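
The lexical and layout families above can be approximated with the standard library alone. The following is a minimal sketch, not a reference feature set: the function name `lexical_layout_features` and the five chosen features are illustrative assumptions, and it assumes the input tokenizes cleanly as Python.

```python
import io
import keyword
import tokenize
from statistics import mean


def lexical_layout_features(source: str) -> dict[str, float]:
    """Compute a few lexical and layout features for one Python source string."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Identifiers are NAME tokens that are not language keywords.
    names = [
        t.string for t in tokens
        if t.type == tokenize.NAME and not keyword.iskeyword(t.string)
    ]
    comments = [t for t in tokens if t.type == tokenize.COMMENT]
    code_lines = [ln for ln in source.splitlines() if ln.strip()]
    indented = [ln for ln in code_lines if ln[0] in " \t"]

    def ratio(count: int, total: int) -> float:
        return count / total if total else 0.0

    return {
        # Lexical: identifier length, naming convention, comment density.
        "mean_identifier_len": mean(len(n) for n in names) if names else 0.0,
        "snake_case_ratio": ratio(sum("_" in n for n in names), len(names)),
        "comment_density": ratio(len(comments), len(code_lines)),
        # Layout: line length and tabs-versus-spaces indentation.
        "mean_line_len": mean(len(ln) for ln in code_lines) if code_lines else 0.0,
        "tab_indent_ratio": ratio(sum(ln[0] == "\t" for ln in indented), len(indented)),
    }
```

Each key becomes one column of a per-file feature vector; in practice these would be concatenated with syntactic features before classification.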
### Attack Surface
- **Open-source contributors**: deanonymization of GitHub commits.
- **Malware authorship**: APT attribution.
- **Plagiarism detection**: academic and hiring contexts.
- **Bug bounty / leak**: identification of anonymous reporters.
### Defenses
1. Code transformation (Caliskan 2018: semantics-preserving paraphrase).
2. LLM-mediated rewrite (rewriting via Claude/GPT to neutralize style).
3. Style transfer to another author (mimicry).
4. Mechanical normalization (autoformatter + identifier randomization).
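
The identifier-randomization half of item 4 can be sketched with the `ast` module. This is a rough illustration under stated assumptions, not a hardened tool: `normalize_identifiers` is a hypothetical name, `ast.unparse` needs Python 3.9+, and the sketch assumes the snippet has no keyword-argument calls to its own functions, no class-level attributes, and no `global`/`nonlocal` statements, any of which a plain rename would break.

```python
import ast


def normalize_identifiers(source: str) -> str:
    """Rename parameters and assigned variables to v0, v1, ... and re-emit the code."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))

    # Pass 1: collect names bound as parameters or assignment targets, plus
    # every variable name in use, so fresh aliases never collide with them.
    bound: list[str] = []
    taken: set[str] = set()
    for node in nodes:
        if isinstance(node, ast.arg):
            bound.append(node.arg)
            taken.add(node.arg)
        elif isinstance(node, ast.Name):
            taken.add(node.id)
            if isinstance(node.ctx, ast.Store):
                bound.append(node.id)

    mapping: dict[str, str] = {}
    counter = 0
    for name in bound:
        if name in mapping:
            continue
        while f"v{counter}" in taken:
            counter += 1
        mapping[name] = f"v{counter}"
        counter += 1

    # Pass 2: apply the mapping to every parameter and variable reference.
    # Names that are only read (builtins, imports, function names) are untouched.
    for node in nodes:
        if isinstance(node, ast.arg):
            node.arg = mapping[node.arg]
        elif isinstance(node, ast.Name) and node.id in mapping:
            node.id = mapping[node.id]

    # ast.unparse also drops comments and normalizes layout.
    return ast.unparse(tree)
```

Because the output goes through `ast.unparse`, comments and original formatting disappear as a side effect. Control-flow structure and API choices are left unchanged, so this step alone does not hide syntactic or semantic features.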
## 💻 Patterns
### AST Feature Extractor
```python
import ast
from collections import Counter


def ast_node_freq(source: str) -> Counter:
    """Count AST node types in `source` as a simple syntactic fingerprint."""
    tree = ast.parse(source)
    return Counter(type(n).__name__ for n in ast.walk(tree))
```
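
Node-type counts ignore tree shape. The depth distribution and subtree frequency listed under syntactic features can be approximated as follows; this is a sketch in which `ast_depth_histogram` and `ast_bigrams` are hypothetical helper names, and parent-child type pairs stand in for full subtree counts.

```python
import ast
from collections import Counter


def ast_depth_histogram(source: str) -> Counter:
    """Map depth -> number of AST nodes at that depth (the root Module is depth 0)."""
    hist: Counter = Counter()
    stack = [(ast.parse(source), 0)]
    while stack:
        node, depth = stack.pop()
        hist[depth] += 1
        stack.extend((child, depth + 1) for child in ast.iter_child_nodes(node))
    return hist


def ast_bigrams(source: str) -> Counter:
    """Count (parent type, child type) pairs, a small proxy for subtree frequency."""
    pairs: Counter = Counter()
    for parent in ast.walk(ast.parse(source)):
        for child in ast.iter_child_nodes(parent):
            pairs[(type(parent).__name__, type(child).__name__)] += 1
    return pairs
```

Both counters can be turned into fixed-width vectors by choosing a vocabulary of depths and pairs from the training set.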
### Author Classifier (sklearn)
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

# Character n-grams capture layout and naming habits without parsing the code.
pipe = Pipeline([
    ("tfidf", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5))),
    ("rf", RandomForestClassifier(n_estimators=300, n_jobs=-1)),
])
# train_sources: list[str] of source files; train_authors: matching author labels.
pipe.fit(train_sources, train_authors)
```
### Style Obfuscation via Rewrite
```python
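import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
FENCE = "`" * 3  # built at runtime so this Markdown block is not closed early


def neutralize_style(code: str) -> str:
    """Ask the model to rewrite `code` in a generic, author-neutral style."""
    prompt = (
        "Rewrite this code to neutralize authorial style.\n"
        "Preserve semantics exactly. Use generic identifiers, standard idioms, "
        "mechanical formatting.\n"
        f"{FENCE}python\n{code}\n{FENCE}"
    )
    msg = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text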
```
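
The prompt asks the model to preserve semantics, but nothing enforces it. One hedged way to check a rewrite is differential testing on sample inputs. The helper below is a sketch with a hypothetical name (`same_behavior`); agreement on sampled inputs is evidence of equivalence, not proof.

```python
from typing import Any, Callable, Iterable


def same_behavior(
    original: Callable[..., Any],
    rewritten: Callable[..., Any],
    inputs: Iterable[tuple],
) -> bool:
    """Return True if both callables agree on every input tuple.

    Agreement means the same return value, or the same exception type.
    """

    def outcome(fn: Callable[..., Any], args: tuple) -> tuple:
        try:
            return ("ok", fn(*args))
        except Exception as exc:  # compare failure modes too
            return ("error", type(exc).__name__)

    return all(outcome(original, a) == outcome(rewritten, a) for a in inputs)
```

A rewrite that fails this check should be discarded or regenerated rather than committed.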
### Mimicry Attack (target style)
```python
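def mimic(code: str, target_samples: list[str]) -> str:
    """Rewrite `code` to look like `target_samples` author."""
    # Use at most three samples to keep the prompt short.
    target_blob = "\n---\n".join(target_samples[:3])
    prompt = f"Target author samples:\n{target_blob}\n\nRewrite preserving semantics:\n{code}"
    # `llm_call` is a placeholder for any prompt -> text helper, e.g. a thin
    # wrapper around the `client.messages.create` call in `neutralize_style`.
    return llm_call(prompt)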
```
### Detection of Obfuscated Code
```python
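import numpy as np


def obfuscation_signal(code: str) -> float:
    """High score → likely autoformatted/normalized."""
    feats = ast_node_freq(code)  # defined in the AST Feature Extractor above
    if len(feats) < 2:
        return 1.0  # a single node type is maximally peaked; avoids log2(1) == 0
    probs = np.array(list(feats.values()), dtype=float) / sum(feats.values())
    entropy = -np.sum(probs * np.log2(probs))
    return float(1.0 - entropy / np.log2(len(feats)))  # uniform → 0, peaked → 1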
```
### Defensive Pre-commit Hook
```bash
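#!/usr/bin/env bash
# .git/hooks/pre-commit
set -euo pipefail
shopt -s globstar nullglob  # without globstar, ** matches only one directory level
ruff format --quiet .
# style_neutralizer is assumed to be a local module; it is not a published tool.
python -m style_neutralizer **/*.py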
```
## Decision Criteria
| Situation | Approach |
|---|---|
| Anonymous OSS contribution | LLM rewrite + autoformat |
| Whistleblower | Full mimicry of a public author |
| Defensive (detection) | Char n-gram + AST RF |
| Research baseline | Caliskan 2015 features |

**Default**: autoformat + LLM neutralization on the adversarial side; char n-gram TF-IDF + RF for detection.
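
The two defaults can be combined: a detection-side classifier can act as the stopping criterion for neutralization. The loop below is a sketch only. `neutralize_until_unattributed` is a hypothetical name, `rewrite` and `author_confidence` are caller-supplied callables (for example an LLM rewrite and a classifier's probability for the true author), and the `threshold` and `max_passes` defaults are arbitrary placeholders.

```python
from typing import Callable


def neutralize_until_unattributed(
    code: str,
    rewrite: Callable[[str], str],
    author_confidence: Callable[[str], float],
    threshold: float = 0.2,
    max_passes: int = 3,
) -> tuple[str, int]:
    """Apply `rewrite` until the attribution score drops below `threshold`.

    Returns the final code and the number of rewrite passes used.
    """
    passes = 0
    while passes < max_passes and author_confidence(code) >= threshold:
        code = rewrite(code)
        passes += 1
    return code, passes
```

If the loop exits because `max_passes` was reached, the score may still be above the threshold, so callers should re-check it before relying on the result.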
## 🔗 Graph
- Variant: [[Code Obfuscation]]
- Adjacent: [[Differential Privacy]]
## 🤖 LLM Usage
**When to use**: style neutralization, mimicry attack, defensive paraphrase.
**When not to use**: as the sole judge in ground-truth authorship verification.
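## ❌ Anti-Patterns
- **Relying on an autoformatter alone**: AST and lexical features still leak.
- **Identifier rename only**: the control-flow signature remains identifiable.
- **Single-pass LLM rewrite**: subtle idioms survive; multiple passes are needed.
- **Train and test drawn from the same repo**: leakage. Hold out whole repositories, and use an author-disjoint split when evaluating verification rather than closed-set attribution.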
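
The train/test leakage item above can be avoided by holding out whole repositories. The sketch below assumes closed-set attribution, where every test author must also appear in training, so it splits by repository within each author rather than by author. The name `repo_disjoint_split` and the `(source, author, repo)` triple layout are illustrative assumptions, as is the premise that each repository belongs to a single author.

```python
import random
from collections import defaultdict


def repo_disjoint_split(samples, test_fraction=0.25, seed=0):
    """Split (source, author, repo) triples so no (author, repo) pair is on both sides.

    For each author, whole repositories are held out for testing. An author
    with a single repository stays entirely in the training set.
    """
    repos_by_author = defaultdict(set)
    for _source, author, repo in samples:
        repos_by_author[author].add(repo)

    rng = random.Random(seed)
    held_out = set()
    for author in sorted(repos_by_author):
        repos = sorted(repos_by_author[author])
        if len(repos) < 2:
            continue  # holding out the only repo would remove the author from train
        rng.shuffle(repos)
        n_test = max(1, int(len(repos) * test_fraction))
        n_test = min(n_test, len(repos) - 1)  # keep at least one repo for training
        held_out.update((author, repo) for repo in repos[:n_test])

    train = [s for s in samples if (s[1], s[2]) not in held_out]
    test = [s for s in samples if (s[1], s[2]) in held_out]
    return train, test
```

scikit-learn's `GroupShuffleSplit` offers group-wise splitting as well, though it does not by itself guarantee that every author remains in the training fold.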
## 🧪 Verification / Duplicates
- Verified (Caliskan 2015 USENIX Sec, Abuhamad 2018 CCS).
- Trust level: A.
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — AST features, attack/defense patterns |