---
id: wiki-2026-0508-adversarial-code-stylometry
title: Adversarial Code Stylometry
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [P-Reinforce-AUTO-36585B, Code Authorship Obfuscation]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [security, ml, privacy, deanonymization]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: scikit-learn
---

# Adversarial Code Stylometry

## One-line summary
> **"Identify the author of source code from a statistical fingerprint, and expect attackers to evade it."** Code stylometry combines AST features, n-grams, and lexical patterns to deanonymize authors, with reported accuracy around 95% on benchmark datasets; adversarial stylometry defeats this through transformation and obfuscation.

## Core

### Feature Family
- **Lexical**: identifier length, naming convention, comment density.
- **Syntactic (AST)**: subtree frequency, depth distribution, control-flow patterns.
- **Layout**: indentation, brace style, line length.
- **Semantic**: API choice, idiom preference (list comp vs loop).
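
As a rough illustration of the lexical and layout families above, the sketch below computes a handful of features with the standard-library `tokenize` module. The function name `lexical_layout_features` and the specific feature set are assumptions chosen for illustration, not the feature set of any published system.

```python
import io
import keyword
import tokenize

def lexical_layout_features(source: str) -> dict[str, float]:
    """Compute a few illustrative lexical and layout features for Python source."""
    lines = [ln for ln in source.splitlines() if ln.strip()]  # non-blank lines
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Identifiers: NAME tokens that are not language keywords.
    names = [t.string for t in tokens
             if t.type == tokenize.NAME and not keyword.iskeyword(t.string)]
    comments = [t for t in tokens if t.type == tokenize.COMMENT]
    n_lines = max(len(lines), 1)
    n_names = max(len(names), 1)
    return {
        "mean_identifier_len": sum(map(len, names)) / n_names,
        "snake_case_ratio": sum("_" in s for s in names) / n_names,
        "comment_density": len(comments) / n_lines,
        "mean_line_len": sum(map(len, lines)) / n_lines,
        "tab_indent_ratio": sum(ln.startswith("\t") for ln in lines) / n_lines,
    }

sample = "def add_two(a, b):\n    # sum\n    return a + b\n"
print(lexical_layout_features(sample))
```

Features like these are cheap to compute but are also the first to disappear under an autoformatter, which is why they are usually combined with syntactic features.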

### Attack Surface
- **Open-source contributors**: deanonymization of GitHub commits.
- **Malware authorship**: APT attribution.
- **Plagiarism detection**: academic and hiring contexts.
- **Bug bounty / leak**: identification of anonymous reporters.

### Defense
1. Code transformation (Caliskan 2018 — paraphrase preserving semantics).
2. LLM-mediated rewrite (rewrite via Claude/GPT to neutralize style).
3. Style transfer to another author (mimicry).
4. Mechanical normalization (autoformatter + identifier randomization).
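
The layout half of mechanical normalization (item 4) can be sketched with a round trip through Python's own parser. This is a minimal illustration, assuming Python 3.9+ for `ast.unparse`; the helper name `normalize_layout` is hypothetical, and identifier randomization is deliberately not shown.

```python
import ast

def normalize_layout(source: str) -> str:
    """Round-trip through the AST: drops comments, blank lines, and original spacing."""
    return ast.unparse(ast.parse(source))

a = "def f( x ):\n\n    # note\n    return x+1\n"
b = "def f(x):\n        return x + 1\n"
# Two layouts of the same function collapse to one canonical text.
print(normalize_layout(a) == normalize_layout(b))
```

Identifiers and tree shape pass through this step unchanged, so it removes layout features only and leaves lexical and syntactic ones intact.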

## 💻 Patterns

### AST Feature Extractor
```python
import ast
from collections import Counter

def ast_node_freq(source: str) -> Counter:
    tree = ast.parse(source)
    return Counter(type(n).__name__ for n in ast.walk(tree))
```
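
Node-type counts ignore tree shape. The depth distribution listed under the syntactic feature family could be extracted alongside `ast_node_freq` roughly as follows; `ast_depth_histogram` is a hypothetical helper name, and this is a sketch rather than a reference implementation.

```python
import ast
from collections import Counter

def ast_depth_histogram(source: str) -> Counter:
    """Count AST nodes at each depth (the module root is depth 0)."""
    hist: Counter = Counter()
    stack = [(ast.parse(source), 0)]
    while stack:
        node, depth = stack.pop()
        hist[depth] += 1
        # Push children one level deeper; traversal order does not affect the counts.
        stack.extend((child, depth + 1) for child in ast.iter_child_nodes(node))
    return hist

flat = ast_depth_histogram("x = 1\ny = 2\n")
nested = ast_depth_histogram("if a:\n    if b:\n        x = 1\n")
print(max(flat), max(nested))
```

An author who favors deep nesting over early returns shifts mass toward larger depths, which survives both autoformatting and identifier renaming.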

### Author Classifier (sklearn)
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

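# train_sources: list[str] of raw source files; train_authors: author labels of the same length.
pipe = Pipeline([
    # Character n-grams pick up naming and spacing habits without needing a parser.
    ("tfidf", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5))),
    ("rf", RandomForestClassifier(n_estimators=300, n_jobs=-1)),
])
pipe.fit(train_sources, train_authors)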
```

### Style Obfuscation via Rewrite
```python
import anthropic

client = anthropic.Anthropic()

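FENCE = "`" * 3  # built at runtime so this snippet never contains a literal code fence

def neutralize_style(code: str) -> str:
    """Ask the model for a style-neutral, semantics-preserving rewrite of `code`."""
    prompt = (
        "Rewrite this code to neutralize authorial style.\n"
        "Preserve semantics exactly. Use generic identifiers, standard idioms, "
        "mechanical formatting.\n\n"
        f"{FENCE}python\n{code}\n{FENCE}"
    )
    msg = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text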
```

### Mimicry Attack (target style)
```python
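from typing import Callable

def mimic(code: str, target_samples: list[str], llm_call: Callable[[str], str]) -> str:
    """Rewrite `code` to resemble the author of `target_samples`.

    `llm_call` is any prompt -> text function, e.g. a thin wrapper around an LLM client.
    """
    target_blob = "\n---\n".join(target_samples[:3])  # use at most three samples
    prompt = f"Target author samples:\n{target_blob}\n\nRewrite preserving semantics:\n{code}"
    return llm_call(prompt)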
```

### Detection of Obfuscated Code
```python
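import numpy as np

def obfuscation_signal(code: str) -> float:
    """High score → likely autoformatted/normalized (heuristic). Uses ast_node_freq from above."""
    counts = np.array(list(ast_node_freq(code).values()), dtype=float)
    if len(counts) < 2:  # a single node type: log2(1) == 0 would divide by zero
        return 1.0
    p = counts / counts.sum()
    entropy = -(p * np.log2(p)).sum()
    return float(1.0 - entropy / np.log2(len(counts)))  # uniform → 0, peaked → 1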
```

### Defensive Pre-commit Hook
```bash
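#!/usr/bin/env bash
# .git/hooks/pre-commit: neutralize staged Python files, then re-stage the rewritten versions.
set -euo pipefail
mapfile -t files < <(git diff --cached --name-only --diff-filter=ACM -- '*.py')
[ "${#files[@]}" -eq 0 ] && exit 0
ruff format --quiet "${files[@]}"
python -m style_neutralizer "${files[@]}"  # style_neutralizer: assumed to be a project-local module
git add -- "${files[@]}"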
```

## Decision Criteria
| Situation | Approach |
|---|---|
| Anonymous OSS contribution | LLM rewrite + autoformat |
| Whistleblower | Full mimicry to public author |
| Defensive (detection) | Char n-gram + AST RF |
| Research baseline | Caliskan 2015 features |

**Default**: autoformat + LLM neutralization on the adversarial side; char n-gram TF-IDF + RF for detection.

## 🔗 Graph
- Variant: [[Code Obfuscation]]
- Adjacent: [[Differential Privacy]]

## 🤖 LLM Use
**When**: style neutralization, mimicry attack, defensive paraphrase.
**When not**: relying on LLM judgment alone for ground-truth authorship verification.

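## ❌ Anti-patterns
- **Relying on an autoformatter alone**: AST and lexical features still leak.
- **Identifier rename only**: the control-flow signature remains identifiable.
- **Single-pass LLM rewrite**: subtle idioms survive; multiple passes are needed.
- **Train and test drawn from the same repo**: leakage inflates accuracy; split so that no repo appears on both sides (and use an author-disjoint split for open-set verification).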
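
The last anti-pattern can be avoided with a group-aware split. Below is a minimal standard-library sketch, assuming each sample carries a group label such as a repo name (or an author ID when an author-disjoint split is wanted); `group_disjoint_split` is a hypothetical helper, and scikit-learn's `GroupShuffleSplit` offers the same guarantee.

```python
import random
from collections import defaultdict

def group_disjoint_split(groups, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) such that no group label appears on both sides."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    labels = sorted(by_group)              # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(labels)
    n_test = max(1, round(len(labels) * test_fraction))
    test_groups = set(labels[:n_test])     # whole groups go to the test side
    train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
    test_idx = [i for i, g in enumerate(groups) if g in test_groups]
    return train_idx, test_idx

repos = ["repo_a", "repo_a", "repo_b", "repo_c", "repo_c", "repo_d"]
train_idx, test_idx = group_disjoint_split(repos)
print(train_idx, test_idx)
```

Splitting whole groups rather than individual files means the test score reflects style that generalizes across projects, not boilerplate shared within one repo.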

## 🧪 Verification / Duplicates
- Verified (Caliskan 2015 USENIX Sec, Abuhamad 2018 CCS).
- Trust level A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — AST features, attack/defense patterns |