---
id: wiki-2026-0508-adversarial-code-stylometry
title: Adversarial Code Stylometry
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [P-Reinforce-AUTO-36585B, Code Authorship Obfuscation]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [security, ml, privacy, deanonymization]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: scikit-learn
---

# Adversarial Code Stylometry

## One-Line Summary

> **"Identify the author of source code from its statistical fingerprint, and the attacker evades it."** Code stylometry combines AST features, n-grams, and lexical patterns and can deanonymize authors with roughly 95% accuracy on benchmark corpora; adversarial stylometry defeats this through transformation and obfuscation.

## Core Concepts

### Feature Families

- **Lexical**: identifier length, naming convention, comment density.
- **Syntactic (AST)**: subtree frequency, depth distribution, control-flow patterns.
- **Layout**: indentation, brace style, line length.
- **Semantic**: API choice, idiom preference (list comprehension vs. loop).

### Attack Surface

- **Open-source contributors**: deanonymization of GitHub commits.
- **Malware authorship**: APT attribution.
- **Plagiarism detection**: academic and hiring contexts.
- **Bug bounty / leak**: identification of anonymous reporters.

### Defense

1. Code transformation (Caliskan 2018): semantics-preserving paraphrase.
2. LLM-mediated rewrite (rewrite via Claude/GPT to neutralize style).
3. Style transfer to another author (mimicry).
4. Mechanical normalization (autoformatter + identifier randomization).
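## 💻 Patterns

### AST Feature Extractor

```python
import ast
from collections import Counter


def ast_node_freq(source: str) -> Counter:
    """Count how often each AST node type occurs in `source`."""
    tree = ast.parse(source)
    return Counter(type(n).__name__ for n in ast.walk(tree))
```

### Author Classifier (sklearn)

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

# Character n-grams capture naming and layout habits without a parser.
pipe = Pipeline([
    ("tfidf", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5))),
    ("rf", RandomForestClassifier(n_estimators=300, n_jobs=-1)),
])

# train_sources: list[str] of source files; train_authors: list[str] of labels.
pipe.fit(train_sources, train_authors)
```

### Style Obfuscation via Rewrite

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Built at runtime so the prompt's code fence does not break Markdown rendering.
FENCE = "`" * 3


def neutralize_style(code: str) -> str:
    """Ask the model for a style-neutral, semantics-preserving rewrite."""
    prompt = (
        "Rewrite this code to neutralize authorial style. "
        "Preserve semantics exactly. "
        "Use generic identifiers, standard idioms, mechanical formatting.\n\n"
        f"{FENCE}python\n{code}\n{FENCE}"
    )
    msg = client.messages.create(
        model="claude-opus-4-7",  # model ID as given in this note; substitute one you can access
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text
```

### Mimicry Attack (target style)

```python
from typing import Callable


def mimic(code: str, target_samples: list[str], llm_call: Callable[[str], str]) -> str:
    """Rewrite `code` to look like the author of `target_samples`.

    `llm_call` is any prompt -> completion function, for example a thin
    wrapper around the client above.
    """
    target_blob = "\n---\n".join(target_samples[:3])  # a few samples keep the prompt short
    prompt = f"Target author samples:\n{target_blob}\n\nRewrite preserving semantics:\n{code}"
    return llm_call(prompt)
```

### Detection of Obfuscated Code

```python
import numpy as np


def obfuscation_signal(code: str) -> float:
    """Heuristic: high score → peaked node-type distribution, possibly normalized code.

    Pure reformatting leaves the AST unchanged, so this does not detect
    autoformatting on its own.
    """
    feats = ast_node_freq(code)
    if len(feats) < 2:
        return 1.0  # a single node type is maximally peaked; also avoids log2(1) == 0
    counts = np.array(list(feats.values()), dtype=float)
    probs = counts / counts.sum()
    entropy = -np.sum(probs * np.log2(probs))
    return float(1.0 - entropy / np.log2(len(feats)))  # uniform → 0, peaked → 1
```

### Defensive Pre-commit Hook

```bash
#!/usr/bin/env bash
# .git/hooks/pre-commit
ruff format --quiet .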
shopt -s globstar  # let ** match nested directories (bash 4+)
python -m style_neutralizer **/*.py  # style_neutralizer is assumed to be a project-local module
```

## Decision Criteria

| Situation | Approach |
|---|---|
| Anonymous OSS contribution | LLM rewrite + autoformat |
| Whistleblower | Full mimicry of a public author |
| Defensive (detection) | Char n-gram + AST RF |
| Research baseline | Caliskan 2015 features |

**Default**: autoformat + LLM neutralization on the adversarial side; char n-gram TF-IDF + RF for detection.

## 🔗 Graph

- Variant: [[Code Obfuscation]]
- Adjacent: [[Differential Privacy]]

## 🤖 LLM Usage

**When**: style neutralization, mimicry attack, defensive paraphrase.

**When not**: relying on LLM judgment alone for ground-truth authorship verification.

## ❌ Anti-patterns

- **Relying on the autoformatter alone**: AST and lexical features still leak.
- **Identifier rename only**: the control-flow signature remains identifiable.
- **Single-pass LLM rewrite**: subtle idioms survive; multiple passes are needed.
- **Train and test drawn from the same repo**: leakage. Split by repository (or problem) within each author so the classifier learns style rather than project content.

## 🧪 Verification / Duplicates

- Verified (Caliskan 2015 USENIX Sec, Abuhamad 2018 CCS).
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: AST features, attack/defense patterns |