id: wiki-2026-0508-adversarial-code-stylometry
title: Adversarial Code Stylometry
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: P-Reinforce-AUTO-36585B, Code Authorship Obfuscation
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: security, ml, privacy, deanonymization
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: python (language), scikit-learn (framework)

Adversarial Code Stylometry

One-line Summary

"Identify the author of source code from a statistical fingerprint, while the attacker works to evade it." Code stylometry combines AST features, n-grams, and lexical patterns to deanonymize authors, with reported accuracy of about 95% in benchmark settings; adversarial stylometry defeats this through transformation and obfuscation.

Key Concepts

Feature Families

  • Lexical: identifier length, naming convention, comment density.
  • Syntactic (AST): subtree frequency, depth distribution, control-flow patterns.
  • Layout: indentation, brace style, line length.
  • Semantic: API choice, idiom preference (list comp vs loop).
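
As a rough illustration of the lexical and layout families above, the sketch below computes a handful of such features with the standard-library `tokenize` module. The function name `lexical_layout_features` and the specific feature set are assumptions chosen for illustration, not a reference feature list from the literature.

```python
import io
import keyword
import re
import tokenize


def lexical_layout_features(source: str) -> dict[str, float]:
    """Compute a few lexical and layout style features for Python source."""
    nonblank = [ln for ln in source.splitlines() if ln.strip()]
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

    # Identifiers: NAME tokens that are not reserved keywords.
    names = [
        t.string
        for t in tokens
        if t.type == tokenize.NAME and not keyword.iskeyword(t.string)
    ]
    comments = [t for t in tokens if t.type == tokenize.COMMENT]

    # Naming convention: an inner underscore vs. a lower-to-upper transition.
    snake = sum(1 for n in names if "_" in n.strip("_"))
    camel = sum(1 for n in names if re.search(r"[a-z][A-Z]", n))

    # Layout: leading-space width of indented lines, and line length.
    indents = [len(ln) - len(ln.lstrip(" ")) for ln in nonblank if ln.startswith(" ")]

    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0

    return {
        "mean_identifier_length": ratio(sum(map(len, names)), len(names)),
        "snake_case_ratio": ratio(snake, len(names)),
        "camel_case_ratio": ratio(camel, len(names)),
        "comment_density": ratio(len(comments), len(nonblank)),
        "mean_line_length": ratio(sum(map(len, nonblank)), len(nonblank)),
        "mean_indent_width": ratio(sum(indents), len(indents)),
    }
```

Features like these are cheap to compute but are also the first to disappear under an autoformatter, which is why they are usually combined with syntactic features.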

Attack Surface

  • Open-source contributors: deanonymization of GitHub commits.
  • Malware authorship: APT attribution.
  • Plagiarism detection: academic/hiring context.
  • Bug bounty / leak: anonymous reporter identification.

Defense

  1. Code transformation (Caliskan 2018 — paraphrase preserving semantics).
  2. LLM-mediated rewrite (rewrite via Claude/GPT to neutralize style).
  3. Style transfer to another author (mimicry).
  4. Mechanical normalization (autoformatter + identifier randomization).
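
Item 4 (mechanical normalization) can be sketched with the standard-library `ast` module: bound names are renamed to `v0`, `v1`, ... and the source is regenerated with `ast.unparse`, which also discards comments and the original layout. The helper names (`collect_bound_names`, `normalize_identifiers`) are hypothetical, and this is a minimal sketch under simplifying assumptions: it does not handle keyword arguments at call sites, class or attribute names, or collisions with existing `vN` identifiers.

```python
import ast
import builtins


def collect_bound_names(tree: ast.AST) -> list[str]:
    """Names bound by assignment, function definitions, or parameters."""
    found: list[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            name = node.id
        elif isinstance(node, ast.arg):
            name = node.arg
        elif isinstance(node, ast.FunctionDef):
            name = node.name
        else:
            continue
        # Leave builtins alone so calls such as len() keep working.
        if name not in found and not hasattr(builtins, name):
            found.append(name)
    return found


class _Renamer(ast.NodeTransformer):
    """Apply a fixed old-name -> new-name mapping across the tree."""

    def __init__(self, mapping: dict[str, str]) -> None:
        self.mapping = mapping

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = self.mapping.get(node.arg, node.arg)
        self.generic_visit(node)  # also visit the annotation, if any
        return node

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
        node.name = self.mapping.get(node.name, node.name)
        self.generic_visit(node)
        return node


def normalize_identifiers(source: str) -> str:
    """Rename bound identifiers to generic names and re-emit the source."""
    tree = ast.parse(source)
    mapping = {name: f"v{i}" for i, name in enumerate(collect_bound_names(tree))}
    return ast.unparse(_Renamer(mapping).visit(tree))
```

Renaming and re-emitting removes lexical and layout signals only; the AST shape and control flow are untouched, so this is a baseline rather than a complete defense.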

💻 Patterns

AST Feature Extractor

import ast
from collections import Counter

def ast_node_freq(source: str) -> Counter:
    tree = ast.parse(source)
    return Counter(type(n).__name__ for n in ast.walk(tree))

Author Classifier (sklearn)

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

pipe = Pipeline([
    ("tfidf", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5))),
    ("rf", RandomForestClassifier(n_estimators=300, n_jobs=-1)),
])
pipe.fit(train_sources, train_authors)
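
The character n-gram pipeline above can be combined with AST node counts in one model. The following is a minimal sketch using scikit-learn's `FeatureUnion`, `FunctionTransformer`, and `DictVectorizer`; the function names `ast_counts` and `build_combined_pipeline` are assumptions for illustration, and unparseable files are mapped to an empty feature dict rather than rejected.

```python
import ast
from collections import Counter

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer


def ast_counts(sources):
    """Map each source string to a dict of AST node-type counts."""
    rows = []
    for src in sources:
        try:
            tree = ast.parse(src)
        except SyntaxError:
            rows.append({})  # unparseable file: no syntactic features
            continue
        rows.append(dict(Counter(type(n).__name__ for n in ast.walk(tree))))
    return rows


def build_combined_pipeline() -> Pipeline:
    """Char n-gram TF-IDF and AST node counts feeding one random forest."""
    features = FeatureUnion([
        ("char", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5))),
        ("ast", Pipeline([
            ("count", FunctionTransformer(ast_counts)),
            ("vec", DictVectorizer()),
        ])),
    ])
    return Pipeline([
        ("features", features),
        ("rf", RandomForestClassifier(n_estimators=300, n_jobs=-1, random_state=0)),
    ])
```

The pipeline is fitted the same way as `pipe`, for example `build_combined_pipeline().fit(train_sources, train_authors)`.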

Style Obfuscation via Rewrite

import anthropic

client = anthropic.Anthropic()

def neutralize_style(code: str) -> str:
    msg = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=4096,
        messages=[{"role": "user", "content": f"""Rewrite this code to neutralize authorial style.
Preserve semantics exactly. Use generic identifiers, standard idioms, mechanical formatting.

```python
{code}
```"""}],
    )
    return msg.content[0].text
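
A single rewrite pass may leave residual style, so one option is to repeat the rewrite until an attribution model stops recognizing the author. The sketch below is written against two injected callables so it does not depend on a particular LLM or classifier; `multi_pass_neutralize`, `author_confidence`, and the `threshold` and `max_passes` defaults are assumptions, not values from the source.

```python
from typing import Callable


def multi_pass_neutralize(
    code: str,
    rewrite: Callable[[str], str],
    author_confidence: Callable[[str], float],
    threshold: float = 0.2,
    max_passes: int = 3,
) -> tuple[str, int]:
    """Rewrite until the true author's predicted probability is at most `threshold`.

    Returns the final code and the number of rewrite passes applied.
    """
    passes = 0
    while passes < max_passes and author_confidence(code) > threshold:
        code = rewrite(code)
        passes += 1
    return code, passes
```

Here `rewrite` could be `neutralize_style` from above, and `author_confidence` could wrap a fitted classifier's `predict_proba` for the true author's class. Each pass should also be checked against a test suite, since nothing in this loop verifies that semantics are preserved.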

Mimicry Attack (target style)

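from typing import Callable

def mimic(code: str, target_samples: list[str], llm_call: Callable[[str], str]) -> str:
    """Rewrite `code` to look like the author of `target_samples`."""
    # llm_call: any prompt-to-text function, e.g. a wrapper around client.messages.create
    target_blob = "\n---\n".join(target_samples[:3])  # first three samples only
    prompt = f"Target author samples:\n{target_blob}\n\nRewrite preserving semantics:\n{code}"
    return llm_call(prompt)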

Detection of Obfuscated Code

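import numpy as np

def obfuscation_signal(code: str) -> float:
    """Heuristic: high score → peaked AST node-type distribution (possibly normalized code)."""
    feats = ast_node_freq(code)  # from the AST Feature Extractor above
    if len(feats) < 2:
        return 1.0  # single node type: zero entropy; avoids dividing by log2(1) = 0
    p = np.array(list(feats.values()), dtype=float) / sum(feats.values())
    entropy = -np.sum(p * np.log2(p))
    return float(1.0 - entropy / np.log2(len(feats)))  # uniform → 0, peaked → 1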

Defensive Pre-commit Hook

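#!/usr/bin/env bash
# .git/hooks/pre-commit
set -euo pipefail
shopt -s globstar  # without this, bash treats ** like * and skips nested directories
ruff format --quiet .
# style_neutralizer is assumed to be a project-local module, not a published package
python -m style_neutralizer **/*.py
# Note: files rewritten here must be re-staged (git add) to enter this commit.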

Decision Criteria

| Situation | Approach |
| --- | --- |
| Anonymous OSS contribution | LLM rewrite + autoformat |
| Whistleblower | Full mimicry to public author |
| Defensive (detection) | Char n-gram + AST RF |
| Research baseline | Caliskan 2015 features |

Default: autoformat + LLM neutralization for adversarial use; char n-gram TF-IDF + RF for detection.

🔗 Graph

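🤖 LLM Usage

When to use: style neutralization, mimicry attack, defensive paraphrase. When not to use: relying on LLM judgment alone for ground-truth authorship verification.

Anti-patterns

  • Relying on an autoformatter alone: AST and lexical features still leak.
  • Identifier rename only: the control-flow signature remains identifiable.
  • Single-pass LLM rewrite: subtle idioms survive; multiple passes are needed.
  • Train/test drawn from the same repo: leakage. Split by repository (or project) so that no repo appears on both sides, while each author still has samples in the training set.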
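
To avoid the same-repository leakage noted in the last item, one option is to hold out whole repositories per author, so the classifier is tested on projects it has never seen while every test author still appears in training. The sketch below is a minimal stdlib version; the function name `repo_disjoint_split`, the `(author, repo, source)` tuple layout, and the assumption that each repository has a single author are all illustrative.

```python
import random
from collections import defaultdict


def repo_disjoint_split(samples, test_fraction=0.3, seed=0):
    """Split (author, repo, source) tuples so no repo spans train and test.

    Assumes single-author repositories. Authors with only one repo stay
    entirely in the training set.
    """
    rng = random.Random(seed)
    repos_by_author = defaultdict(set)
    for author, repo, _ in samples:
        repos_by_author[author].add(repo)

    held_out = set()
    for author, repos in repos_by_author.items():
        ordered = sorted(repos)
        if len(ordered) < 2:
            continue
        rng.shuffle(ordered)
        # Hold out at least one repo, but always leave one for training.
        n_test = min(max(1, int(len(ordered) * test_fraction)), len(ordered) - 1)
        held_out.update((author, repo) for repo in ordered[:n_test])

    train = [s for s in samples if (s[0], s[1]) not in held_out]
    test = [s for s in samples if (s[0], s[1]) in held_out]
    return train, test
```

With multi-author repositories, the grouping key would need to be the repository alone, which can leave some authors without test data.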

🧪 Verification / Duplicates

  • Verified against Caliskan 2015 (USENIX Security) and Abuhamad 2018 (CCS).
  • Trust level: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: AST features, attack/defense patterns |