---
id: wiki-2026-0508-binary-author-identification
title: Binary Author Identification
category: 10_Wiki/Topics
status: needs_review
canonical_id: self
aliases: [P-Reinforce-AUTO-BAID-001]
duplicate_of: none
source_trust_level: A
confidence_score: 0.91
tags: [auto-reinforced, binary-Analysis, code-stylometry, security, author-identification, ml-security, de-compilation]
raw_sources: []
last_reinforced: 2026-04-20
github_commit: pending
inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
tech_stack:
  language: unspecified
  framework: unspecified
---

# [[Binary-Author-Identification|Binary-Author-Identification]]

## 📌 One-Line Insight (The Karpathy Summary)

> "Digital fingerprint tracing: a security forensics technique in which AI detects a developer's own 'style', such as coding habits and library usage patterns, even in code that was assumed to have lost all individuality once translated into machine code (binary), and uses it to find out who originally wrote the code."

## 📖 Structured Knowledge (Synthesized Content)

Binary author identification (Binary-Author-Identification) is the research area concerned with determining who wrote the source code behind a compiled executable. (The work of Caliskan-Islam et al. is representative.)

1. **Core techniques**:
   * **Feature Extraction**: Extracting binary-level features such as the control flow graph (CFG), function call frequencies, and register usage patterns.
   * **Stylometric Analysis**: Analyzing the distinctive "code style fingerprint" that remains even after variable names have been stripped away.
   * **Deep Learning**: Applying neural network models that embed binary sequences and measure their similarity. (Linked to [[Representation-Learning|Representation-Learning]].)
2. **Why it matters**:
   * It can provide decisive evidence in cybersecurity forensics, for example when tracing malware authors or detecting misappropriation of open-source copyrights.
     (Linked to [[Risk-Management|Risk-Management]].)

## ⚠️ Contradictions & Updates

- **Conflict with earlier data**: It was once believed that compiler optimization ([[Optimization|Optimization]]) erased all stylistic traces and made identification impossible, but modern AI approaches have succeeded in capturing the subtle biases that survive optimization (RL Update).
- **Policy change (RL Update)**: With AI now writing code (GitHub Copilot and similar tools), research is expanding toward distinguishing human-written code from AI-written code and identifying the style of a specific AI model.

## 🔗 Knowledge Links (Graph)

- [[Risk-Management|Risk-Management]], [[Representation-Learning|Representation-Learning]], Security, [[Source-Control|Source-Control]], [[Feature-Engineering|Feature-Engineering]]
- **Key Researchers**: Aylin Caliskan, Arvind Narayanan.

---

## 🤖 LLM Usage Hints (How to Use This Knowledge)

**When to use this knowledge:**
- *(TODO)*

**When not to use it:**
- *(TODO)*

## 🧪 Validation Status (Validation)

- **Information status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 automatic normalization. The body needs verification.)*

## 🧬 Duplicate Check

- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (automatic normalization)
- **Reason:** Phase 1 normalization: old template and missing fields filled in.

## 🕓 Changelog

| Date | Change | Handling | Trust level |
|------|--------|----------|-------------|
| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |

## 💻 Code Patterns

**Pattern 1:** *(TODO: structural skeleton reflecting this project's conventions)*

```text
# TODO
```

## 🤔 Decision Criteria

**When to choose option A:**
- *(TODO)*

**When to choose option B:**
- *(TODO)*

**Default:**
> *(TODO)*

## ❌ Anti-Patterns

- **[Anti-pattern]:** *(TODO: what not to do + why + what to do instead)*
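
To make the feature-extraction and similarity-measurement steps from the Synthesized Content section more concrete, here is a minimal Python sketch. It is a simplified stand-in, not the method of Caliskan-Islam et al. or of any particular paper: it assumes each binary has already been disassembled into a flat list of opcode mnemonics, represents it by opcode bigram counts, and attributes an unknown sample to the author whose profile has the highest cosine similarity. The function names (`opcode_ngrams`, `cosine_similarity`, `nearest_author`) and the toy opcode sequences are hypothetical; real systems would use richer features such as CFG structure or learned embeddings.

```python
from collections import Counter
from math import sqrt


def opcode_ngrams(opcodes, n=2):
    """Count overlapping opcode n-grams in one disassembled sequence."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_author(sample, profiles, n=2):
    """Return (author, score) for the profile most similar to the sample."""
    features = opcode_ngrams(sample, n)
    scores = {
        author: cosine_similarity(features, opcode_ngrams(ops, n))
        for author, ops in profiles.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy, made-up opcode sequences (assumption); real input would come
# from a disassembler run over known-author binaries.
profiles = {
    "author_a": ["push", "mov", "sub", "mov", "call", "add", "pop", "ret"],
    "author_b": ["push", "mov", "xor", "lea", "call", "test", "jz", "ret"],
}
unknown = ["push", "mov", "sub", "mov", "call", "pop", "ret"]

author, score = nearest_author(unknown, profiles)
print(author, round(score, 3))
```

With these toy sequences the unknown sample shares five of its six bigrams with `author_a` and only one with `author_b`, so `author_a` is selected. A nearest-profile rule is used only because it is easy to inspect; any classifier could take its place once the feature vectors exist.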