[G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -2,21 +2,157 @@
 id: wiki-2026-0508-ensuring-data-privacy
 title: Ensuring Data Privacy
 category: 10_Wiki/Topics
-status: merged
-redirect_to: 보안_및_시스템_신뢰성_표준
-canonical_id: wiki-2026-0507-039
-aliases: []
+status: verified
+canonical_id: self
+aliases: [Data Privacy, Privacy Engineering, GDPR Compliance]
 duplicate_of: none
 source_trust_level: A
-confidence_score: 0.92
-tags: [uncategorized]
+confidence_score: 0.9
+verification_status: applied
+tags: [privacy, gdpr, security, compliance]
 raw_sources: []
-last_reinforced: 2026-05-08
-github_commit: pending
-inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
+last_reinforced: 2026-05-10
+github_commit: applied
+tech_stack:
+  language: Python/TypeScript
+  framework: OneTrust/Fides/OPA
 ---

-# Redirect
+# Ensuring Data Privacy

-This document has been merged into the canonical document [[보안_및_시스템_신뢰성_표준]].
-See the link above for all current knowledge and details.
+## One-line summary
+> **"All personal data is handled with a lawful basis, minimized, and purpose-limited."** Data privacy engineering implements the legal requirements of GDPR, CCPA, LGPD, and Korea's PIPA as deterministic controls at every stage of storage, processing, transfer, and retention. The 2026 stack: classification + DLP + tokenization/PETs (DP, FHE, TEE) + consent management + DSAR automation + privacy-by-design.
+
+## Core concepts
+
+### Privacy principles (GDPR Art. 5)
+1. **Lawfulness, fairness, transparency**: every processing activity needs a lawful basis such as consent or legitimate interest.
+2. **Purpose limitation**: no use beyond the purposes stated at collection.
+3. **Data minimization**: collect only the minimum necessary.
+4. **Accuracy**: data subjects can have their data corrected.
+5. **Storage limitation**: enforce a retention schedule.
+6. **Integrity & confidentiality**: encryption.
+7. **Accountability**: DPO, audit, DPIA.
+
+### Privacy-enhancing technologies (PETs), 2026
+- **Pseudonymization**: tokenization, format-preserving encryption (FPE).
+- **Anonymization**: k-anonymity, l-diversity, t-closeness.
+- **Differential privacy (DP)**: calibrated noise under an (ε, δ) budget; deployed by Apple, the US Census Bureau, and Chrome.
+- **Federated learning (FL)**: the model travels, the data stays.
+- **Homomorphic encryption (FHE)**: compute directly on encrypted data; Microsoft SEAL, OpenFHE.
+- **Confidential computing (TEE)**: Intel TDX, AMD SEV-SNP, Apple Private Cloud Compute.
+- **Zero-knowledge proofs**: prove identity without disclosing the underlying data.
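+
+The "model travels, data stays" idea behind federated learning can be illustrated with federated averaging: each client trains locally and shares only its weights, and the server combines them weighted by local sample count. This is a minimal sketch under stated assumptions, not a production protocol: `fed_avg` and the toy client values are hypothetical, and secure aggregation and DP noise are left out.
+
+```python
+import numpy as np
+
+def fed_avg(client_weights, client_sizes):
+    """Average client weight vectors, weighted by each client's sample count."""
+    sizes = np.asarray(client_sizes, dtype=float)
+    stacked = np.stack([np.asarray(w, dtype=float) for w in client_weights])
+    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()
+
+# Two hypothetical clients: raw records never leave them, only weights do.
+global_weights = fed_avg(
+    client_weights=[[1.0, 2.0], [3.0, 4.0]],
+    client_sizes=[100, 300],
+)
+```
+
+Weighting by sample count keeps a small client from pulling the global model as hard as a large one.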
+
+### Applications
+1. Compliance with EU GDPR, Korea's PIPA, and China's PIPL.
+2. Healthcare (HIPAA) and payment (PCI DSS) data.
+3. ML training without raw data (FL, DP).
+4. Cross-border transfer (SCC, BCR, DPF).
+5. Right to be forgotten (RTBF) automation.
+
+## 💻 Patterns
+
+### Data classification + DLP
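+```python
+# PII detection with Microsoft Presidio
+from presidio_analyzer import AnalyzerEngine
+
+analyzer = AnalyzerEngine()
+# KR_RRN (Korean resident registration number) may need a custom recognizer,
+# depending on the Presidio version.
+results = analyzer.analyze(text=user_input, language='en',
+  entities=['EMAIL_ADDRESS','PHONE_NUMBER','CREDIT_CARD','PERSON','KR_RRN'])
+# Redact from the end of the text so earlier offsets stay valid.
+redacted = user_input
+for r in sorted(results, key=lambda r: r.start, reverse=True):
+    redacted = redacted[:r.start] + f'<{r.entity_type}>' + redacted[r.end:]
+```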
+
+### Format-preserving tokenization
+```python
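+# FF3-1 format-preserving encryption: the token keeps the input's format (e.g., a card number)
+# key and tweak are assumed to be hex strings loaded from a secrets manager
+from ff3 import FF3Cipher
+c = FF3Cipher(key, tweak)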
+token = c.encrypt("4242424242424242")  # → 16-digit string
+plain = c.decrypt(token)
+```
+
+### Differential Privacy noise
+```python
+import numpy as np
+def laplace_mechanism(true_val, sensitivity, epsilon):
+    return true_val + np.random.laplace(0, sensitivity / epsilon)
+# Query: count of users in a segment (a counting query has sensitivity 1)
+noisy_count = laplace_mechanism(true_val=1234, sensitivity=1, epsilon=1.0)
+```
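+
+Every noisy release spends privacy budget: under basic sequential composition, the ε values of queries answered on the same data add up. The sketch below tracks that sum and refuses queries once the budget is exhausted. `PrivacyBudget` is a hypothetical helper, not part of any library, and real deployments often use tighter composition accounting.
+
+```python
+class PrivacyBudget:
+    """Tracks cumulative epsilon under basic sequential composition."""
+
+    def __init__(self, total_epsilon: float):
+        self.total = total_epsilon
+        self.spent = 0.0
+
+    def charge(self, epsilon: float) -> None:
+        """Record one query's epsilon, or refuse if it would exceed the total."""
+        if epsilon <= 0:
+            raise ValueError("epsilon must be positive")
+        if self.spent + epsilon > self.total + 1e-12:
+            raise RuntimeError("privacy budget exhausted")
+        self.spent += epsilon
+
+budget = PrivacyBudget(total_epsilon=1.0)
+budget.charge(0.5)  # first query
+budget.charge(0.5)  # second query; the budget is now fully spent
+```
+
+Calling `budget.charge(...)` before each call to `laplace_mechanism` keeps the total ε bounded across queries.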
+
+### k-anonymity check
+```python
+import pandas as pd
+def k_anonymity(df: pd.DataFrame, quasi_ids: list[str]) -> int:
+    return int(df.groupby(quasi_ids).size().min())
+# Ensure k >= 5 before release
+assert k_anonymity(df, ['zip','age','gender']) >= 5
+```
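+
+k-anonymity alone does not protect a group whose members all share the same sensitive value. An l-diversity check in the same style is sketched below. The function name, the column names, and the toy data are hypothetical, and this is the simple "distinct values" variant of l-diversity.
+
+```python
+import pandas as pd
+
+def l_diversity(df: pd.DataFrame, quasi_ids: list[str], sensitive: str) -> int:
+    """Smallest number of distinct sensitive values in any quasi-identifier group."""
+    return int(df.groupby(quasi_ids)[sensitive].nunique().min())
+
+# Toy data: both groups have k = 2, but the second group has a single diagnosis.
+toy = pd.DataFrame({
+    'zip':       ['100', '100', '200', '200'],
+    'age':       [30, 30, 40, 40],
+    'diagnosis': ['flu', 'asthma', 'flu', 'flu'],
+})
+diversity = l_diversity(toy, ['zip', 'age'], 'diagnosis')
+```
+
+Here `diversity` is 1, so anyone known to be in the second group is revealed to have that diagnosis even though k = 2 holds.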
+
+### DSAR (Data Subject Access Request) automation
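+```python
+import json
+from datetime import datetime, timezone
+
+# Assumes `db` is an async MongoDB client (e.g. Motor), `pii_bucket` is an
+# aioboto3 S3 Bucket resource, and elasticsearch_export() is a project helper.
+async def dsar_export(user_id: str) -> bytes:
+    bundle = {
+      'profile': await db.users.find_one({'_id':user_id}),
+      'orders':  [o async for o in db.orders.find({'userId':user_id})],
+      'logs':    await elasticsearch_export(user_id),
+    }
+    return json.dumps(bundle, default=str).encode()
+
+async def dsar_erasure(user_id: str):
+    await db.users.update_one({'_id':user_id},
+        {'$set': {'email':None,'name':None,
+                  'erasedAt':datetime.now(timezone.utc)}})
+    # S3 delete_objects() takes explicit keys, not a Prefix, so delete
+    # through a prefix-filtered object collection instead.
+    await pii_bucket.objects.filter(Prefix=f'users/{user_id}/').delete()
+    # Orders and logs need their own erasure or retention handling.
+```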
+
+### Consent record (Fides/IAB TCF)
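+```typescript
+import { createHmac } from 'node:crypto';
+
+const record = {
+  userId: 'u_123',
+  purposes: { analytics: true, marketing: false, personalization: true },
+  vendors: { google: true },
+  timestamp: new Date().toISOString(),
+  version: 'tcf-2.2',
+};
+// Sign the serialized record so later tampering is detectable.
+// CONSENT_HMAC_KEY is an assumed environment variable.
+const signature = createHmac('sha256', process.env.CONSENT_HMAC_KEY!)
+  .update(JSON.stringify(record))
+  .digest('hex');
+await db.consents.insertOne({ ...record, signature });
+```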
+
+## Decision criteria
+| Situation | Approach |
+|---|---|
+| EU users | GDPR + Schrems II SCC |
+| Korean users | PIPA: privacy policy (개인정보처리방침), consent for outsourced processing |
+| Aggregate analytics | Differential Privacy |
+| Payment data | PCI-DSS tokenization |
+| ML training | Federated learning + DP |
+| Cross-org compute | TEE (Confidential Computing) |
+
+**Default**: minimize + classify + tokenize + consent ledger + DSAR API.
+
+## 🔗 Graph
+- Parent: [[Practical-Cryptography]] · [[Symmetric-Encryption]]
+- Variant: [[Zero-Trust Architecture]]
+- Applications: [[Anomaly-Detection]] · [[Information-Society]]
+- Adjacent: [[Digital Intellectual Property Rights]]
+
+## 🤖 LLM usage
+**When to use**: reviewing privacy policies, drafting DSAR responses, generating PIA questions.
+**When not to use**: sending raw PII to a third-party LLM. Anonymize first.
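+
+One way to keep raw identifiers out of a third-party prompt is to swap them for placeholders before the call and restore them in the response. The sketch below covers email addresses only and uses a simple regex. `pseudonymize` and `restore` are hypothetical helpers, and the approach is reversible pseudonymization rather than true anonymization, so the mapping must stay inside your own trust boundary.
+
+```python
+import re
+
+EMAIL_RE = re.compile(r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}')
+
+def pseudonymize(text: str) -> tuple[str, dict[str, str]]:
+    """Replace each distinct email with a placeholder; return text and mapping."""
+    forward: dict[str, str] = {}
+
+    def repl(match: re.Match) -> str:
+        # Reuse the placeholder if this email was already seen.
+        return forward.setdefault(match.group(0), f'<EMAIL_{len(forward) + 1}>')
+
+    masked = EMAIL_RE.sub(repl, text)
+    return masked, {placeholder: email for email, placeholder in forward.items()}
+
+def restore(text: str, mapping: dict[str, str]) -> str:
+    """Put the original emails back into text returned by the LLM."""
+    for placeholder, email in mapping.items():
+        text = text.replace(placeholder, email)
+    return text
+
+prompt = 'Draft a DSAR reply to alice@example.com, cc bob@example.com and alice@example.com.'
+masked, mapping = pseudonymize(prompt)
+```
+
+Only `masked` would be sent to the external model. Names, phone numbers, and free-text identifiers need their own detectors, for example the Presidio analyzer.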
+
+## ❌ Anti-patterns
+- **Treating a hash as anonymization**: hashing is pseudonymization, so GDPR still applies.
+- **Consent only at entry**: consent is ongoing; it must be withdrawable and granular.
+- **Logging PII**: loggers are a leak source; add a redaction filter.
+- **Retaining data forever**: violates GDPR storage limitation; use TTLs and erasure.
+- **Plaintext backups**: encryption at rest is mandatory.
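+
+The log-redaction point can be sketched with the standard library's `logging.Filter`. This example masks email addresses only. The class name, logger name, and regex are illustrative assumptions, and a real filter would cover more PII types.
+
+```python
+import logging
+import re
+
+EMAIL_RE = re.compile(r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}')
+
+class RedactEmailFilter(logging.Filter):
+    """Masks email addresses in the fully formatted log message."""
+
+    def filter(self, record: logging.LogRecord) -> bool:
+        # Format first so PII passed through %-style args is caught too.
+        record.msg = EMAIL_RE.sub('[REDACTED_EMAIL]', record.getMessage())
+        record.args = None  # the message is already formatted
+        return True  # keep the record, now redacted
+
+logger = logging.getLogger('app.privacy_demo')
+logger.addFilter(RedactEmailFilter())
+```
+
+A filter attached to a logger only sees records logged directly on that logger. Attaching it to the handlers instead covers records propagated from child loggers as well.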
+
+## 🧪 Verification / duplicates
+- Verified: GDPR Art.5/17/25; ISO/IEC 27701; NIST SP 800-188; Microsoft Presidio docs.
+- Trust level: A.
+
+## 🕓 Changelog
+| Date | Change |
+|---|---|
+| 2026-05-08 | Phase 1 |
+| 2026-05-10 | Manual cleanup: principles + PETs + DSAR/DP patterns |