---
id: wiki-2026-0508-ensuring-data-privacy
title: Ensuring Data Privacy
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Data Privacy, Privacy Engineering, GDPR Compliance]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [privacy, gdpr, security, compliance]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: applied
tech_stack:
  language: Python/TypeScript
  framework: OneTrust/Fides/OPA
---

# Ensuring Data Privacy

## In One Line

> **"All personal data is handled with a lawful basis, kept to the minimum, and limited to its stated purpose."** Data privacy engineering implements the legal requirements of GDPR/CCPA/LGPD/K-PIPA as deterministic controls at every stage of storage, processing, transfer, and retention. 2026 stack: classification + DLP + tokenization/PETs (DP, FHE, TEE) + consent management + DSAR automation + privacy-by-design.

## Core Concepts

### Privacy Principles (GDPR Art. 5)

1. **Lawfulness, fairness, transparency**: processing needs a lawful basis such as consent or legitimate interest.
2. **Purpose limitation**: no use beyond the purpose the data was collected for.
3. **Data minimization**: collect only the minimum necessary.
4. **Accuracy**: data must be correctable.
5. **Storage limitation**: enforce a retention schedule.
6. **Integrity & confidentiality**: encryption.
7. **Accountability**: DPO, audit, DPIA.

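Purpose limitation, minimization, and storage limitation can be expressed as deterministic checks. The following is a minimal sketch, not a definitive implementation: the `POLICY` table, its purposes, field names, and retention periods are all hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy table: purpose -> (allowed fields, retention period).
POLICY = {
    "billing": ({"name", "email", "card_token"}, timedelta(days=365 * 5)),
    "analytics": ({"country", "plan"}, timedelta(days=90)),
}

def allowed_fields(record: dict, purpose: str) -> dict:
    """Purpose limitation + minimization: keep only fields declared for this purpose."""
    if purpose not in POLICY:
        raise PermissionError(f"no declared purpose: {purpose}")
    fields, _ = POLICY[purpose]
    return {k: v for k, v in record.items() if k in fields}

def is_expired(collected_at: datetime, purpose: str, now: datetime) -> bool:
    """Storage limitation: True once the retention period for the purpose has passed."""
    _, retention = POLICY[purpose]
    return now - collected_at > retention

record = {"name": "A", "email": "a@example.com", "country": "KR", "plan": "pro"}
analytics_view = allowed_fields(record, "analytics")  # only country and plan survive
```

Keeping the policy in one table means an undeclared purpose fails loudly instead of silently reading every field.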
### Privacy-Enhancing Technologies (PETs), 2026

- **Pseudonymization**: tokenization, format-preserving encryption (FPE).
- **Anonymization**: k-anonymity, l-diversity, t-closeness.
- **Differential Privacy**: calibrated noise under an (ε, δ) budget; deployed by Apple, the US Census Bureau, and Chrome.
- **Federated learning**: the model travels, the data stays.
- **Homomorphic encryption (FHE)**: compute on encrypted data; Microsoft SEAL, OpenFHE.
- **Confidential computing (TEE)**: Intel TDX, AMD SEV-SNP, Apple Private Cloud Compute.
- **Zero-Knowledge Proofs**: prove an identity claim without disclosing the underlying data.

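As a small illustration of pseudonymization from the list above, a keyed hash (HMAC-SHA256) turns an identifier into a stable token. This is a sketch under stated assumptions: the `pseudonymize` helper and the demo key are hypothetical, and a real key would come from a KMS or secret store. The output is still personal data, because anyone holding the key can recompute and link tokens.

```python
import hashlib
import hmac

def pseudonymize(value: str, key: bytes) -> str:
    """Deterministic keyed token: the same input and key always give the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key for illustration only.
key = b"demo-key-not-for-production"

t1 = pseudonymize("alice@example.com", key)
t2 = pseudonymize("alice@example.com", key)
t3 = pseudonymize("alice@example.com", b"another-key")

print(t1 == t2)  # True: stable token, so joins across tables still work
print(t1 == t3)  # False: rotating the key unlinks old tokens from new ones
```

A keyed construction is used instead of a bare hash because low-entropy inputs such as emails or phone numbers can be re-identified from an unkeyed hash by hashing candidate values.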
### Applications

1. EU GDPR + Korean PIPA + Chinese PIPL compliance.
2. Healthcare (HIPAA) and payments (PCI-DSS).
3. ML training without raw data (FL, DP).
4. Cross-border transfer (SCC, BCR, DPF).
5. Right to be forgotten (RTBF) automation.

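## 💻 Patterns

### Data classification + DLP

```python
# PII detection with Microsoft Presidio
from presidio_analyzer import AnalyzerEngine

analyzer = AnalyzerEngine()
# KR_RRN assumes a recognizer for Korean resident registration numbers is registered.
results = analyzer.analyze(
    text=user_input, language='en',
    entities=['EMAIL_ADDRESS', 'PHONE_NUMBER', 'CREDIT_CARD', 'PERSON', 'KR_RRN'])
# Mask from the end of the string so earlier offsets stay valid.
redacted = user_input
for r in sorted(results, key=lambda r: r.start, reverse=True):
    redacted = redacted[:r.start] + f'<{r.entity_type}>' + redacted[r.end:]
```
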
### Format-preserving tokenization

```python
# FF3-1 format-preserving encryption: the token keeps the input's length and alphabet (e.g., a card number)
from ff3 import FF3Cipher

# key and tweak are hex strings supplied by your key management (not defined here).
c = FF3Cipher(key, tweak)
token = c.encrypt("4242424242424242")  # 16-digit string
plain = c.decrypt(token)  # original card number
```

### Differential Privacy noise

```python
import numpy as np

def laplace_mechanism(true_val, sensitivity, epsilon):
    # Laplace noise with scale b = sensitivity / epsilon gives epsilon-DP.
    return true_val + np.random.laplace(0, sensitivity / epsilon)

# Query: count of users in a segment (adding or removing one user changes it by at most 1)
noisy_count = laplace_mechanism(true_val=1234, sensitivity=1, epsilon=1.0)
```

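Each noisy query spends part of the privacy budget, and under basic sequential composition the epsilons of successive queries add up. The sketch below tracks that total. `PrivacyBudget` and `laplace_scale` are hypothetical helper names, and tighter accounting methods exist that this does not model.

```python
class PrivacyBudget:
    """Tracks epsilon spent under basic sequential composition (epsilons add up)."""

    def __init__(self, total_epsilon: float):
        self.total = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon: float) -> None:
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Small tolerance so floating-point rounding does not reject an exact fit.
        if self.spent + epsilon > self.total + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon

def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Scale b of the Laplace noise; smaller epsilon means larger noise."""
    return sensitivity / epsilon

budget = PrivacyBudget(total_epsilon=1.0)
budget.spend(0.5)   # first query
budget.spend(0.5)   # second query; the budget is now fully used
```

Calling `budget.spend(...)` before every release makes the point at which further queries must be refused explicit.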
### k-anonymity check

```python
import pandas as pd

def k_anonymity(df: pd.DataFrame, quasi_ids: list[str]) -> int:
    # k is the size of the smallest group sharing the same quasi-identifier values.
    return int(df.groupby(quasi_ids).size().min())

# Ensure k >= 5 before release
assert k_anonymity(df, ['zip', 'age', 'gender']) >= 5
```

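k-anonymity alone does not protect a group whose members all share the same sensitive value. An l-diversity check can be sketched the same way: the function below uses only the standard library, and the rows and column names are made-up illustration data.

```python
from collections import defaultdict

def l_diversity(rows: list[dict], quasi_ids: list[str], sensitive: str) -> int:
    """Smallest number of distinct sensitive values found in any quasi-identifier group."""
    groups = defaultdict(set)
    for row in rows:
        groups[tuple(row[q] for q in quasi_ids)].add(row[sensitive])
    return min(len(values) for values in groups.values())

# Illustrative data: both groups have k = 2, but the second holds a single diagnosis.
rows = [
    {"zip": "04524", "age": "30-39", "diagnosis": "flu"},
    {"zip": "04524", "age": "30-39", "diagnosis": "asthma"},
    {"zip": "06236", "age": "40-49", "diagnosis": "flu"},
    {"zip": "06236", "age": "40-49", "diagnosis": "flu"},
]

print(l_diversity(rows, ["zip", "age"], "diagnosis"))  # 1
```

A result of 1 means that knowing someone's zip and age band is enough to learn their diagnosis, even though the table is 2-anonymous.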
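### DSAR (Data Subject Access Request) automation

```python
import json
from datetime import datetime, timezone

# db (async MongoDB client), s3 (async S3 client), and elasticsearch_export are assumed to be defined elsewhere.
async def dsar_export(user_id: str) -> bytes:
    bundle = {
        'profile': await db.users.find_one({'_id': user_id}),
        'orders': [o async for o in db.orders.find({'userId': user_id})],
        'logs': await elasticsearch_export(user_id),
    }
    return json.dumps(bundle, default=str).encode()

async def dsar_erasure(user_id: str):
    await db.users.update_one({'_id': user_id},
        {'$set': {'email': None, 'name': None, 'erasedAt': datetime.now(timezone.utc)}})
    # S3 has no delete-by-prefix call: list keys under the prefix (up to 1,000 per call), then batch-delete.
    listing = await s3.list_objects_v2(Bucket='pii', Prefix=f'users/{user_id}/')
    keys = [{'Key': obj['Key']} for obj in listing.get('Contents', [])]
    if keys:
        await s3.delete_objects(Bucket='pii', Delete={'Objects': keys})
```
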
### Consent record (Fides/IAB TCF)

```typescript
// hmac() and db are assumed helpers; sign the record first, then store it with its signature.
const record = {
  userId: 'u_123',
  purposes: { analytics: true, marketing: false, personalization: true },
  vendors: { google: true },
  timestamp: new Date().toISOString(),
  version: 'tcf-2.2',
};
const consent = { ...record, signature: hmac(JSON.stringify(record)) };
await db.consents.insertOne(consent);
```

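The signature on a consent record only helps if it can be recomputed later. The Python sketch below shows one way to sign and verify such a record with HMAC-SHA256 over a canonical JSON encoding. `sign_record`, `verify_record`, and the demo key are hypothetical and are not part of Fides or the IAB TCF specification.

```python
import hashlib
import hmac
import json

def sign_record(record: dict, key: bytes) -> str:
    """HMAC-SHA256 over canonical JSON (sorted keys, no extra whitespace)."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify_record(signed: dict, key: bytes) -> bool:
    """Recompute the signature over every field except 'signature'; compare in constant time."""
    record = {k: v for k, v in signed.items() if k != "signature"}
    return hmac.compare_digest(sign_record(record, key), signed.get("signature", ""))

# Hypothetical key and record for illustration.
key = b"demo-key-not-for-production"
record = {
    "userId": "u_123",
    "purposes": {"analytics": True, "marketing": False, "personalization": True},
    "version": "tcf-2.2",
}
signed = {**record, "signature": sign_record(record, key)}
print(verify_record(signed, key))  # True
```

Sorting keys before signing matters because two JSON encodings of the same record with different key order would otherwise produce different signatures.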
## Decision Criteria

| Situation | Approach |
|---|---|
| EU users | GDPR + Schrems II SCC |
| Korean users | PIPA: privacy policy, consent to outsourced processing |
| Aggregate analytics | Differential Privacy |
| Payment data | PCI-DSS tokenization |
| ML training | Federated learning + DP |
| Cross-org compute | TEE (Confidential Computing) |

**Default**: minimize + classify + tokenize + consent ledger + DSAR API.

## 🔗 Graph

- Parent: [[Practical-Cryptography]] · [[보안_및_시스템_신뢰성_표준|Symmetric-Encryption]]
- Variant: [[보안_및_시스템_신뢰성_표준|Zero-Trust Architecture]]
- Application: [[Anomaly-Detection]] · [[Information-Society]]
- Adjacent: [[Digital Intellectual Property Rights]]

## 🤖 LLM Usage

**When**: reviewing privacy policies, drafting DSAR responses, generating PIA questions.
**When not**: sending raw PII to a third-party LLM. Anonymize first.

## ❌ Anti-patterns

- **Treating a hash as anonymized**: hashing is pseudonymization, so GDPR still applies.
- **Consent at entry only**: consent is ongoing; it must be withdrawable and granular.
- **Logging PII**: loggers are a leak source; add a redaction filter.
- **Retention forever**: violates GDPR storage limitation; apply TTLs and erasure.
- **Plaintext backups**: encryption at rest is mandatory.

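The redaction filter mentioned for the logging anti-pattern can be sketched with the standard `logging` module. This is a minimal example that assumes emails are the only PII of concern; `RedactEmailFilter` and the regular expression are illustrative, and a production filter would cover more entity types.

```python
import logging
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

class RedactEmailFilter(logging.Filter):
    """Replaces email addresses in the formatted message before handlers emit it."""

    def filter(self, record: logging.LogRecord) -> bool:
        # getMessage() merges msg with args, so PII passed as an argument is caught too.
        record.msg = EMAIL_RE.sub("<EMAIL>", record.getMessage())
        record.args = ()  # the message is already formatted
        return True

logger = logging.getLogger("app")
logger.addFilter(RedactEmailFilter())
logger.warning("login failed for %s", "alice@example.com")  # logged with <EMAIL> in place of the address
```

A filter attached to a logger applies only to records logged directly on that logger; attaching the same filter to each handler also covers records that propagate from child loggers.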
## 🧪 Verification / Duplicates

- Verified: GDPR Art. 5/17/25; ISO/IEC 27701; NIST SP 800-188; Microsoft Presidio docs.
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: principles + PETs + DSAR/DP patterns |