Files

T

koriweb d8a80f6272 chore(wiki): dangling 링크 canonical 정규화 (768파일/1200건)

이름만 다른(표기 변형) [[위키링크]]를 대상 문서의 canonical 제목으로 치환해
끊겼던 1,200개 링크를 연결. 제목/파일명 정규화 일치만 적용하고 별칭 매칭은
과병합 위험으로 제외(애매성 가드). 원본은 _link_reconcile_backup/ 에 백업.
도구: Datacollect/scripts/link_reconcile_apply.mjs

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-08 12:24:15 +09:00

6.1 KiB

Raw Blame History

id, title, category, status, canonical_id, aliases, duplicate_of, source_trust_level, confidence_score, verification_status, tags, raw_sources, last_reinforced, github_commit, tech_stack

title

Data Privacy & Local Processing

매 한 줄

"매 data privacy + local processing 의 핵심: data minimization + on-device inference + cryptographic guarantees". 매 GDPR (2018), CCPA, AI Act (2024 EU) 의 regulatory pressure + 매 Apple Intelligence (2024), Google Gemini Nano (2024), 매 on-device LLM (Llama 3.2 1B/3B, Phi-4 mini, Gemma 3 nano) 의 등장 으로 매 2026 현재 cloud → device shift 가 현실화. 매 Local-First Software 운동 의 main-stream 진입.

매 핵심

매 privacy primitives

Data minimization: 매 collect only 필요 — 매 GDPR Art. 5(1)(c).
On-device inference: 매 raw data 의 device 외 미전송.
Differential Privacy (DP): 매 ε-noise — 매 Apple, Google 의 telemetry 사용.
Federated Learning (FL): 매 model 의 device 학습 → gradient aggregate.
Homomorphic Encryption (HE): 매 encrypted compute — 매 latency penalty 큼.
Secure Enclave (TEE): 매 Apple Secure Enclave, Intel SGX, AWS Nitro.
Zero-Knowledge Proof (ZKP): 매 prove without reveal.

매 regulatory landscape (2026)

EU AI Act: 매 high-risk system 의 data governance + transparency.
GDPR: 매 right to erasure, data portability, DPIA.
CCPA / CPRA: 매 California 의 sale opt-out.
HIPAA (US health), PIPEDA (Canada), APPI (Japan), PIPL (China — 매 cross-border data transfer 매우 strict).

매 응용

On-device LLM assistant (Apple Intelligence, Pixel Gemini Nano).
Health apps (HealthKit, on-device biometric ML).
Federated keyboard prediction (Gboard, SwiftKey).
Local-first note apps (Obsidian, Anytype, automerge-based).

💻 패턴

On-device LLM with MLX (Apple Silicon)

from mlx_lm import load, generate
model, tokenizer = load("mlx-community/Llama-3.2-3B-Instruct-4bit")
out = generate(model, tokenizer, prompt="Summarize: ...",
               max_tokens=200, verbose=False)
# Data never leaves device

Differential privacy (Opacus)

from opacus import PrivacyEngine

privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private_with_epsilon(
    module=model, optimizer=optimizer, data_loader=loader,
    epochs=10, target_epsilon=3.0, target_delta=1e-5, max_grad_norm=1.0,
)

Federated learning (Flower)

import flwr as fl

class Client(fl.client.NumPyClient):
    def get_parameters(self, config): return get_weights(model)
    def fit(self, parameters, config):
        set_weights(model, parameters)
        train_local(model, local_data)
        return get_weights(model), len(local_data), {}

fl.client.start_numpy_client(server_address="server:8080", client=Client())

CoreML on-device inference (iOS / macOS)

import CoreML
let model = try MyModel(configuration: MLModelConfiguration())
let input = try MyModelInput(text: userText)
let output = try model.prediction(input: input)
// Inference never sends data to network

Secure Enclave key wrapping (iOS)

let attrs: [String: Any] = [
  kSecAttrKeyType as String: kSecAttrKeyTypeECSECPrimeRandom,
  kSecAttrKeySizeInBits as String: 256,
  kSecAttrTokenID as String: kSecAttrTokenIDSecureEnclave,
]
var error: Unmanaged<CFError>?
let key = SecKeyCreateRandomKey(attrs as CFDictionary, &error)!

Local-first sync (Yjs / Automerge)

import * as Y from "yjs";
import { IndexeddbPersistence } from "y-indexeddb";

const doc = new Y.Doc();
new IndexeddbPersistence("notes", doc);  // Local persistence
// Optional E2E-encrypted relay for sync

Data redaction before LLM API call (defense in depth)

import re
PII = [r"\b\d{3}-\d{2}-\d{4}\b", r"\b[\w.-]+@[\w.-]+\b"]
def redact(text):
    for p in PII:
        text = re.sub(p, "[REDACTED]", text)
    return text
# Use redact() before sending to remote LLM

매 결정 기준

상황	Approach
Health / financial data	On-device only + TEE
Personalized model	Federated learning
Aggregate analytics	Differential privacy
Multi-party compute	HE / MPC (still slow)
Compliance (GDPR / HIPAA)	DPIA + minimization + audit log
Personal AI assistant	Local LLM (Llama 3.2 3B 4-bit on phone)

기본값: 매 user-content processing 의 default 의 on-device, 매 cloud 의 explicit consent + minimization.

🔗 Graph

부모: Privacy
변형: Federated Learning · Differential Privacy · Homomorphic Encryption (HE)
응용: On-device AI
Adjacent: Edge Computing · Practical-Cryptography · GDPR

🤖 LLM 활용

언제: 매 privacy-impact-assessment drafting, 매 redaction-pipeline scaffolding, 매 GDPR/CCPA compliance checklist generation. 언제 X: 매 actual user PII 의 cloud LLM 의 직접 send X — 매 on-device 또는 redact-first.

❌ 안티패턴

Plaintext PII to cloud LLM: 매 GDPR violation potential.
DP without ε accounting: 매 cumulative leakage 의 무인지.
Federated 의 raw gradient leak: 매 gradient inversion attack — 매 secure aggregation 필요.
Local-first 의 backup absent: 매 device loss = data loss.
"Anonymized" via removing names only: 매 quasi-identifier 의 re-identification.
Storing decryption key alongside ciphertext: 매 obvious 하지만 흔한 fail.

🧪 검증 / 중복

Verified (GDPR text, NIST Privacy Framework, Apple Differential Privacy white papers, Flower & Opacus docs, EU AI Act 2024).
신뢰도 A.

🕓 Changelog

날짜	변경
2026-05-08	Phase 1
2026-05-10	Manual cleanup — privacy primitives + on-device LLM 2026

6.1 KiB Raw Blame History