---
id: wiki-2026-0508-ieee-p36521
title: IEEE P3652.1 (Federated ML Standard)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [IEEE 3652.1-2020, IEEE Federated ML Standard, P3652.1]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [federated-learning, ieee, standards, privacy, ml]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: flower-pysyft
---

# IEEE P3652.1 (Federated ML Standard)

## One-line summary

> **"The first official standard for federated learning: it defines who shares what with whom, and how."**

IEEE 3652.1-2020 (Guide for Architectural Framework and Application of Federated Machine Learning) was driven by WeBank, Tencent, Microsoft, and others. It is the first official document to standardize the classification into horizontal, vertical, and federated transfer learning, the participant roles, and the security requirements. It now serves as a reference for cross-silo learning under GDPR, HIPAA, and financial-sector regulation.

## Core concepts

### Three federated learning categories (3652.1)

- **Horizontal FL (HFL)**: same feature space, different samples (e.g. several hospitals holding patient data with identical fields).
- **Vertical FL (VFL)**: partially overlapping samples, different features (e.g. a bank and an e-commerce company holding different attributes of their shared customers).
- **Federated Transfer Learning (FTL)**: features and samples both overlap only partially, so FL is combined with transfer learning.

### Participant roles

- **Data Owner / Client**: holds local data and runs local training.
- **Coordinator / Aggregator**: aggregates model parameters or gradients (FedAvg and similar).
- **Auditor**: verifies privacy and compliance.
- **Model Consumer**: end user of the final model.

### Security / privacy requirements

- Secure aggregation (cryptographic) is recommended.
- Differential privacy is optional.
- Communication channels are encrypted (TLS 1.3+).
- Model inversion and membership inference risks are assessed.
- Audit log + reproducibility.

### Applications

1. Cross-hospital medical imaging models (HFL).
2. Bank + telco credit scoring (VFL).
3. Mobile keyboard next-word prediction (Gboard, HFL on-device).
4. Ad conversion modeling (clean room + FL).

## 💻 Patterns

### 1. Flower (FL framework): HFL client

```python
import flwr as fl
import torch

# train_one_epoch and eval_model are user-supplied helpers (not shown here).

class HospitalClient(fl.client.NumPyClient):
    def __init__(self, model, train_loader, val_loader):
        self.model, self.train, self.val = model, train_loader, val_loader

    def get_parameters(self, config):
        # Export weights as a list of NumPy arrays, in state_dict order.
        return [v.cpu().numpy() for v in self.model.state_dict().values()]

    def set_parameters(self, params):
        # Rebuild the state_dict from the arrays sent by the aggregator.
        sd = {k: torch.tensor(v) for k, v in zip(self.model.state_dict(), params)}
        self.model.load_state_dict(sd, strict=True)

    def fit(self, params, config):
        self.set_parameters(params)
        train_one_epoch(self.model, self.train)
        return self.get_parameters({}), len(self.train.dataset), {}

    def evaluate(self, params, config):
        self.set_parameters(params)
        loss, acc = eval_model(self.model, self.val)
        return float(loss), len(self.val.dataset), {"acc": float(acc)}

# Note: start_numpy_client is deprecated in newer Flower 1.x releases, where
# fl.client.start_client(..., client=HospitalClient(...).to_client()) replaces it.
fl.client.start_numpy_client(server_address="agg.example:8443", client=HospitalClient(...))
```

### 2. FedAvg aggregator (Flower server)

```python
import flwr as fl

# Require all five sites to be connected and to take part in every round.
strategy = fl.server.strategy.FedAvg(
    min_fit_clients=5,
    min_available_clients=5,
    fraction_fit=1.0,
    fraction_evaluate=1.0,
)

fl.server.start_server(
    server_address="0.0.0.0:8443",
    config=fl.server.ServerConfig(num_rounds=20),
    strategy=strategy,
)
```

### 3. Secure aggregation (PySyft / Flower SecAgg+)

```python
# SecAggPlusWorkflow lives in flwr.server.workflow, not flwr.common.
from flwr.server.workflow import SecAggPlusWorkflow

workflow = SecAggPlusWorkflow(
    num_shares=3,
    reconstruction_threshold=2,
    max_weight=16384,
)
# In Flower this is typically passed as fit_workflow to DefaultWorkflow,
# with secaggplus_mod (flwr.client.mod) enabled on the client side.
# Clients mask their updates before sending; the server can recover
# only the sum, never an individual update.
```

### 4. Differential Privacy (Opacus)

```python
from opacus import PrivacyEngine

# model, optimizer, and train_loader are the usual PyTorch objects, defined elsewhere.
engine = PrivacyEngine()
model, optimizer, train_loader = engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    target_epsilon=3.0,
    target_delta=1e-5,
    epochs=10,
    max_grad_norm=1.0,
)
```

### 5. VFL: split learning skeleton (PyTorch)

```python
import torch
from torch import nn

# Skeleton only: __init__ methods defining self.net / self.head are omitted.

# Bank: bottom model on transactions
class BottomBank(nn.Module):
    def forward(self, x):
        return self.net(x)  # -> embed_bank

# Telco: bottom model on call patterns
class BottomTelco(nn.Module): ...  # -> embed_telco

# Aggregator (top): concat + classify
class Top(nn.Module):
    def forward(self, eb, et):
        return self.head(torch.cat([eb, et], dim=-1))

# Training: each client sends only its embedding and receives only
# the gradient with respect to that embedding.
```

### 6. Audit log (recommended by 3652.1 Annex B)

```json
{
  "round": 7,
  "ts": "2026-05-10T09:00:00Z",
  "participants": ["hosp-a", "hosp-b", "hosp-c"],
  "aggregation": "FedAvg",
  "secagg": "SecAgg+",
  "dp": { "epsilon": 3.0, "delta": 1e-5 },
  "model_hash": "sha256:...",
  "signed_by": "ed25519:..."
}
```

### 7. Cross-silo deployment (Kubernetes manifest excerpt)

```yaml
apiVersion: apps/v1
kind: Deployment
# Kubernetes resource names must be lowercase (RFC 1123), hence hosp-a rather than hospA.
metadata: { name: fl-client-hosp-a, namespace: hospital-a }
spec:
  template:
    spec:
      containers:
        - name: client
          image: registry/fl-client:1.4
          env:
            - { name: AGGREGATOR_URL, value: "https://agg.consortium.example:8443" }
            - { name: TLS_CA, valueFrom: { secretKeyRef: { name: ca, key: ca.crt } } }
            - { name: SITE_ID, value: "hosp-a" }
          volumeMounts:
            - { name: data, mountPath: /data, readOnly: true }
```

### 8. Membership inference attack check

```python
import torch

# The attacker tries to infer whether a sample was in the training set,
# using the model's top softmax confidence as the attack score.
def attack_score(model, x):
    with torch.no_grad():
        return model(x).softmax(-1).max().item()

# Compare the score distributions of training vs. non-training samples:
# the closer the attack AUC is to 0.5, the safer the model.
```

### 9. Participant onboarding checklist

```yaml
participant: hospital-c
checklist:
  - data_governance_signed: true
  - dpa_signed: true
  - tls_cert_valid_until: 2027-01-01
  - schema_version: v3
  - feature_alignment_test: passed
  # delta is written with a decimal point so YAML 1.1 parsers read it as a float.
  - privacy_budget_allocated: { epsilon: 5.0, delta: 1.0e-5 }
```

### 10. FATE (WeBank reference impl) job (KubeFATE)

```yaml
# horizontal_lr.yaml
component_parameters:
  common:
    homo_lr_0:
      penalty: L2
      max_iter: 30
      learning_rate: 0.1
  role:
    guest: { "0": { reader_0: { table: { name: hosp_a, namespace: hetero } } } }
    host: { "0": { reader_0: { table: { name: hosp_b, namespace: hetero } } } }
```

## Decision criteria

| Situation | Approach |
|---|---|
| Same schema, multiple sites | HFL (FedAvg) |
| Same users, different features | VFL (split learning) |
| Partial overlap | FTL |
| Tens of thousands of mobile devices | Cross-device FL (Flower / TFF) |
| Cross-organization in a regulated industry | 3652.1 + SecAgg + DP + audit log |

**Default**: for cross-silo (tens of sites), use Flower + SecAgg+ + DP (eps ≤ 8) + a 3652.1 audit log.

## 🔗 Graph

- Parent: [[Federated-Learning]]
- Adjacent: [[Differential-Privacy]] · [[GDPR]]

## 🤖 LLM usage

**When to use**: mapping a use case onto the 3652.1 categories (HFL/VFL/FTL), drafting an audit log schema, building a threat model checklist.

**When not to use**: implementing the actual cryptographic protocol. Use a vetted library (Flower, FATE); do not ship LLM-written crypto.

## ❌ Anti-patterns

- **Sending raw gradients without secagg**: raw data can be reconstructed through gradient inversion.
- **Releasing an over-fitted model without DP**: membership inference risk.
- **Not retaining audit logs**: responsibility cannot be attributed during a regulatory incident.
- **Ignoring schema drift**: a different feature order on each client leads to silent corruption.
- **Not handling client drop-out**: the SecAgg round fails when the surviving clients fall below `reconstruction_threshold`.

## 🧪 Verification / duplicates

- Verified (IEEE 3652.1-2020 official PDF, Flower docs 1.x, FATE docs 2026).
- Trust level A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: 3652.1 categories + Flower/FATE patterns + SecAgg/DP |
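
## Appendix: FedAvg in miniature

The FedAvg rule that this note names for the Coordinator / Aggregator role reduces to an average of client weights, weighted by each client's example count. The sketch below is a minimal pure-Python illustration of that rule, not Flower's or FATE's implementation. The `fedavg` function, the `Weights` alias, and the three-hospital numbers are hypothetical, chosen only for illustration; real strategies operate on NumPy arrays or tensors.

```python
from typing import List, Sequence, Tuple

# Simplification (assumption): each model is a list of layers,
# and each layer is a flat list of floats.
Weights = List[List[float]]


def fedavg(updates: Sequence[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its example count."""
    if not updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total example count must be positive")

    first, _ = updates[0]
    # Start from zeros with the same layer shapes as the first client.
    averaged = [[0.0] * len(layer) for layer in first]
    for weights, n in updates:
        for li, layer in enumerate(weights):
            for wi, w in enumerate(layer):
                averaged[li][wi] += w * n / total
    return averaged


# Hypothetical round: three hospitals with different local dataset sizes.
updates = [
    ([[1.0, 2.0], [0.0]], 100),  # hosp-a
    ([[3.0, 4.0], [1.0]], 300),  # hosp-b
    ([[5.0, 6.0], [2.0]], 600),  # hosp-c
]
global_weights = fedavg(updates)
print(global_weights)
```

Weighting by example count is why the Flower client pattern returns `len(self.train.dataset)` from `fit`: a site with more data pulls the global model further toward its local update.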