---
id: wiki-2026-0508-ai-accountability
title: AI Accountability
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [AI 책임론, algorithmic accountability, responsibility gap, XAI, model card, audit trail]
duplicate_of: none
source_trust_level: B
confidence_score: 0.85
verification_status: conceptual
tags: [ai-ethics, accountability, transparency, xai, audit, governance, model-card, redress]
raw_sources: []
last_reinforced: 2026-05-09
github_commit: pending
inferred_by: Claude Opus 4.7 (manual cleanup 2026-05-09)
tech_stack:
  language: process / engineering
  applicable_to: [Compliance, Engineering, Legal, Product]
---

# AI Accountability

## 📌 One-Line Insight (The Karpathy Summary)

> **"Whose fault is it?"** When an AI system causes harm, accountability is the chain of responsibility across each actor (developer, deployer, user). It rests on 3 pillars: **Transparency + Auditability + Redress**. The EU AI Act makes these mandatory for high-risk systems.

## 📖 Structured Knowledge (Synthesized Content)

### Responsibility gap

As AI autonomy increases, traditional liability becomes hard to apply:

- Developer: "I only built the algorithm; I don't control the output."
- Deployer: "I just used it."
- User: "I trusted the model."
- Vendor: "The ToS has a disclaimer."

→ Every actor points at another, and the victim is left without redress.

### 3 Pillars of Accountability

#### 1. Transparency (XAI - Explainable AI)

- Disclose the reasoning behind each decision.
- Show each feature's contribution.
- Document each model's training data and architecture.

Methods:

- **SHAP / LIME**: contribution of each input feature.
- **Attention visualization**: weight of each token / pixel.
- **Counterfactual**: "if this feature were different, the result would differ."
- **Concept activation**: detection of high-level concepts.

→ Lets users challenge or appeal a decision.

#### 2. Auditability

- Every model version is reproducible.
- Training data has documented provenance.
- Every decision is logged.
- Third parties (regulator, court) can inspect the system.

Elements:

- **Model card** (Mitchell et al. 2019): each model's spec and limits.
- **Data sheet** (Gebru et al. 2018): description of the training data.
- **Audit log**: record of every production decision.
- **Version control**: git-like versioning of model + data.

#### 3. Redress

- A review process for wrong decisions.
- A compensation path for victims.
- A fix for systemic problems.

Elements:

- **Right to explanation** (GDPR Article 22).
- **Human review** (high-stakes decisions).
- **Appeal channel**.
- **Class action / regulatory complaint**.

### Liability framework (legal)

#### Strict liability (manufacturer)

- Modeled on defective-product liability.
- The user does not have to prove fault.
- Direction pushed in the EU (revised Product Liability Directive; the AI Liability Directive proposal addressed fault-based claims).

#### Fault-based

- Requires proving an actor's negligence.
- Hard in practice because the algorithm is a black box.

#### Insurance

- Mandatory insurance for deployers (as with autonomous vehicles).

→ Each jurisdiction follows a different model.

### EU AI Act obligations for high-risk systems

1. **Risk management system**: continuous.
2. **Data governance**: quality, bias checks.
3. **Technical documentation**: details of each system.
4. **Record keeping**: audit log.
5. **Transparency**: disclosure to users.
6. **Human oversight**: human review must be possible for each decision.
7. **Accuracy + robustness + cybersecurity**.

→ The compliance burden for a high-risk system is large.

### Industry-specific requirements

#### Medical AI (FDA)

- Clinical validation of each model.
- Adverse event reporting.
- "Predetermined Change Control Plan" (PCCP).
- Software as a Medical Device (SaMD).

#### Autonomous vehicle

- Black-box (event recorder) record of each incident.
- DDT (Dynamic Driving Task) responsibility.
- Driver vs. system responsibility by SAE level.

#### Hiring / HR

- Bias audit under NYC Local Law 144 (2023+).
- Disparate impact analysis.
- Candidate notification.

#### Credit / lending

- Adverse action notice (ECOA).
- Disparate impact (CFPB).
- Explainability requirement.
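
The disparate impact analysis listed for hiring and lending is commonly reported as an impact ratio: each group's selection rate divided by the rate of the most-selected group. The sketch below is a minimal illustration, not a compliance tool. The function names, the sample records, and the 0.8 cutoff (the conventional "four-fifths" rule of thumb) are assumptions added here and do not come from the note itself.

```python
from collections import Counter


def impact_ratios(records):
    """Selection rate of each group divided by the highest group's rate.

    records: iterable of (group, selected) pairs, where selected is a bool.
    """
    totals, selected = Counter(), Counter()
    for group, was_selected in records:
        totals[group] += 1
        selected[group] += int(was_selected)
    rates = {g: selected[g] / totals[g] for g in totals}
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group has any selections; impact ratio is undefined")
    return {g: rate / best for g, rate in rates.items()}


def flag_groups(ratios, threshold=0.8):
    """Groups whose impact ratio falls below the threshold (assumed rule of thumb)."""
    return sorted(g for g, r in ratios.items() if r < threshold)


# Hypothetical screening outcomes: group A 50/100 selected, group B 30/100.
records = (
    [("A", True)] * 50 + [("A", False)] * 50
    + [("B", True)] * 30 + [("B", False)] * 70
)
ratios = impact_ratios(records)
print(ratios)
print(flag_groups(ratios))
```

A flagged group is a prompt for human investigation, not a finding in itself; small groups produce noisy rates, so a real audit would also report sample sizes.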
### Model card example

```yaml
# model_card.yaml
model_name: ChurnPredictor
version: v3.1
created: 2026-05-09
owner: data-team@company.com

intended_use: |
  Predict customer churn for SaaS billing dashboard.
  Input: 23 user activity features.
  Output: probability 0-1.

intended_users: |
  Customer success team (review + outreach).

out_of_scope:
  - Automatic cancellation.
  - Pricing decisions.

training_data:
  source: 2025-01-01 to 2026-04-30 production users.
  size: 1.2M users.
  potential_bias: |
    - Geographic: 80% US users.
    - Industry: SaaS only.

performance:
  accuracy: 0.87
  auc: 0.91
  f1: 0.83
  per_subgroup:
    - { group: 'US', acc: 0.88 }
    - { group: 'EU', acc: 0.83 }    # disparity
    - { group: 'APAC', acc: 0.79 }  # warning

limitations:
  - Lower accuracy on cold start (users under 30 days).
  - Class imbalance (10% positive).
  - Covers cohorts only through 2026; drift expected.

ethical_considerations:
  - Every prediction goes through customer success review.
  - False positive cost = unnecessary outreach.
  - False negative cost = missed retention.

review_cycle: quarterly
```

→ A single document holding each model's spec.

## 💻 Patterns (Code + Process)

### Audit log

```ts
// db, sha256, summarize, and User are app-level helpers/types assumed to exist.
async function logAIDecision(input: any, output: any, model: string, user: User) {
  await db.aiDecisionLog.insert({
    timestamp: new Date(),
    modelVersion: model,
    inputHash: sha256(JSON.stringify(input)), // fingerprint only, not the raw input
    inputSummary: summarize(input), // PII-stripped
    output,
    userId: user.id,
    confidence: output.confidence,
    reasoning: output.explanation, // SHAP / LIME
  });
}
// Retention: 7 years (regulation-friendly).
```

### XAI (SHAP)

```python
import shap

# Tree-based model. `model`, `X_test`, and `feature_names` come from the training code.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Feature contributions for a single prediction.
# Use X_test.iloc[0] instead if X_test is a pandas DataFrame.
shap.force_plot(explainer.expected_value, shap_values[0], X_test[0])

# Answers each user's "why".
def explain(prediction):
    # prediction.id is assumed to be the row index of the instance in X_test.
    contributions = dict(zip(feature_names, shap_values[prediction.id]))
    top_features = sorted(contributions.items(), key=lambda x: -abs(x[1]))[:5]
    return f"Top factors: {top_features}"
```

### Counterfactual explanation

```python
import dice_ml

def counterfactual(dice_data, dice_model, instance, target_class):
    # "What changes flip the prediction?"
    # dice_data / dice_model: dice_ml.Data and dice_ml.Model wrappers around the
    # training DataFrame and the trained model; instance: one-row DataFrame.
    explainer = dice_ml.Dice(dice_data, dice_model)
    cf = explainer.generate_counterfactuals(
        instance, total_CFs=3, desired_class=target_class
    )
    return cf.cf_examples_list[0].final_cfs_df
```

→ Actionable feedback for each user.

### Bias audit

```python
from collections import defaultdict

def fairness_audit(model, dataset, protected_attribute='gender'):
    # dataset yields (x, y_true, group); group is the value of protected_attribute.
    # Labels and predictions are assumed to be 0/1 scalars.
    results = defaultdict(list)
    for x, y_true, group in dataset:
        y_pred = model.predict(x)
        results[group].append((y_true, y_pred))

    metrics = {}
    for group, data in results.items():
        accuracy = sum(t == p for t, p in data) / len(data)
        positive_rate = sum(p for _, p in data) / len(data)
        metrics[group] = {'accuracy': accuracy, 'positive_rate': positive_rate}

    # Disparity
    accuracies = [m['accuracy'] for m in metrics.values()]
    disparity = max(accuracies) - min(accuracies)
    if disparity > 0.05:
        # alert() is an app-specific notification hook.
        alert(f'Bias detected: {disparity:.2%} disparity across groups')
    return metrics
```

### Right to explanation (GDPR)

```python
from flask import abort

# app, db, and current_user come from the surrounding Flask application.
@app.route('/api/decisions/<id>/explain', methods=['GET'])
def explain_decision(id):
    decision = db.aiDecisionLog.find(id)
    if decision is None:
        abort(404)
    # Verify user access
    if decision.user_id != current_user.id:
        abort(403)
    return {
        'decision': decision.output.value,
        'date': decision.timestamp,
        'reasoning': decision.reasoning,  # SHAP-based
        'top_factors': decision.top_features,
        'how_to_appeal': '/appeal',
        'human_review_available': True,
    }
```

### Appeal workflow

```ts
const SLA_DAYS = 30;
const DAY_MS = 86_400_000;

class AppealWorkflow {
  async submit(userId: string, decisionId: string, reason: string) {
    const createdAt = new Date();
    const appeal = await db.appeals.insert({
      userId,
      decisionId,
      reason,
      status: 'pending',
      createdAt,
      // SLA: 30 days (GDPR). A scheduled job should call escalate() on overdue appeals:
      // a 30-day setTimeout exceeds the timer's 32-bit limit (about 24.8 days)
      // and would be lost on process restart.
      dueAt: new Date(createdAt.getTime() + SLA_DAYS * DAY_MS),
    });

    // Auto-route to human reviewer
    const reviewer = pickReviewer(decisionId);
    await assign(reviewer, appeal.id);
    return appeal;
  }
}
```

### Model versioning + reproducibility

```bash
# DVC + MLflow
dvc add data/train.parquet
git add data/train.parquet.dvc data/.gitignore
git commit -m 'data v1.2'

# Train
mlflow run . -P epochs=10
# → Each run gets a unique ID, params, metrics, artifacts.

# Reproduce: restore the code and data of commit $SHA, then rerun
git checkout "$SHA"
dvc checkout
mlflow run . -P epochs=10
```

→ Every production model is reproducible.

### Model card generation

```python
# model_card_toolkit (Google)
import model_card_toolkit as mctk

mct = mctk.ModelCardToolkit()
model_card = mct.scaffold_assets()
model_card.model_details.name = 'ChurnPredictor'
model_card.model_details.overview = '...'
model_card.considerations.ethical_considerations = [...]
mct.update_model_card(model_card)
mct.export_format()  # HTML, JSON
```

### Continuous monitoring (drift / fairness)

```python
from random import random

# trace, log, check_fairness_window, and MODEL_V are app-level names assumed to exist.
@trace
def predict(features):
    pred = model.predict(features)
    # Log for audit
    log({'features': features, 'pred': pred, 'model_version': MODEL_V})
    # Real-time fairness check (sampled on about 1% of calls)
    if random() < 0.01:
        check_fairness_window()  # last 1000 predictions, per hour
    return pred
```

## 🤔 Decision Criteria

| Risk level | Accountability requirement |
|---|---|
| Low (spam filter) | Audit log + version |
| Medium (content moderation) | + Transparency + appeal |
| High (HR, medical, finance) | + Bias audit + human review + redress |
| Critical (autonomous vehicle, life-support) | + Black box (event recorder) + insurance + regulator approval |

**Default**: audit log + model card for every production AI system. Map each high-risk system to the EU AI Act obligations.

## ⚠️ Contradictions & Updates

- **Black box paradox**: the explainability of deep models is inherently limited. SHAP is an approximation.
- **Trade-off**: an explainable model may perform worse (linear vs. deep). This is a dilemma in high-stakes settings.
- **Push for strict liability**: as jurisdictions adopt strict liability, developer costs rise.
  This raises concern about a chilling effect on innovation.
- **Cost of model audits**: auditing each model is expensive. Open standards are emerging.
- **Cross-border**: every country regulates differently. AI is global, so the rules are fragmented.

## 🔗 Knowledge Links (Graph)

- Parent: [[AI-Ethics]] · [[AI 거버넌스 정책(AI Usage Policy)|AI-Governance-Policy]] · [[Algorithmic Fairness]]
- Variant: [[Explainable-AI-XAI]] · [[Model-Card]]
- Technique: [[LIME]]
- Adjacent: [[Responsibility-Gap]] · [[Human-in-the-Loop]]
- Application: [[ML Monitoring — drift / quality / SLO]]

## 🤖 LLM Usage Hints (How to Use This Knowledge)

**When to use this knowledge:**

- Deployment review of any production AI.
- Post-mortem of an incident.
- Transparency design for customer-facing AI.
- Human review workflow for high-stakes decisions (loan, hire, medical).
- Preparation for a regulatory audit.

**When not to use it:**

- Specific legal advice (ask a lawyer).
- Implementing country-specific regulation (ask local counsel).
- Immediate crisis response (incident team).
- Research models (no production use).

## ❌ Anti-Patterns

- **No audit log**: no way to find an incident's root cause.
- **No model card**: the model is a mystery to future maintainers.
- **No bias audit**: silent disparity.
- **No appeal channel**: users are left helpless.
- **Black box + production**: no trust from regulators or users.
- **One-time audit, then forget**: every release needs an audit.
- **No version control of model**: no reproducibility.
- **Ignoring the right to explanation**: GDPR violation.

## 🧪 Validation

- **Information status:** verified (concept-level).
- **Source reliability:** B (NIST AI RMF, EU AI Act, ACM FAccT papers, Microsoft AETHER guidelines, Google PAIR).
- **Review reason:** Manual cleanup. Active research / regulation. Review every 6 months.

## 🧬 Duplicate Check

- **Existing similar documents:** [[AI 거버넌스 정책(AI Usage Policy)|AI-Governance-Policy]] (related), [[AI-Ethics]] (parent), [[Explainable-AI-XAI]] (subset).
- **Handling:** KEEP (focused on accountability mechanism).
- **Handling reason:** Accountability is a distinct discipline (legal + technical + ethical).

## 🕓 Changelog

| Date | Change | Handling | Reliability |
|------|--------|----------|-------------|
| 2026-05-08 | P-Reinforce Phase 1 normalization | UPDATE | A |
| 2026-05-09 | Manual cleanup: code patterns + 3 pillars + industry-specific requirements + anti-patterns added | UPDATE | B |