id: wiki-2026-0508-ai-accountability
title: AI Accountability
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: AI 책임론, algorithmic accountability, responsibility gap, XAI, model card, audit trail
duplicate_of: none
source_trust_level: B
confidence_score: 0.85
verification_status: conceptual
tags: ai-ethics, accountability, transparency, xai, audit, governance, model-card, redress
raw_sources:
last_reinforced: 2026-05-09
github_commit: pending
inferred_by: Claude Opus 4.7 (manual cleanup 2026-05-09)
tech_stack:
  language: process / engineering
  applicable_to: Compliance, Engineering, Legal, Product

AI Accountability

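📌 One-Line Insight (The Karpathy Summary)

"Who is at fault?" When an AI system causes harm, accountability means tracing the chain of responsibility across the actors involved (developer, deployer, user). It rests on three pillars: transparency, auditability, and redress. The EU AI Act makes these obligations mandatory for high-risk systems.

📖 Structured Knowledge (Synthesized Content)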

Responsibility gap

As AI autonomy increases, traditional liability becomes harder to assign, because each actor has a ready defense:

  • Developer: "I only built the algorithm; I don't control its output."
  • Deployer: "I just used it."
  • User: "I trusted the model."
  • Vendor: "The ToS disclaimer covers this."

→ The actors point fingers at one another, and the victim is left without redress.

3 Pillars of Accountability

1. Transparency (XAI - Explainable AI)

  • Disclose the reasoning behind each decision.
  • Show each feature's contribution.
  • Document each model's training data and architecture.

Methods:

  • SHAP / LIME: contribution of each input feature.
  • Attention visualization: weight given to each token / pixel.
  • Counterfactual: "if this feature had been different, the result would have changed."
  • Concept activation: detects the high-level concepts a model relies on.

→ Users can then challenge or appeal a decision.
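To make "contribution of each input feature" concrete, here is a minimal sketch for the simplest case, a linear model, where a feature's contribution relative to an average input is weight × (value − mean). It illustrates the idea behind attribution methods under that linearity assumption and is not SHAP or LIME itself; the feature names, weights, and means are hypothetical.

```python
def linear_contributions(weights, baseline_means, instance):
    """Per-feature contribution of a linear model relative to the average input.

    For f(x) = bias + sum(w_i * x_i), the contribution of feature i is
    w_i * (x_i - mean_i); the contributions sum to f(x) - f(mean).
    """
    return {
        name: weights[name] * (instance[name] - baseline_means[name])
        for name in weights
    }


def top_factors(contributions, k=3):
    """Rank features by absolute contribution, largest first."""
    return sorted(contributions.items(), key=lambda kv: -abs(kv[1]))[:k]


# Hypothetical churn-style model with three features.
weights = {"days_inactive": 0.5, "support_tickets": 0.25, "seats": -0.125}
baseline_means = {"days_inactive": 10.0, "support_tickets": 2.0, "seats": 8.0}
instance = {"days_inactive": 30.0, "support_tickets": 2.0, "seats": 4.0}

contributions = linear_contributions(weights, baseline_means, instance)
print(top_factors(contributions, k=2))
# [('days_inactive', 10.0), ('seats', 0.5)]
```

A ranked list like this is the raw material for a user-facing explanation; for non-linear models the contributions have to be estimated rather than read off the weights.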

2. Auditability

  • Every model version is reproducible.
  • The provenance of all training data is known.
  • Every decision is logged.
  • Third parties (regulators, courts) can inspect the system.

Elements:

  • Model card (Mitchell et al. 2019): the model's spec and limits.
  • Datasheet (Gebru et al. 2018): a description of the training data.
  • Audit log: a record of every production decision.
  • Version control: git-like versioning of model + data.
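An audit log is only useful to a third party if it can be shown not to have been edited after the fact. One common way to get that property is to chain entries by hash, sketched below with the standard library only. The class and field names are hypothetical, and a real deployment would also need durable storage and an externally anchored copy of the latest hash (otherwise dropping the newest entries goes unnoticed).

```python
import hashlib
import json


def _entry_hash(prev_hash, record):
    """Hash the previous entry's hash together with this record's canonical JSON."""
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log in which each entry commits to the one before it."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def append(self, record):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        entry = {"record": record, "prev_hash": prev_hash,
                 "hash": _entry_hash(prev_hash, record)}
        self.entries.append(entry)
        return entry

    def verify(self):
        """Return True if every entry matches its hash and links to its predecessor."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != _entry_hash(prev_hash, entry["record"]):
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.append({"model_version": "v3.1", "decision_id": 1, "output": 0.82})
log.append({"model_version": "v3.1", "decision_id": 2, "output": 0.17})
print(log.verify())  # True

log.entries[0]["record"]["output"] = 0.10  # simulate tampering
print(log.verify())  # False
```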

3. Redress

  • A review process for wrong decisions.
  • A compensation path for victims.
  • Fixes for systemic problems.

Elements:

  • Right to explanation (GDPR Article 22 on automated decision-making, read with Recital 71).
  • Human review (high-stakes decisions).
  • Appeal channel.
  • Class action / regulatory complaint.

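Strict liability (manufacturer)

  • Treated like a defective product.
  • The injured party does not have to prove fault.
  • The EU is pushing in this direction: the revised Product Liability Directive (2024) extends no-fault liability to software, including AI systems.

Fault-based

  • The claimant must prove an actor's negligence.
  • Hard in practice, because the algorithm is a black box.
  • The proposed EU AI Liability Directive aimed to ease this burden of proof.

Insurance

  • Mandatory insurance for deployers (as with autonomous vehicles).

→ Each jurisdiction adopts a different model.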

EU AI Act obligations for high-risk systems

  1. Risk management system: continuous.
  2. Data governance: quality and bias checks.
  3. Technical documentation: a detailed description of the system.
  4. Record keeping: audit logs.
  5. Transparency: disclosure to users.
  6. Human oversight: human review must be possible for every decision.
  7. Accuracy + robustness + cybersecurity.

→ The compliance burden for high-risk systems is heavy.
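The seven obligation areas above lend themselves to a simple evidence register: for each area, which artifact demonstrates it? The sketch below is an internal tracking aid under that assumption, not a legal compliance determination; the keys and file paths are hypothetical.

```python
# The seven obligation areas listed above, keyed by a short identifier.
HIGH_RISK_OBLIGATIONS = {
    "risk_management": "Risk management system",
    "data_governance": "Data governance",
    "technical_documentation": "Technical documentation",
    "record_keeping": "Record keeping",
    "transparency": "Transparency",
    "human_oversight": "Human oversight",
    "accuracy_robustness_security": "Accuracy, robustness, cybersecurity",
}


def missing_obligations(evidence):
    """Return the obligation names for which no evidence artifact is recorded."""
    return [
        label
        for key, label in HIGH_RISK_OBLIGATIONS.items()
        if not evidence.get(key)
    ]


# Hypothetical evidence register: obligation key -> list of artifact paths.
evidence = {
    "risk_management": ["docs/risk_register.md"],
    "data_governance": ["docs/datasheet.md"],
    "technical_documentation": ["model_card.yaml"],
    "record_keeping": [],
    "transparency": ["docs/user_notice.md"],
}

print(missing_obligations(evidence))
# ['Record keeping', 'Human oversight', 'Accuracy, robustness, cybersecurity']
```

Running a check like this in a release pipeline turns the obligation list into something a deployment review can fail on.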

Industry-specific requirements

Medical AI (FDA)

  • Clinical validation of each model.
  • Adverse event reporting.
  • "Predetermined Change Control Plan" (PCCP).
  • Software as a Medical Device (SaMD).

Autonomous vehicle

  • A "black box" event recorder captures every incident.
  • DDT (Dynamic Driving Task) responsibility.
  • Driver vs. system responsibility depends on the SAE level.

Hiring / HR

  • Bias audit under NYC Local Law 144 (2023+).
  • Disparate impact analysis.
  • Candidate notification.
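A disparate impact analysis for a hiring tool usually starts from selection rates per group and the ratio of each rate to the highest one (the "impact ratio"). The sketch below computes both. The 0.8 cutoff is the widely used four-fifths rule of thumb, included here as an assumption rather than a threshold taken from any particular law, and the data is made up.

```python
from collections import Counter


def impact_ratios(records):
    """Compute selection rate and impact ratio per group.

    records: iterable of (group, selected) pairs, where selected is True/False.
    Impact ratio = group's selection rate / highest group's selection rate.
    """
    totals, selected = Counter(), Counter()
    for group, was_selected in records:
        totals[group] += 1
        if was_selected:
            selected[group] += 1

    rates = {g: selected[g] / totals[g] for g in totals}
    best = max(rates.values())
    if best == 0:
        raise ValueError("No group has any selections; impact ratio is undefined.")
    return {g: {"selection_rate": r, "impact_ratio": r / best} for g, r in rates.items()}


# Hypothetical screening outcomes: 50 of 100 in group A selected, 30 of 100 in group B.
records = ([("A", True)] * 50 + [("A", False)] * 50
           + [("B", True)] * 30 + [("B", False)] * 70)

result = impact_ratios(records)
flagged = [g for g, m in result.items() if m["impact_ratio"] < 0.8]
print(flagged)  # ['B']
```

Group B's selection rate is 0.3 against 0.5 for group A, an impact ratio of about 0.6, so it falls below the four-fifths line and would warrant a closer look.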

Credit / lending

  • Adverse action notice (ECOA).
  • Disparate impact (CFPB).
  • Explainability requirement.
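An adverse action notice has to tell the applicant the main reasons for the decision, which is where explainability meets lending practice. One possible approach, sketched below, is to take signed per-feature contributions (for example SHAP values) and report the features that pushed the score hardest toward denial. The feature names and reason wording are hypothetical, and the actual notice content is a legal question this sketch does not settle.

```python
# Hypothetical mapping from model features to plain-language reasons.
REASON_TEXT = {
    "debt_to_income": "Debt-to-income ratio too high",
    "recent_delinquencies": "Recent delinquency on an account",
    "credit_history_months": "Length of credit history too short",
    "income": "Income insufficient for amount requested",
}


def principal_reasons(contributions, max_reasons=4):
    """Pick the features that pushed the score hardest toward denial.

    contributions: feature -> signed contribution to the approval score,
    where negative values lowered the score (e.g. SHAP values).
    """
    negative = [(f, c) for f, c in contributions.items() if c < 0]
    negative.sort(key=lambda fc: fc[1])  # most negative first
    return [REASON_TEXT.get(f, f) for f, _ in negative[:max_reasons]]


contributions = {
    "debt_to_income": -0.30,
    "recent_delinquencies": -0.45,
    "credit_history_months": 0.10,
    "income": -0.05,
}
print(principal_reasons(contributions, max_reasons=2))
# ['Recent delinquency on an account', 'Debt-to-income ratio too high']
```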

Model card example

# model_card.yaml
model_name: ChurnPredictor
version: v3.1
created: 2026-05-09
owner: data-team@company.com

intended_use: |
  Predict customer churn for SaaS billing dashboard.
  Input: 23 user activity features.
  Output: probability 0-1.

intended_users: |
  Customer success team (review + outreach).

out_of_scope:
  - Automatic cancellation.
  - Pricing decisions.

training_data:
  source: 2025-01-01 to 2026-04-30 production users.
  size: 1.2M users.
  potential_bias: |
    - Geographic: 80% US users.
    - Industry: SaaS only.

performance:
  accuracy: 0.87
  auc: 0.91
  f1: 0.83
  per_subgroup:
    - { group: 'US', acc: 0.88 }
    - { group: 'EU', acc: 0.83 }   # disparity
    - { group: 'APAC', acc: 0.79 }  # warning

limitations:
  - Lower accuracy for cold-start users (< 30 days).
  - Class imbalance (10% positive).
  - Only the cohort through 2026-04 is covered; drift expected.

ethical_considerations:
  - Every prediction is reviewed by customer success.
  - False positive cost = unnecessary outreach.
  - False negative cost = missed retention.

review_cycle: quarterly

→ A single document holds each model's full spec.
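A model card is more useful when something checks it. The sketch below validates an excerpt of the card above, represented as the dict a YAML parser would produce, and flags subgroups whose accuracy trails the best subgroup by more than a threshold. The required-field list and the 0.05 gap are assumptions chosen for illustration.

```python
# Excerpt of the model card above as a Python dict (what a YAML parser would yield).
model_card = {
    "model_name": "ChurnPredictor",
    "version": "v3.1",
    "performance": {
        "accuracy": 0.87,
        "per_subgroup": [
            {"group": "US", "acc": 0.88},
            {"group": "EU", "acc": 0.83},
            {"group": "APAC", "acc": 0.79},
        ],
    },
    "limitations": ["Class imbalance (10% positive)."],
    "review_cycle": "quarterly",
}

REQUIRED_FIELDS = ["model_name", "version", "performance", "limitations", "review_cycle"]


def check_model_card(card, max_gap=0.05):
    """Return a list of human-readable problems; an empty list means the card passes."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in card]

    subgroups = card.get("performance", {}).get("per_subgroup", [])
    if subgroups:
        best = max(s["acc"] for s in subgroups)
        for s in subgroups:
            # Round to avoid floating-point noise (0.88 - 0.83 is not exactly 0.05).
            gap = round(best - s["acc"], 4)
            if gap > max_gap:
                problems.append(
                    f"subgroup {s['group']}: accuracy gap {gap:.2f} exceeds {max_gap:.2f}"
                )
    return problems


print(check_model_card(model_card))
# ['subgroup APAC: accuracy gap 0.09 exceeds 0.05']
```

With these numbers the EU gap sits exactly at the threshold and passes, while APAC is flagged, matching the "warning" comment in the card.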

💻 Patterns (Code + Process)

Audit log

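import { createHash } from 'node:crypto';

// Assumed to exist elsewhere: db (data-access layer), summarize (PII-stripping summarizer), User type.
const sha256 = (s: string) => createHash('sha256').update(s).digest('hex');

async function logAIDecision(input: unknown, output: any, model: string, user: User) {
  await db.aiDecisionLog.insert({
    timestamp: new Date(),
    modelVersion: model,
    inputHash: sha256(JSON.stringify(input)),  // hash only; the raw input is not stored
    inputSummary: summarize(input),  // PII-stripped
    output,
    userId: user.id,
    confidence: output.confidence,
    reasoning: output.explanation,  // SHAP / LIME
  });
}

// Retention: 7 years (a regulation-friendly default).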

XAI (SHAP)

import shap

# Tree-based model; X_test is assumed to be a pandas DataFrame.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
# Note: for some classifiers shap_values holds one array per class; select the class of interest first.

# Feature contributions for a single prediction.
shap.force_plot(explainer.expected_value, shap_values[0], X_test.iloc[0])

# Answers the user's "why?" for the row at row_index.
def explain(row_index, top_k=5):
    contributions = dict(zip(X_test.columns, shap_values[row_index]))
    top_features = sorted(contributions.items(), key=lambda x: -abs(x[1]))[:top_k]
    return f"Top factors: {top_features}"

Counterfactual explanation

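import dice_ml

def counterfactual(model, train_df, continuous_features, outcome_name, instance, target_class):
    # "What changes flip the prediction?"
    # instance: one-row DataFrame of features (without the outcome column).
    data = dice_ml.Data(dataframe=train_df,
                        continuous_features=continuous_features,
                        outcome_name=outcome_name)
    wrapped = dice_ml.Model(model=model, backend='sklearn')  # assumes a scikit-learn model
    dice = dice_ml.Dice(data, wrapped, method='random')
    cf = dice.generate_counterfactuals(instance, total_CFs=3, desired_class=target_class)
    return cf.cf_examples_list[0].final_cfs_df

→ Gives each user actionable feedback.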

Bias audit

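import warnings
from collections import defaultdict

def fairness_audit(model, dataset, protected_attribute='gender'):
    # dataset yields (x, y_true, group); group is the row's value of protected_attribute.
    results = defaultdict(list)
    for x, y_true, group in dataset:
        y_pred = model.predict([x])[0]  # predict expects a batch, so wrap the single row
        results[group].append((y_true, y_pred))

    metrics = {}
    for group, data in results.items():
        accuracy = sum(t == p for t, p in data) / len(data)
        positive_rate = sum(p for _, p in data) / len(data)  # assumes 0/1 predictions
        metrics[group] = {'accuracy': accuracy, 'positive_rate': positive_rate}

    # Disparity: largest accuracy gap between any two groups
    accuracies = [m['accuracy'] for m in metrics.values()]
    disparity = max(accuracies) - min(accuracies)

    if disparity > 0.05:
        # Swap in your own alerting hook here.
        warnings.warn(f'Bias detected: {disparity:.2%} {protected_attribute} disparity across groups')

    return metrics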

Right to explanation (GDPR)

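from flask import Flask, abort
from flask_login import current_user, login_required  # assumes Flask-Login for auth

app = Flask(__name__)

@app.route('/api/decisions/<decision_id>/explain', methods=['GET'])
@login_required
def explain_decision(decision_id):
    decision = db.aiDecisionLog.find(decision_id)  # db: assumed data-access layer
    if decision is None:
        abort(404)

    # Only the person the decision is about may read its explanation
    if decision.user_id != current_user.id:
        abort(403)  # a bare `return 403` is not a valid Flask response

    return {
        'decision': decision.output.value,
        'date': decision.timestamp,
        'reasoning': decision.reasoning,  # SHAP-based
        'top_factors': decision.top_features,
        'how_to_appeal': '/appeal',
        'human_review_available': True,
    }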

Appeal workflow

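const SLA_DAYS = 30;  // GDPR Art. 12(3) allows one month to respond

// Assumed to exist elsewhere: db, pickReviewer, assign, escalate.
class AppealWorkflow {
  async submit(userId: string, decisionId: string, reason: string) {
    const createdAt = new Date();
    const appeal = await db.appeals.insert({
      userId, decisionId, reason,
      status: 'pending',
      createdAt,
      // Persist the deadline. A 30-day setTimeout exceeds Node's 32-bit timer
      // limit (it would fire almost immediately) and is lost on restart.
      dueAt: new Date(createdAt.getTime() + SLA_DAYS * 86_400_000),
    });

    // Auto-route to a human reviewer
    const reviewer = pickReviewer(decisionId);
    await assign(reviewer, appeal.id);

    return appeal;
  }

  // Run from a scheduler (e.g. an hourly job) to escalate overdue pending appeals.
  // The query shape is hypothetical; adapt it to your data layer.
  async escalateOverdue(now = new Date()) {
    const overdue = await db.appeals.find({ status: 'pending', dueBefore: now });
    for (const appeal of overdue) await escalate(appeal.id);
  }
}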

Model versioning + reproducibility

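# DVC + MLflow
dvc add data/train.parquet            # writes data/train.parquet.dvc
git add data/train.parquet.dvc data/.gitignore
git commit -m 'data v1.2'

# Train
mlflow run . -P epochs=10
# → each run gets a unique ID, params, metrics, artifacts.

# Reproduce: restore the code and data at the recorded commit, then rerun
git checkout "$SHA"
dvc checkout
mlflow run . -P epochs=10

→ Every production model can be reproduced.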
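Reproducibility claims are easier to audit when each run carries one identifier derived from everything that determines it. The sketch below combines a git commit, a data hash, and the training parameters into a short fingerprint. It is meant to complement DVC and MLflow, not replace them; the function names and sample values are hypothetical, and identical fingerprints only imply identical models if training itself is deterministic (fixed seeds, pinned dependencies).

```python
import hashlib
import json


def file_sha256(path, chunk_size=1 << 20):
    """Hash a data file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_fingerprint(git_commit, data_sha256, params):
    """Derive one stable ID from the code version, data hash, and parameters."""
    canonical = json.dumps(
        {"git_commit": git_commit, "data_sha256": data_sha256, "params": params},
        sort_keys=True,  # key order must not change the fingerprint
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical values; in practice data_sha256 would come from file_sha256(...).
fp1 = run_fingerprint("abc123", "d" * 64, {"epochs": 10, "lr": 0.01})
fp2 = run_fingerprint("abc123", "d" * 64, {"lr": 0.01, "epochs": 10})
print(fp1 == fp2)  # True: parameter order does not matter
```

Storing the fingerprint in the audit log next to each decision links a production output back to the exact code, data, and configuration that produced the model.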

Model card generation

# model_card_toolkit (Google)
import model_card_toolkit as mctk

mct = mctk.ModelCardToolkit()
model_card = mct.scaffold_assets()

model_card.model_details.name = 'ChurnPredictor'
model_card.model_details.overview = '...'
model_card.considerations.ethical_considerations = [...]

mct.update_model_card(model_card)
mct.export_format()  # HTML, JSON

Continuous monitoring (drift / fairness)

from random import random

@trace  # assumed tracing decorator from your observability library
def predict(features):
    pred = model.predict(features)

    # Log for audit
    log({'features': features, 'pred': pred, 'model_version': MODEL_V})

    # Real-time fairness check on a ~1% sample of requests
    if random() < 0.01:
        check_fairness_window()  # evaluates the last 1000 predictions, roughly hourly

    return pred
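The snippet above calls `check_fairness_window()` without defining it. One possible shape for it is a fixed-size rolling window that compares positive-prediction rates across groups, sketched below. It uses positive rates rather than accuracy because true labels are usually not available at prediction time; the class name, the 0.05 gap, and the minimum sample size per group are assumptions.

```python
from collections import deque


class FairnessWindow:
    """Keep the most recent predictions and compare positive rates across groups."""

    def __init__(self, size=1000, max_gap=0.05, min_per_group=30):
        self.window = deque(maxlen=size)  # old entries fall off automatically
        self.max_gap = max_gap
        self.min_per_group = min_per_group

    def record(self, group, prediction):
        """prediction is 1 for a positive decision, 0 otherwise."""
        self.window.append((group, prediction))

    def check(self):
        """Return (gap, rates); gap is None until at least two groups have enough data."""
        counts, positives = {}, {}
        for group, pred in self.window:
            counts[group] = counts.get(group, 0) + 1
            positives[group] = positives.get(group, 0) + pred
        rates = {g: positives[g] / n for g, n in counts.items()
                 if n >= self.min_per_group}
        if len(rates) < 2:
            return None, rates
        return max(rates.values()) - min(rates.values()), rates

    def breached(self):
        gap, _ = self.check()
        return gap is not None and gap > self.max_gap


monitor = FairnessWindow(size=200, max_gap=0.05, min_per_group=30)
for i in range(100):
    monitor.record("A", 1 if i % 2 == 0 else 0)  # 50% positive
    monitor.record("B", 1 if i % 5 == 0 else 0)  # 20% positive
gap, rates = monitor.check()
print(rates, monitor.breached())
# {'A': 0.5, 'B': 0.2} True
```

The minimum-per-group guard keeps a handful of early predictions from triggering an alert before the rates mean anything.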

🤔 Decision Criteria

| Risk level | Accountability requirement |
| --- | --- |
| Low (spam filter) | Audit log + version |
| Medium (content moderation) | + Transparency + appeal |
| High (HR, medical, finance) | + Bias audit + human review + redress |
| Critical (autonomous vehicle, life-support) | + Black box recorder + insurance + regulator approval |

Default: every production AI system gets an audit log and a model card. High-risk systems are additionally mapped to the EU AI Act obligations.
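The tiers above are cumulative: each level adds controls on top of the ones below it. A minimal sketch of that lookup follows, with the tier keys and control names taken from the decision criteria; the function name is hypothetical.

```python
# Tiers in ascending order; each adds controls on top of the tiers below it.
TIER_CONTROLS = [
    ("low", ["audit log", "model versioning"]),
    ("medium", ["transparency", "appeal channel"]),
    ("high", ["bias audit", "human review", "redress"]),
    ("critical", ["black box recorder", "insurance", "regulator approval"]),
]


def required_controls(risk_level):
    """Return the cumulative list of controls for a risk level."""
    controls = []
    for tier, tier_controls in TIER_CONTROLS:
        controls.extend(tier_controls)
        if tier == risk_level:
            return controls
    raise ValueError(f"unknown risk level: {risk_level!r}")


print(required_controls("medium"))
# ['audit log', 'model versioning', 'transparency', 'appeal channel']
```

Raising on an unknown level is deliberate: a system with no assigned risk tier should fail the review rather than silently default to the lightest one.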

⚠️ Contradictions & Updates

  • Black box paradox: the explainability of deep models is inherently limited, and SHAP is only an approximation.
  • Trade-off: an explainable model may perform worse (linear vs. deep), which is a dilemma for high-stakes uses.
  • Push toward strict liability: as jurisdictions adopt strict liability, developer costs rise, raising concerns about a chilling effect on innovation.
  • Cost of model audits: auditing a model is expensive. Open standards are emerging.
  • Cross-border: regulations differ by country, so globally deployed AI faces a fragmented landscape.

🔗 Knowledge Links (Graph)

🤖 LLM Usage Hints (How to Use This Knowledge)

When to use this knowledge:

  • Deployment review of any production AI system.
  • Incident post-mortems.
  • Transparency design for customer-facing AI.
  • Human review workflows for high-stakes decisions (loans, hiring, medical).
  • Preparing for a regulatory audit.

When not to use it:

  • Specific legal advice (consult a lawyer).
  • Implementing country-specific regulation (consult local counsel).
  • Immediate crisis response (that is the incident team's job).
  • Research models with no production use.

Anti-Patterns

  • No audit log: incidents cannot be traced to a root cause.
  • No model card: the model is a mystery to future maintainers.
  • No bias audit: disparities go unnoticed.
  • No appeal channel: users are left helpless.
  • Black box in production: loses the trust of regulators and users.
  • One-time audit, then forgotten: every release needs an audit.
  • No version control for models: no reproducibility.
  • Ignoring the right to explanation: GDPR violation.

🧪 Validation Status

  • Information status: verified (concept-level).
  • Source trust level: B (NIST AI RMF, EU AI Act, ACM FAccT papers, Microsoft AETHER guidelines, Google PAIR).
  • Review reason: manual cleanup. Research and regulation are both active, so review every 6 months.

🧬 Duplicate Check

  • Existing similar documents: AI-Governance-Policy (related), AI-Ethics (parent), Explainable-AI-XAI (subset).
  • Action: KEEP (focused on accountability mechanisms).
  • Rationale: accountability is a distinct discipline (legal + technical + ethical).

🕓 Changelog

| Date | Change | Action | Trust level |
| --- | --- | --- | --- |
| 2026-05-08 | P-Reinforce Phase 1 normalization | UPDATE | A |
| 2026-05-09 | Manual cleanup: added code patterns, 3 pillars, industry-specific requirements, and anti-patterns | UPDATE | B |