| id | title | category | status | canonical_id | aliases | duplicate_of | source_trust_level | confidence_score | verification_status | tags | raw_sources | last_reinforced | github_commit | inferred_by | tech_stack |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| wiki-2026-0508-ai-accountability | AI Accountability | 10_Wiki/Topics | verified | self | | none | B | 0.85 | conceptual | | | 2026-05-09 | pending | Claude Opus 4.7 (manual cleanup 2026-05-09) | |
AI Accountability
📌 One-Line Insight (The Karpathy Summary)
"Who is at fault?" When an AI system causes harm, accountability is the chain of responsibility across every actor involved (developer, deployer, user). It rests on three pillars: transparency, auditability, and redress. The EU AI Act makes these mandatory for high-risk systems.
📖 Structured Knowledge (Synthesized Content)
Responsibility gap
As AI autonomy increases, traditional liability becomes harder to assign:
- Developer: "I only built the algorithm; I don't control its output."
- Deployer: "I just used it."
- User: "I trusted the model."
- Vendor: "The ToS disclaims liability."
→ Every actor points at someone else, and the victim is left without redress.
3 Pillars of Accountability
1. Transparency (XAI - Explainable AI)
- Disclose the reasoning behind each decision.
- Show each feature's contribution.
- Document each model's training data and architecture.
Methods:
- SHAP / LIME: contribution of each input feature.
- Attention visualization: weight given to each token or pixel.
- Counterfactual: "if this feature had been different, the result would have changed."
- Concept activation: detection of high-level concepts.
→ Lets the affected user challenge or appeal a decision.
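The feature-contribution idea behind SHAP and LIME is easiest to see on a linear model, where each feature's contribution is its weight times its deviation from a baseline. The sketch below assumes exactly that setting; the weights, baseline values, and feature names are hypothetical, and non-linear models need a dedicated library rather than this shortcut.

```python
# Minimal sketch: exact per-feature contributions for a linear scoring model.
# All weights, baselines, and feature names are hypothetical illustration values.

def linear_contributions(weights, baseline, instance):
    """Return {feature: weight * (value - baseline value)} for one instance."""
    return {
        name: weights[name] * (instance[name] - baseline[name])
        for name in weights
    }


def top_factors(contributions, k=3):
    """Sort features by absolute contribution, largest first."""
    return sorted(contributions.items(), key=lambda kv: -abs(kv[1]))[:k]


weights = {"logins_last_30d": -0.25, "open_tickets": 2.0, "seats": -1.0}
baseline = {"logins_last_30d": 12, "open_tickets": 1, "seats": 5}
instance = {"logins_last_30d": 4, "open_tickets": 3, "seats": 5}

contributions = linear_contributions(weights, baseline, instance)
# The contributions sum to the gap between this instance's score and the
# baseline score, which is what lets them serve as an explanation.
print(top_factors(contributions))
```

Ranking by absolute value keeps strongly negative factors visible, which matters when a user wants to know what counted against them.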
2. Auditability
- Every model version is reproducible.
- Training data has traceable provenance.
- Every decision is logged.
- Third parties (regulators, courts) can inspect the system.
Elements:
- Model card (Mitchell et al. 2019): each model's spec and limits.
- Datasheet (Gebru et al. 2018): description of the training data.
- Audit log: record of every production decision.
- Version control: git-like versioning of model and data.
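One way to make git-like versioning of model plus data concrete is a content fingerprint that pins the exact weights, training data, and hyperparameters behind a release. The following is a minimal sketch using only the standard library; the function names and the 12-character `release_id` are assumptions for illustration, not an established format.

```python
import hashlib
import json


def sha256_bytes(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def release_fingerprint(model_bytes: bytes, data_bytes: bytes, params: dict) -> dict:
    """Build a manifest that pins model weights, training data, and hyperparameters."""
    manifest = {
        "model_sha256": sha256_bytes(model_bytes),
        "data_sha256": sha256_bytes(data_bytes),
        # sort_keys gives a canonical encoding, so equal params hash equally
        "params_sha256": sha256_bytes(json.dumps(params, sort_keys=True).encode("utf-8")),
    }
    # Short identifier derived from the three digests above (hypothetical format).
    digest = sha256_bytes(json.dumps(manifest, sort_keys=True).encode("utf-8"))
    manifest["release_id"] = digest[:12]
    return manifest


# Stand-in byte strings; in practice these would be read from the model and data files.
a = release_fingerprint(b"weights-v3.1", b"train-rows", {"epochs": 10, "lr": 0.01})
b = release_fingerprint(b"weights-v3.1", b"train-rows", {"lr": 0.01, "epochs": 10})
c = release_fingerprint(b"weights-v3.1", b"train-rows-changed", {"epochs": 10, "lr": 0.01})
print(a["release_id"] == b["release_id"], a["release_id"] == c["release_id"])
```

Storing the manifest alongside each audit-log entry lets an auditor tie a logged decision to the exact model and data that produced it.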
3. Redress
- A review process for wrong decisions.
- A compensation path for victims.
- Fixes for systemic problems.
Elements:
- Right to explanation (GDPR Article 22).
- Human review (high-stakes decision).
- Appeal channel.
- Class action / regulatory complaint.
Liability framework (legal)
Strict liability (manufacturer)
- Modeled on defective-product liability.
- The user does not have to prove fault.
- Pushed at the EU level (AI Liability Directive proposal).
Fault-based
- Requires proving an actor's negligence.
- Hard in practice because the algorithm is a black box.
Insurance
- Mandatory insurance for deployers (as with autonomous vehicles).
→ Each jurisdiction uses a different model.
EU AI Act obligations for high-risk systems
- Risk management system: continuous.
- Data governance: quality and bias checks.
- Technical documentation: details of each system.
- Record keeping: audit logs.
- Transparency: disclosure to users.
- Human oversight: human review of decisions must be possible.
- Accuracy, robustness, and cybersecurity.
→ The compliance burden for a high-risk system is heavy.
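A team might track the obligations listed above as a simple evidence checklist. The sketch below is one hypothetical way to do that: the obligation keys mirror the list, while the evidence file names are invented for illustration. It tracks internal evidence only and says nothing about whether that evidence would satisfy a regulator.

```python
# Obligation keys follow the list above; evidence names are hypothetical.
OBLIGATIONS = [
    "risk_management",
    "data_governance",
    "technical_documentation",
    "record_keeping",
    "transparency",
    "human_oversight",
    "accuracy_robustness_cybersecurity",
]


def compliance_gaps(evidence: dict) -> list:
    """Return obligations that have no evidence artifact recorded."""
    return [name for name in OBLIGATIONS if not evidence.get(name)]


evidence = {
    "risk_management": ["risk_register.md"],
    "technical_documentation": ["model_card.yaml"],
    "record_keeping": ["ai_decision_log"],
    "human_oversight": [],  # listed but empty, so it still counts as a gap
}

gaps = compliance_gaps(evidence)
print(gaps)
```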
Industry-specific requirements
Medical AI (FDA)
- Clinical validation of each model.
- Adverse event reporting.
- "Predetermined Change Control Plan" (PCCP).
- Software as a Medical Device (SaMD).
Autonomous vehicle
- Event data recorder ("black box") record for every incident.
- DDT (Dynamic Driving Task) responsibility.
- Driver vs. system responsibility by SAE level.
Hiring / HR
- Bias audit under NYC Local Law 144 (2023+).
- Disparate impact analysis.
- Candidate notification.
Credit / lending
- Adverse action notice (ECOA).
- Disparate impact (CFPB).
- Explainability requirement.
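The disparate impact analysis listed under hiring and lending is often summarized as an impact ratio: each group's selection rate divided by the rate of the most-selected group. The sketch below assumes binary selected/not-selected outcomes and uses the four-fifths (0.8) rule of thumb as the flagging threshold; the groups and counts are hypothetical, and a real audit would follow the definitions in the applicable regulation.

```python
from collections import Counter


def selection_rates(records):
    """records: iterable of (group, selected) pairs, with selected True/False."""
    total, selected = Counter(), Counter()
    for group, was_selected in records:
        total[group] += 1
        if was_selected:
            selected[group] += 1
    return {g: selected[g] / total[g] for g in total}


def impact_ratios(rates):
    """Each group's selection rate divided by the highest group's rate.

    Assumes at least one group has a non-zero selection rate.
    """
    best = max(rates.values())
    return {g: rate / best for g, rate in rates.items()}


# Hypothetical screening outcomes: group A 50/100 selected, group B 30/100.
records = [("A", i < 50) for i in range(100)] + [("B", i < 30) for i in range(100)]
rates = selection_rates(records)
ratios = impact_ratios(rates)
flagged = [g for g, r in ratios.items() if r < 0.8]  # four-fifths rule of thumb
print(ratios, flagged)
```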
Model card example
# model_card.yaml
model_name: ChurnPredictor
version: v3.1
created: 2026-05-09
owner: data-team@company.com
intended_use: |
  Predict customer churn for SaaS billing dashboard.
  Input: 23 user activity features.
  Output: probability 0-1.
intended_users: |
  Customer success team (review + outreach).
out_of_scope:
  - Automatic cancellation.
  - Pricing decisions.
training_data:
  source: 2025-01-01 to 2026-04-30 production users.
  size: 1.2M users.
  potential_bias: |
    - Geographic: 80% US users.
    - Industry: SaaS only.
performance:
  accuracy: 0.87
  auc: 0.91
  f1: 0.83
  per_subgroup:
    - { group: 'US', acc: 0.88 }
    - { group: 'EU', acc: 0.83 }    # disparity
    - { group: 'APAC', acc: 0.79 }  # warning
limitations:
  - Lower accuracy on cold-start users (< 30 days).
  - Class imbalance (10% positive).
  - 2026 cohort only; drift expected.
ethical_considerations:
  - Every prediction is reviewed by customer success.
  - A false positive costs an unnecessary outreach.
  - A false negative costs a missed retention.
review_cycle: quarterly
→ One document holds each model's full spec.
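A model card becomes more useful when it is checked automatically, for example in CI. The sketch below mirrors part of the ChurnPredictor card as a Python dict (to avoid a YAML dependency) and runs two checks. The required-field list and the 0.05 accuracy-gap threshold are assumptions chosen for illustration.

```python
# Required fields and the gap threshold are hypothetical policy choices.
REQUIRED_FIELDS = ["model_name", "version", "owner", "intended_use", "limitations"]

card = {
    "model_name": "ChurnPredictor",
    "version": "v3.1",
    "owner": "data-team@company.com",
    "intended_use": "Predict customer churn for SaaS billing dashboard.",
    "performance": {
        "per_subgroup": [
            {"group": "US", "acc": 0.88},
            {"group": "EU", "acc": 0.83},
            {"group": "APAC", "acc": 0.79},
        ]
    },
}


def missing_fields(card):
    """Required top-level fields that the card does not define."""
    return [field for field in REQUIRED_FIELDS if field not in card]


def subgroup_gaps(card, threshold=0.05):
    """Groups whose accuracy trails the best group by more than threshold."""
    groups = card["performance"]["per_subgroup"]
    best = max(g["acc"] for g in groups)
    # Round before comparing so float noise (0.88 - 0.83) does not trip the check.
    gaps = {g["group"]: round(best - g["acc"], 4) for g in groups}
    return {name: gap for name, gap in gaps.items() if gap > threshold}


print(missing_fields(card))
print(subgroup_gaps(card))
```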
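💻 Patterns (Code + Process)
Audit log
import { createHash } from 'node:crypto';

// `db`, `summarize`, and `User` are application-specific and assumed to exist.
async function logAIDecision(
  input: unknown,
  output: { confidence: number; explanation: string },
  model: string,
  user: User,
) {
  await db.aiDecisionLog.insert({
    timestamp: new Date(),
    modelVersion: model,
    inputHash: createHash('sha256').update(JSON.stringify(input)).digest('hex'),
    inputSummary: summarize(input), // PII-stripped
    output,
    userId: user.id,
    confidence: output.confidence,
    reasoning: output.explanation, // SHAP / LIME
  });
}
// Retention: 7 years (regulation-friendly).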
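The same record can be sketched in Python to show two details the pattern relies on: hashing a canonical encoding of the input, since `JSON.stringify` output depends on key insertion order, and stripping PII before storing a summary. The `PII_FIELDS` set and the record layout are hypothetical, and only top-level fields are filtered.

```python
import hashlib
import json
from datetime import datetime, timezone

PII_FIELDS = {"email", "name", "phone"}  # hypothetical list; adapt to the real schema


def canonical_hash(payload: dict) -> str:
    """Hash a canonical JSON encoding so key order does not change the digest."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def strip_pii(payload: dict) -> dict:
    """Drop top-level fields listed in PII_FIELDS (nested data is not handled)."""
    return {k: v for k, v in payload.items() if k not in PII_FIELDS}


def build_audit_record(payload, output, model_version, user_id):
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "input_hash": canonical_hash(payload),
        "input_summary": strip_pii(payload),
        "output": output,
        "user_id": user_id,
    }


record = build_audit_record(
    {"email": "a@example.com", "seats": 5, "plan": "pro"},
    {"value": 0.72, "confidence": 0.9},
    "ChurnPredictor-v3.1",
    "user-123",
)
print(record["input_summary"])
```

The hash covers the full input, including PII, so an auditor can later confirm that a disputed input matches the logged decision without the log itself storing the sensitive values.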
XAI (SHAP)
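import shap

# Tree-based model. `model`, `X_test`, and `feature_names` are assumed to exist;
# X_test is assumed to be a pandas DataFrame.
explainer = shap.TreeExplainer(model)
# Note: for some classifiers shap_values is one array per class.
shap_values = explainer.shap_values(X_test)

# Feature contributions for a single prediction.
shap.force_plot(explainer.expected_value, shap_values[0], X_test.iloc[0])

# Answers the user's "why?".
def explain(prediction):
    # prediction.id is assumed to be the row index into shap_values.
    contributions = dict(zip(feature_names, shap_values[prediction.id]))
    top_features = sorted(contributions.items(), key=lambda x: -abs(x[1]))[:5]
    return f"Top factors: {top_features}"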
Counterfactual explanation
from dice_ml import Dice

def counterfactual(data, model, instance, target_class):
    # "What changes flip the prediction?"
    # `data` is a dice_ml.Data wrapper and `model` a dice_ml.Model wrapper;
    # `instance` is a one-row DataFrame of feature values.
    dice = Dice(data, model)
    cf = dice.generate_counterfactuals(instance, total_CFs=3, desired_class=target_class)
    return cf.cf_examples_list[0].final_cfs_df
→ Gives the user actionable feedback.
Bias audit
import logging
from collections import defaultdict

def fairness_audit(model, dataset, protected_attribute='gender'):
    # dataset yields (x, y_true, group), where group is the row's value of
    # protected_attribute. model.predict is assumed to return one 0/1 label per row.
    results = defaultdict(list)
    for x, y_true, group in dataset:
        y_pred = model.predict(x)
        results[group].append((y_true, y_pred))
    metrics = {}
    for group, data in results.items():
        accuracy = sum(t == p for t, p in data) / len(data)
        positive_rate = sum(p for _, p in data) / len(data)
        metrics[group] = {'accuracy': accuracy, 'positive_rate': positive_rate}
    # Disparity: gap between the best and worst group accuracy
    accuracies = [m['accuracy'] for m in metrics.values()]
    disparity = max(accuracies) - min(accuracies)
    if disparity > 0.05:
        logging.warning(f'Bias detected: {disparity:.2%} disparity across groups')
    return metrics
Right to explanation (GDPR)
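from flask import abort
from flask_login import current_user, login_required  # assumes Flask-Login is in use

# `app` and `db` are the application's Flask instance and data layer (assumed to exist).
@app.route('/api/decisions/<decision_id>/explain', methods=['GET'])
@login_required
def explain_decision(decision_id):
    decision = db.aiDecisionLog.find(decision_id)
    if decision is None:
        abort(404)
    # Verify user access
    if decision.user_id != current_user.id:
        abort(403)  # a bare `return 403` is not a valid Flask response
    return {
        'decision': decision.output.value,
        'date': decision.timestamp,
        'reasoning': decision.reasoning,  # SHAP-based
        'top_factors': decision.top_features,
        'how_to_appeal': '/appeal',
        'human_review_available': True,
    }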
Appeal workflow
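const SLA_DAYS = 30; // SLA: 30 days (GDPR)

class AppealWorkflow {
  async submit(userId: string, decisionId: string, reason: string) {
    const createdAt = new Date();
    const appeal = await db.appeals.insert({
      userId, decisionId, reason,
      status: 'pending',
      createdAt,
      // Persist the deadline; a periodic job should call escalate() for pending
      // appeals past it. A 30-day setTimeout exceeds Node's 32-bit timer limit
      // (about 24.8 days) and would also be lost on restart.
      escalateAt: new Date(createdAt.getTime() + SLA_DAYS * 86_400_000),
    });
    // Auto-route to human reviewer
    const reviewer = pickReviewer(decisionId);
    await assign(reviewer, appeal.id);
    return appeal;
  }
}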
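The 30-day SLA can also be enforced by a periodic job that scans stored appeals, which survives process restarts in a way an in-memory timer does not. The sketch below shows only the selection logic for such a job; the appeal record shape (`id`, `status`, `created_at`) is an assumption, and the scheduling and escalation themselves are left out.

```python
from datetime import datetime, timedelta, timezone

SLA = timedelta(days=30)  # mirrors the 30-day SLA in the workflow above


def overdue_appeals(appeals, now):
    """Return IDs of pending appeals whose SLA deadline has passed."""
    return [
        appeal["id"]
        for appeal in appeals
        if appeal["status"] == "pending" and now - appeal["created_at"] > SLA
    ]


now = datetime(2026, 5, 9, tzinfo=timezone.utc)
appeals = [
    {"id": "ap-1", "status": "pending", "created_at": now - timedelta(days=31)},
    {"id": "ap-2", "status": "pending", "created_at": now - timedelta(days=3)},
    {"id": "ap-3", "status": "resolved", "created_at": now - timedelta(days=90)},
]

print(overdue_appeals(appeals, now))
```

Passing `now` in as a parameter keeps the function deterministic and easy to test.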
Model versioning + reproducibility
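# DVC + MLflow
dvc add data/train.parquet
git add data/train.parquet.dvc data/.gitignore
git commit -m 'data v1.2'
# Train
mlflow run . -P epochs=10
# → Each run gets a unique ID, params, metrics, and artifacts.
# Reproduce from a pinned commit.
# $REPO_URL is the Git URI of this project (placeholder); --version takes a commit reference.
mlflow run "$REPO_URL" -P epochs=10 --version "$SHA"
→ Every production model is reproducible.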
Model card generation
# model_card_toolkit (Google)
import model_card_toolkit as mctk
mct = mctk.ModelCardToolkit()
model_card = mct.scaffold_assets()
model_card.model_details.name = 'ChurnPredictor'
model_card.model_details.overview = '...'
model_card.considerations.ethical_considerations = [...]
mct.update_model_card(model_card)
mct.export_format() # HTML, JSON
Continuous monitoring (drift / fairness)
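from random import random

# `trace`, `model`, `log`, `MODEL_V`, and `check_fairness_window` are assumed
# to be defined elsewhere in the service.
@trace
def predict(features):
    pred = model.predict(features)
    # Log for audit
    log({'features': features, 'pred': pred, 'model_version': MODEL_V})
    # Real-time fairness check (sampled on about 1% of calls)
    if random() < 0.01:
        check_fairness_window()  # the last 1000 predictions, hourly
    return pred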
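The `check_fairness_window` call above is not defined in the note. One possible shape for it, sketched under assumptions, is a bounded window of recent (group, prediction) pairs that reports the gap in positive-prediction rate between groups. The class name, the 0.05 default gap, and the use of positive rate as the metric are all illustrative choices.

```python
from collections import deque


class FairnessWindow:
    """Keeps the last `size` (group, prediction) pairs and reports the rate gap."""

    def __init__(self, size=1000, max_gap=0.05):
        self.window = deque(maxlen=size)  # oldest entries fall off automatically
        self.max_gap = max_gap

    def record(self, group, pred):
        self.window.append((group, int(pred)))

    def positive_rates(self):
        totals, positives = {}, {}
        for group, pred in self.window:
            totals[group] = totals.get(group, 0) + 1
            positives[group] = positives.get(group, 0) + pred
        return {g: positives[g] / totals[g] for g in totals}

    def check(self):
        """Return (gap, breached) for the current window."""
        rates = self.positive_rates()
        if len(rates) < 2:
            return 0.0, False  # nothing to compare yet
        gap = max(rates.values()) - min(rates.values())
        return gap, gap > self.max_gap


# Tiny window so the eviction behavior is visible.
window = FairnessWindow(size=4, max_gap=0.25)
for group, pred in [("A", 1), ("A", 1), ("B", 0), ("B", 0), ("A", 1)]:
    window.record(group, pred)

print(window.check())
```

A bounded `deque` keeps memory constant, so the check stays cheap enough to run on sampled requests.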
🤔 Decision Criteria
| Risk level | Accountability requirement |
|---|---|
| Low (spam filter) | Audit log + version |
| Medium (content moderation) | + Transparency + appeal |
| High (HR, medical, finance) | + Bias audit + human review + redress |
| Critical (autonomous vehicle, life-support) | + Black box + insurance + regulator approval |
Default: audit log + model card for every production AI system. Map every high-risk system to the EU AI Act obligations.
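The "+" entries in the table are cumulative: each risk level inherits everything required at the levels below it. A minimal sketch of that lookup follows; the level keys and requirement strings are taken from the table, while the function name is hypothetical.

```python
LEVELS = ["low", "medium", "high", "critical"]

# Requirements added at each level, taken from the table above.
ADDED = {
    "low": ["audit log", "version control"],
    "medium": ["transparency", "appeal channel"],
    "high": ["bias audit", "human review", "redress"],
    "critical": ["black box recorder", "insurance", "regulator approval"],
}


def requirements_for(level):
    """Accumulate requirements from 'low' up to and including `level`."""
    if level not in LEVELS:
        raise ValueError(f"unknown risk level: {level}")
    result = []
    for name in LEVELS[: LEVELS.index(level) + 1]:
        result.extend(ADDED[name])
    return result


print(requirements_for("medium"))
```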
⚠️ Contradictions & Updates
- Black box paradox: the explainability of deep models is inherently limited. SHAP is itself an approximation.
- Trade-off: explainable models may perform worse (linear vs. deep), which creates a dilemma for high-stakes use.
- Push for strict liability: as jurisdictions adopt strict liability, developer costs rise, raising concerns about a chilling effect on innovation.
- Cost of model audits: auditing each model is expensive. Open standards are emerging.
- Cross-border: each country regulates differently, so globally deployed AI faces fragmented rules.
🔗 Knowledge Links (Graph)
- Parent: AI-Ethics · AI-Governance-Policy · Algorithmic-Fairness
- Variants: Explainable-AI-XAI · Model-Card · Datasheets-for-Datasets · Bias-Audit
- Applications: EU-AI-Act-Compliance · GDPR-Article-22 · NYC-Local-Law-144 · FDA-AI-SaMD
- Techniques: SHAP-Interpretability · LIME · Counterfactual-Explanation · DVC-MLflow-Versioning
- Adjacent: AI-Liability · Responsibility-Gap · Human-in-the-Loop · Right-to-Explanation
- Applications: MLOps-Model-Monitoring · Continuous-Learning-System · AI-Audit-Log
🤖 LLM Usage Hints (How to Use This Knowledge)
When to use this knowledge:
- Deployment review of any production AI system.
- Incident post-mortems.
- Transparency design for customer-facing AI.
- Human review workflows for high-stakes decisions (loan, hiring, medical).
- Preparing for a regulatory audit.
When not to use it:
- Specific legal advice (consult a lawyer).
- Implementing country-specific regulation (consult local counsel).
- Immediate crisis response (incident team).
- Research models (no production use).
❌ Anti-Patterns
- No audit log: incidents cannot be traced to a root cause.
- No model card: the model is a mystery to future maintainers.
- No bias audit: disparities go unnoticed.
- No appeal channel: users have no recourse.
- Black box in production: loses regulator and user trust.
- One-time audit, then forget: every release needs an audit.
- No version control for models: no reproducibility.
- Ignoring the right to explanation: GDPR violation.
🧪 Validation Status
- Information status: verified (concept-level).
- Source trust level: B (NIST AI RMF, EU AI Act, ACM FAccT papers, Microsoft AETHER guidelines, Google PAIR).
- Review reason: manual cleanup. Active area of research and regulation; review every 6 months.
🧬 Duplicate Check
- Existing similar documents: AI-Governance-Policy (related), AI-Ethics (parent), Explainable-AI-XAI (subset).
- Action: KEEP (focused on accountability mechanisms).
- Rationale: accountability is a distinct discipline (legal + technical + ethical).
🕓 Changelog
| Date | Change | Action | Trust level |
|---|---|---|---|
| 2026-05-08 | P-Reinforce Phase 1 normalization | UPDATE | A |
| 2026-05-09 | Manual cleanup: added code patterns, 3 pillars, industry-specific requirements, and anti-patterns | UPDATE | B |