
Antigravity Agent 504fd5fb42 [G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00

7.6 KiB


Algorithmic Transparency

📌 One-line insight

"A light inside the black box": visibility into the inputs, the algorithm, and the outputs. Three layers: Disclosure → Explainability → Auditability. The goals are user trust and regulatory compliance.

📖 Core

The three layers

Layer 1: Disclosure (basic)

The fact that AI is in use.
Its purpose.
Data sources (in general terms).
Informing the user.

Layer 2: Explainability (model)

The reasoning behind each prediction.
SHAP / LIME / counterfactual.
Attention visualization.
Feature importance.
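As a rough illustration of the "feature importance" item in Layer 2, the sketch below computes permutation importance using only the standard library. This is a simpler, swapped-in technique, not SHAP or LIME. The model `toy_predict`, the two-column data, and all function names are hypothetical and exist only for this example.

```python
import random


def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Return, per feature, the accuracy drop caused by shuffling that column."""
    rng = random.Random(seed)

    def accuracy(data):
        hits = sum(1 for row, y in zip(data, labels) if predict(row) == y)
        return hits / len(labels)

    baseline = accuracy(rows)
    drops = []
    for j in range(n_features):
        # Shuffle one column to break its link with the label, keep the rest.
        column = [row[j] for row in rows]
        rng.shuffle(column)
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
        drops.append(baseline - accuracy(shuffled))
    return drops


# Hypothetical toy model: predicts churn (1) when feature 0 (logins) is below 3.
# Feature 1 is ignored by the model, so its importance should be zero.
def toy_predict(row):
    return 1 if row[0] < 3 else 0


rows = [[i % 6, (i * 7) % 5] for i in range(60)]
labels = [toy_predict(r) for r in rows]
drops = permutation_importance(toy_predict, rows, labels, n_features=2)
print(drops)
```

A feature the model never reads gets a drop of exactly zero, which makes this a cheap sanity check before reaching for heavier explainers.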

Layer 3: Auditability (regulator / public)

Model details (weights, training).
Audit logs.
Third-party verification.
Reproducibility.
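One way to make the Layer 3 audit log verifiable by a third party is to chain entries by hash, so that any later edit is detectable. The note does not prescribe this technique. The sketch below is an assumption, and the record fields and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry


def entry_hash(record, prev_hash):
    """Hash a record together with the previous entry's hash."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, record):
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "record": record,
        "prev_hash": prev_hash,
        "hash": entry_hash(record, prev_hash),
    })


def verify_chain(log):
    """Recompute every hash; return False if any entry was altered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["record"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"prediction": 0.81, "model_version": "3.1.0"})
append_entry(log, {"prediction": 0.12, "model_version": "3.1.0"})
ok_before = verify_chain(log)

log[0]["record"]["prediction"] = 0.10  # simulate tampering with a stored entry
ok_after = verify_chain(log)
print(ok_before, ok_after)
```

An auditor who holds only the latest hash can confirm that earlier entries were not rewritten, without needing write access to the log.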

Types of transparency

Voluntary

Vendor self-disclosure.
Model cards (Mitchell et al., 2019).
Datasheets for datasets.
Public benchmarks.

Required (regulation)

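EU AI Act obligations for high-risk systems.
GDPR Article 22 (automated individual decision-making; commonly cited as the basis for a "right to explanation", though its legal scope is debated).
NYC LL144 (bias audits for automated hiring tools).
China's generative AI registration requirements.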

Open source

Released weights.
Training data (often only partially).
Architecture.

The transparency spectrum

Level	Example
1. Closed	GPT-4 (architecture undisclosed)
2. Documented	GPT-4 (limited technical report)
3. Open weight	Llama 3, Mistral (weights released, training details undisclosed)
4. Reproducible	OLMo (data + code released)
5. Auditable	Audited by a third party

→ Different models sit at different levels.
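The spectrum can be read as a ladder where each level assumes the ones below it. The sketch below encodes that reading. The cumulative interpretation, the `Release` fields, and the function name are assumptions made for illustration, not part of the table itself.

```python
from dataclasses import dataclass


@dataclass
class Release:
    """Hypothetical flags describing what a vendor has published."""
    has_documentation: bool = False
    weights_released: bool = False
    data_and_code_released: bool = False
    third_party_audited: bool = False


def transparency_level(release: Release) -> int:
    """Climb the ladder from 1 (closed) until a requirement is missing."""
    level = 1
    for satisfied in (
        release.has_documentation,       # level 2: documented
        release.weights_released,        # level 3: open weight
        release.data_and_code_released,  # level 4: reproducible
        release.third_party_audited,     # level 5: auditable
    ):
        if not satisfied:
            break
        level += 1
    return level


closed = transparency_level(Release())
open_weight = transparency_level(Release(True, True))
audited = transparency_level(Release(True, True, True, True))
print(closed, open_weight, audited)
```

Under this reading, a closed model that passes a private audit still scores low, because outsiders cannot reproduce the result.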

User-facing disclosure

"AI used"

Chatbots say so explicitly.
Generated content carries a watermark.
Deepfakes are disclosed (required by regulation).

"Why this decision?"

The reason behind a loan or hiring decision.
GDPR right to explanation.
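For a loan-style decision, a "why" answer often takes the form of a short list of the factors that hurt the applicant most. The sketch below derives such reason codes from a linear score. The weights, baseline values, and feature names are all hypothetical, and a real system would need legal review of the wording shown to users.

```python
def reason_codes(weights, applicant, baseline, top_k=2):
    """Return the features that pushed the score down the most.

    Each contribution is weight * (applicant value - baseline value),
    so a negative number means the feature counted against the applicant.
    """
    contributions = {
        name: weights[name] * (applicant[name] - baseline[name])
        for name in weights
    }
    negatives = [(name, c) for name, c in contributions.items() if c < 0]
    negatives.sort(key=lambda item: item[1])  # most negative first
    return [name for name, _ in negatives[:top_k]]


# Hypothetical linear scoring model and a typical (baseline) applicant.
weights = {"income": 0.5, "debt_ratio": -2.0, "late_payments": -1.0}
baseline = {"income": 5.0, "debt_ratio": 0.3, "late_payments": 0.0}
applicant = {"income": 4.0, "debt_ratio": 0.6, "late_payments": 2.0}

codes = reason_codes(weights, applicant, baseline)
print(codes)
```

Limiting the answer to the top few factors also addresses the risk of overwhelming the user with detail.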

"Data used"

A summary of the training data.
Wikipedia, web crawl, etc.
Disclosure of any sensitive data.

Model card (Mitchell et al., 2019)

Components:

Model details (name, version, type).
Intended use (primary, out-of-scope).
Performance (per-group).
Training data.
Evaluation data.
Ethical considerations.
Caveats and recommendations.

→ The de facto standard.
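The component list above can double as a release checklist. The sketch below flags missing or empty sections in a model card stored as a dictionary. The key names are hypothetical and simply mirror the list; they are not taken from any official schema.

```python
# Section keys mirror the component list; the exact names are an assumption.
REQUIRED_SECTIONS = (
    "model_details",
    "intended_use",
    "performance",
    "training_data",
    "evaluation_data",
    "ethical_considerations",
    "caveats_and_recommendations",
)


def missing_sections(card):
    """Return required sections that are absent or empty, in checklist order."""
    return [section for section in REQUIRED_SECTIONS if not card.get(section)]


# Hypothetical, partially filled card.
card = {
    "model_details": {"name": "ChurnPredictor", "version": "3.1.0"},
    "intended_use": "Predict customer churn for a billing dashboard.",
    "performance": {"accuracy": 0.87},
    "training_data": "",  # present but empty, so it counts as missing
}

missing = missing_sections(card)
print(missing)
```

Running a check like this in CI is one way to keep the card from going stale between releases.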

Datasheet (Gebru et al., 2018)

Documentation for a dataset:

Motivation.
Composition.
Collection process.
Preprocessing / labeling.
Uses.
Distribution.
Maintenance.

Trade-offs

IP / competitive

Full disclosure gives away trade secrets.
Vendors are reluctant.

Security

Full disclosure makes adversarial attacks easier.
Jailbreaks become easier.

Privacy

Disclosing training data can identify individuals.
This can conflict with GDPR.

User overload

Too much information overwhelms users.
A simplified summary is needed.

Best practices

Frontier model

Model card.
Capabilities and limits.
Known risks.
Evaluation results.

Production AI

User-facing disclosure.
Explainability (SHAP / LIME).
Audit log.
Appeal channel.

Open-source

Weights.
Training data (or a summary).
Reproducibility.

💻 Code

Model card (yaml)

model_name: ChurnPredictor
version: 3.1.0
created: 2026-05-09
license: MIT

intended_use: |
  Predict customer churn for SaaS billing dashboard.

intended_users: |
  Customer success team.

out_of_scope:
  - Automatic cancellation
  - Pricing decisions

training_data:
  source: 2025-2026 production users.
  size: 1.2M samples.
  bias_warning: |
    - 80% US customer (geographic bias).
    - 65% B2B SaaS (industry bias).

performance:
  overall: { accuracy: 0.87, auc: 0.91 }
  by_group:
    - { group: 'US', accuracy: 0.88 }
    - { group: 'EU', accuracy: 0.83 }   # disparity
    - { group: 'APAC', accuracy: 0.79 }

ethical_consideration: |
  - Every prediction is reviewed by customer success.
  - False positives carry an outreach cost.

review_cycle: quarterly
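The `by_group` figures in the card above invite an automated check. The sketch below compares each group's accuracy against a reference group and flags gaps beyond a threshold. The 0.03 threshold, the choice of "US" as reference, and the function names are assumptions for illustration only.

```python
def accuracy_gaps(by_group, reference):
    """Return how far each group's accuracy falls below the reference group."""
    ref_accuracy = by_group[reference]
    return {
        group: round(ref_accuracy - accuracy, 4)
        for group, accuracy in by_group.items()
        if group != reference
    }


def flag_disparities(by_group, reference, max_gap):
    """List groups whose gap exceeds max_gap, largest gap first."""
    gaps = accuracy_gaps(by_group, reference)
    flagged = [group for group, gap in gaps.items() if gap > max_gap]
    return sorted(flagged, key=lambda group: -gaps[group])


# Per-group accuracy copied from the model card above.
by_group = {"US": 0.88, "EU": 0.83, "APAC": 0.79}

gaps = accuracy_gaps(by_group, reference="US")
flagged = flag_disparities(by_group, reference="US", max_gap=0.03)
print(gaps, flagged)
```

What counts as an acceptable gap is a policy decision; the code only makes the comparison repeatable at each review cycle.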

Datasheet

dataset_name: customer_churn_v3
version: 2026-05
size: 1.2M rows
license: Internal

motivation: |
  Train ML model to predict churn.

composition:
  features:
    - login_frequency: int
    - subscription_tier: enum
    - support_tickets: int
    - payment_method: enum
  
  protected_attributes:
    - country
    - industry
    - account_size

collection:
  source: production database
  method: SQL extract + anonymize
  consent: ToS agreement

preprocessing:
  - PII removed
  - Outliers winsorized

uses:
  recommended:
    - Churn prediction
  not_recommended:
    - Cross-customer analysis (re-identification risk)

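User-facing XAI

import numpy as np
import shap
from flask import Flask

app = Flask(__name__)

# Assumed to be defined elsewhere: `model` (a fitted tree-based model),
# `feature_names` (column names in training order), and `db` (data-access layer).
explainer = shap.TreeExplainer(model)  # build once, not on every request

@app.route('/predictions/<prediction_id>/explain')
def explain(prediction_id):
    decision = db.predictions.find(prediction_id)

    # One row in, one row of SHAP values out. Some classifiers return one
    # array per class instead; in that case select the class of interest first.
    shap_values = explainer.shap_values(np.asarray([decision.features]))

    # Keep the five features with the largest absolute impact.
    top_features = sorted(
        zip(feature_names, shap_values[0]),
        key=lambda x: -abs(x[1])
    )[:5]

    return {
        'prediction': decision.value,
        'date': decision.timestamp,
        'top_factors': [
            {'feature': name, 'impact': float(impact)}
            for name, impact in top_features
        ],
        'how_to_appeal': '/appeal',
    }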

Audit log

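import hashlib
import json
from datetime import datetime, timezone

# Assumed to be defined elsewhere: `trace` (a tracing decorator), `model`,
# `db` (an async data-access layer), and MODEL_VERSION.
@trace
async def predict(features, user_id):
    pred = model.predict(features)

    # Hash a canonical JSON encoding so a log entry can be matched to its
    # inputs without storing raw values (assumes JSON-serializable features).
    features_hash = hashlib.sha256(
        json.dumps(features, sort_keys=True).encode('utf-8')
    ).hexdigest()

    await db.audit_log.insert({
        'user_id': user_id,
        'features_hash': features_hash,
        'prediction': pred.value,
        'confidence': pred.confidence,
        'model_version': MODEL_VERSION,
        'timestamp': datetime.now(timezone.utc),  # timezone-aware UTC
    })

    return pred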

User disclosure (chatbot)

function ChatHeader() {
  return (
    <div className="ai-disclosure">
      🤖 You're chatting with an AI assistant powered by Claude.
      <a href="/about-ai">Learn more</a>
    </div>
  );
}

🤔 Decision criteria

Context	Transparency level
Internal tool	Audit log + model card
Customer-facing	+ User disclosure
Regulated (medical, legal)	+ Audit + explainability + appeal
Frontier (general AI)	+ Capability disclosure + safety eval
Open-source	+ Weight + training summary

Default: disclosure + audit log + per-prediction explanation. High-stakes uses call for stricter measures.
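The table can be turned into a small lookup. The sketch below assumes that each "+" row adds its items to the internal-tool baseline and that a system may fall under several contexts at once. Both are interpretations of the table rather than something it states, and the context keys are hypothetical.

```python
# Baseline from the "Internal tool" row; every other row adds to it (assumed).
BASELINE = {"audit log", "model card"}

EXTRAS = {
    "internal": set(),
    "customer_facing": {"user disclosure"},
    "regulated": {"audit", "explainability", "appeal"},
    "frontier": {"capability disclosure", "safety eval"},
    "open_source": {"weights", "training summary"},
}


def required_controls(*contexts):
    """Return the sorted union of baseline controls and per-context extras."""
    controls = set(BASELINE)
    for context in contexts:
        controls |= EXTRAS[context]  # raises KeyError for an unknown context
    return sorted(controls)


internal = required_controls("internal")
regulated_app = required_controls("customer_facing", "regulated")
print(internal)
print(regulated_app)
```

Keeping the mapping in data rather than in branching logic makes it easy to review when the policy changes.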

🔗 Graph

Parents: AI-Ethics · AI-Governance · AI-Accountability
Variants: Explainable-AI-XAI · Model-Card · Datasheet-for-Datasets
Applications: GDPR-Article-22 · EU-AI-Act-Transparency · NYC-LL144
Tools: SHAP · LIME · Model-Card-Toolkit-Google
Adjacent: Open-Source-AI · Algorithmic-Fairness · Right-to-Explanation

🤖 LLM usage

When to use: designing transparency for a production AI system; building user trust.
When not to use: specific legal compliance questions (consult a lawyer); trade-secret areas.

❌ Anti-patterns

No disclosure: trust is lost.
Full disclosure that violates privacy: the two must be balanced.
Stale model card: update it with every release.
Hiding the use of AI: deception.
Fake explainability: post-hoc rationalization.

🧪 Verification / duplicates

Verified.
Confidence: B.
Related: AI-Accountability · Algorithmic-Fairness.

🕓 Changelog

Date	Change
2026-05-08	Phase 1
2026-05-09	Manual cleanup: 3 layers + spectrum + model card / datasheet code
