2nd/10_Wiki/Topics/Computer_Science_and_Theory/Axiology.md
Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization
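Large-scale cleanup of 10_Wiki/Topics:
- Removed 227 error-capture / unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirect)
- Link-name normalization: broken-link fixes, direct links past redirects, and concept mapping, ~2,400 changes
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections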

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:52:15 +09:00


id: wiki-2026-0508-axiology
title: Axiology
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Value Theory, Theory of Value, Philosophy of Value
duplicate_of: none
source_trust_level: A
confidence_score: 0.86
verification_status: applied
tags: philosophy, ethics, value-theory, ai-alignment, decision-theory
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: Python, framework: RL/Reward-Modeling

Axiology

One-line summary

"The study of value: the question of what is good and what has worth." Axiology is the unifying framework for ethics and aesthetics, organized around distinctions such as intrinsic vs instrumental value and monism vs pluralism. Its core relevance to AI alignment as of 2026: reward modeling, Constitutional AI, and preference elicitation each carry axiological commitments.

Key points

Subdomains

  • Ethics: moral value (good / right).
  • Aesthetics: aesthetic value (beautiful / sublime).
  • Epistemic value: the value of truth and knowledge.

Distinctions

  • Intrinsic (good in itself, e.g., happiness for hedonist) vs instrumental (good for X).
  • Subjective (depends on attitude) vs objective (mind-independent).
  • Monism (one value, e.g., utility) vs pluralism (many incommensurable values).
  • Realist vs anti-realist.

Major Frames

  • Hedonism (Bentham, Mill): pleasure / absence of pain.
  • Eudaimonism (Aristotle): flourishing.
  • Perfectionism: excellence; closely related is the capability approach (Sen, Nussbaum).
  • Consequentialism: outcomes.
  • Deontology: duty (Kant).
  • Virtue ethics: character.
  • Pluralist value (Berlin): incommensurable goods.

AI Alignment Connection (2026)

  • Reward model = axiological model: implicit value commitment.
  • Constitutional AI (Anthropic): explicit principles → critique → revise.
  • Preference learning (RLHF, DPO, IPO): aggregate human preferences.
  • Pluralism challenge: whose values? → community / democratic AI.
  • Goodhart's law: when a measure becomes a target, it stops being a good measure (an instrumental proxy is not the intrinsic value).
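
The Goodhart bullet can be made concrete with a toy example. The sketch below assumes a hypothetical `true_value` that peaks at a moderate action and a hypothetical `proxy_metric` that keeps rewarding larger actions; both functions and the candidate list are invented for illustration only.

```python
def true_value(x: float) -> float:
    """Hypothetical intrinsic value: best at x = 3, worse when pushed further."""
    return -(x - 3.0) ** 2


def proxy_metric(x: float) -> float:
    """Hypothetical measurable proxy: rises with true_value only up to x = 3."""
    return x


candidates = [0.0, 1.0, 2.0, 3.0, 5.0, 10.0]

# Optimizing the proxy picks the largest action available.
best_by_proxy = max(candidates, key=proxy_metric)
# Optimizing the (normally unobservable) true value picks the moderate action.
best_by_value = max(candidates, key=true_value)

print("proxy optimum:", best_by_proxy, "true value there:", true_value(best_by_proxy))
print("true optimum: ", best_by_value, "true value there:", true_value(best_by_value))
```

The proxy and the true value agree on small actions, which is why the proxy looks trustworthy until it is optimized hard.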

Applications

  1. AI alignment / reward design.
  2. Cost-benefit analysis (policy).
  3. Aesthetic scoring (image gen).
  4. Healthcare QALY/DALY weighting.

💻 Patterns

Pattern 1 — Multi-objective reward (pluralism)

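# progress, comfort, safety, and energy are assumed per-trajectory scoring
# functions returning floats; they are not defined in this note.
def reward(traj):
    """Weighted sum of several value terms (pluralism reduced to one scalar)."""
    return (
        1.0 * progress(traj)    # instrumental
        + 0.5 * comfort(traj)   # intrinsic-ish
        + 2.0 * safety(traj)    # highest weight: a soft priority, not a hard constraint
        - 0.3 * energy(traj)    # cost
    )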
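
A weighted sum hides a value judgment in its coefficients. The sketch below assumes two hypothetical trajectories with invented component scores and shows that lowering the safety weight flips which one the scalar reward prefers. The names `FAST`, `CAREFUL`, and `weighted_reward` are illustrative and not part of the pattern above.

```python
from typing import Dict

# Hypothetical component scores for two trajectories (invented numbers).
FAST = {"progress": 1.0, "comfort": 0.2, "safety": 0.5, "energy": 0.8}
CAREFUL = {"progress": 0.6, "comfort": 0.6, "safety": 0.9, "energy": 0.4}


def weighted_reward(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Scalarize component scores; a cost term simply carries a negative weight."""
    return sum(weights[name] * scores[name] for name in weights)


# Same weights as the pattern above, then a variant with a much lower safety weight.
ORIGINAL = {"progress": 1.0, "comfort": 0.5, "safety": 2.0, "energy": -0.3}
LOW_SAFETY = {**ORIGINAL, "safety": 0.1}

for label, weights in [("original", ORIGINAL), ("low safety weight", LOW_SAFETY)]:
    fast = weighted_reward(FAST, weights)
    careful = weighted_reward(CAREFUL, weights)
    print(label, "prefers", "careful" if careful > fast else "fast")
```

Because the ranking depends on the weights, publishing them is what makes the value commitment inspectable.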

Pattern 2 — Constitutional critique (Anthropic-style)

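CONSTITUTION = [
    "Avoid harm.",
    "Be honest.",
    "Respect autonomy.",
    "Promote well-being equitably.",
]

# `llm` is an assumed client object exposing complete(prompt) -> str;
# it is not defined in this note.
def critique(response, principles=CONSTITUTION):
    # Join the principles so the prompt shows readable text, not a Python list repr.
    rules = "\n".join(f"- {p}" for p in principles)
    return llm.complete(f"Critique against:\n{rules}\nResponse: {response}")

def revise(response, critique_text):
    return llm.complete(f"Revise: {response}\nIn light of: {critique_text}")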
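
The critique and revise steps are usually chained into a loop. The sketch below is one possible arrangement, not Anthropic's implementation: it assumes a hypothetical `complete` callable standing in for any text-completion client, and `echo_model` is a stub so the example runs without a model.

```python
from typing import Callable, List

PRINCIPLES: List[str] = [
    "Avoid harm.",
    "Be honest.",
    "Respect autonomy.",
    "Promote well-being equitably.",
]


def critique_and_revise(
    response: str,
    complete: Callable[[str], str],
    principles: List[str] = PRINCIPLES,
    rounds: int = 2,
) -> str:
    """Run `rounds` passes of critique followed by revision and return the last draft."""
    rules = "\n".join(f"- {p}" for p in principles)
    for _ in range(rounds):
        critique_text = complete(
            f"Critique the response against these principles:\n{rules}\n"
            f"Response: {response}"
        )
        response = complete(
            f"Revise the response.\nResponse: {response}\n"
            f"In light of: {critique_text}"
        )
    return response


def echo_model(prompt: str) -> str:
    """Stand-in for a real model: reports only the prompt length."""
    return f"[model output for {len(prompt)}-char prompt]"


final = critique_and_revise("Draft answer.", echo_model)
print(final)
```

Each round costs two model calls, so the number of rounds is a budget decision as much as a quality one.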

Pattern 3 — Preference elicitation

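# Binary preference dataset used for DPO / IPO training.
# p, a, and b stand for a prompt, the preferred response, and the rejected response.
pairs = [{"prompt": p, "chosen": a, "rejected": b}, ...]
# Train the policy to raise the likelihood of "chosen" relative to "rejected",
# measured against a frozen reference model.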
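
For a single preference pair, the DPO objective is the negative log-sigmoid of a scaled difference of log-probability ratios between the policy and the reference model. The sketch below computes that loss from four summed log-probabilities; the function name, the argument names, and the example numbers are illustrative assumptions, and a real trainer would compute the log-probabilities from model outputs in batches.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one (chosen, rejected) pair given sequence log-probabilities."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp        # log pi/pi_ref for chosen
    rejected_ratio = policy_rejected_logp - ref_rejected_logp  # log pi/pi_ref for rejected
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)) written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: margin is 0 and the loss equals log(2).
print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
# Policy has shifted probability toward the chosen response: the loss is lower.
print(dpo_loss(-0.5, -2.0, -1.0, -1.0))
```

The `beta` parameter controls how strongly the policy is penalized for drifting from the reference model.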

Pattern 4 — Pareto frontier (incommensurable values)

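def is_pareto(point, all_points):
    # True if no other point is at least as good on every objective (all maximized)
    # and strictly better on at least one.
    return not any(all(a >= b for a, b in zip(o, point)) and any(a > b for a, b in zip(o, point))
                   for o in all_points)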
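
To get the whole frontier rather than test one point, the dominance check can be applied to every candidate. The sketch below assumes all objectives are maximized; the `(helpfulness, safety)` scores are invented, and `dominates` and `pareto_front` are illustrative helper names.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b: at least as good on every objective, strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep every point that no other point dominates (O(n^2) pairwise check)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (helpfulness, safety) scores for five candidate policies.
options = [(0.9, 0.2), (0.7, 0.7), (0.3, 0.9), (0.5, 0.5), (0.2, 0.1)]
front = pareto_front(options)
print(front)
```

The frontier does not pick a winner; choosing among the surviving points is still a value judgment that someone has to make explicitly.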

Decision criteria

| Situation | Approach |
|---|---|
| Single clear metric | Scalar reward (monism) |
| Multiple comparable values | Weighted sum (pluralism reduced to a scalar) |
| Incommensurable values | Pareto / lexicographic |
| Norm uncertainty | Constitutional + critique loop |
| Democratic setting | Preference aggregation + transparency |

Default: pluralism + transparent weights + constitutional guardrails.
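
The lexicographic approach listed in the decision criteria has no pattern above, so here is a minimal sketch. It assumes safety is expressed as a discrete tier that always takes priority, with progress used only to break ties; the `Option` type and the example values are hypothetical.

```python
from typing import List, NamedTuple


class Option(NamedTuple):
    name: str
    safety_tier: int   # higher is safer; compared first
    progress: float    # compared only among options in the same safety tier


def lexicographic_best(options: List[Option]) -> Option:
    """Pick the option with the best safety tier, breaking ties by progress."""
    # Python compares tuples element by element, which is a lexicographic order.
    return max(options, key=lambda o: (o.safety_tier, o.progress))


options = [
    Option("fast", safety_tier=1, progress=0.9),
    Option("balanced", safety_tier=2, progress=0.6),
    Option("cautious", safety_tier=2, progress=0.4),
]
print(lexicographic_best(options).name)
```

Unlike a weighted sum, no amount of extra progress can buy back a lower safety tier, which is the point of a lexicographic ordering.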

🔗 Graph

🤖 LLM usage

When to use: alignment policy drafting, principle articulation, value-laden decision review, ethical critique generation.
When not to use: pure technical optimization with no value tradeoff, single-stakeholder narrow domain.

Anti-patterns

  • Hidden monism: a single metric dressed up as pluralism; vulnerable to Goodhart's law.
  • False precision: spurious numeric weights placed on incommensurable values.
  • No stakeholder mapping: it is unclear whose values are being encoded.
  • Reward hacking: confusing an instrumental proxy with the intrinsic value.

🧪 Verification / Duplicates

  • Verified (Stanford Encyclopedia of Philosophy "Value Theory", Anthropic Constitutional AI paper).
  • Trust level: A.

🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full content (frames + AI alignment patterns) |