---
id: wiki-2026-0508-axiology
title: Axiology
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Value Theory, Theory of Value, Philosophy of Value]
duplicate_of: none
source_trust_level: A
confidence_score: 0.86
verification_status: applied
tags: [philosophy, ethics, value-theory, ai-alignment, decision-theory]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: RL/Reward-Modeling
---

# Axiology

## One-Line Summary
> **"The study of value: the question of what is good and what is worth pursuing."** Axiology is the unifying framework behind ethics and aesthetics, organized around distinctions such as intrinsic vs instrumental value and monism vs pluralism. Its core relevance to AI alignment as of 2026: reward modeling, Constitutional AI, and preference elicitation each carry axiological commitments.

## Key Points

### Subdomains
- **Ethics**: moral value (good / right).
- **Aesthetics**: aesthetic value (beautiful / sublime).
- **Epistemic value**: the value of truth and knowledge.

### Distinctions
- **Intrinsic** (good in itself, e.g., happiness for hedonist) vs **instrumental** (good for X).
- **Subjective** (depends on attitude) vs **objective** (mind-independent).
- **Monism** (one value, e.g., utility) vs **pluralism** (many incommensurable values).
- **Realist** vs **anti-realist**.

### Major Frames
- **Hedonism** (Bentham, Mill): pleasure / absence of pain.
- **Eudaimonism** (Aristotle): flourishing.
- **Perfectionism**: excellence, capability (Sen, Nussbaum).
- **Consequentialism**: outcomes.
- **Deontology**: duty (Kant).
- **Virtue ethics**: character.
- **Pluralist value (Berlin)**: incommensurable goods.

### AI Alignment Connection (2026)
- **Reward model = axiological model**: implicit value commitment.
- **Constitutional AI** (Anthropic): explicit principles → critique → revise.
- **Preference learning (RLHF, DPO, IPO)**: aggregate human preferences.
- **Pluralism challenge**: whose values? → community / democratic AI.
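- **Goodhart's law**: when a measure becomes a target, it stops being a good measure. A proxy metric has instrumental value only; optimizing it directly corrupts its link to the intrinsic value it was meant to track.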
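
The Goodhart's-law point can be made concrete with a toy example. This is a minimal sketch with made-up numbers: `true_value` and `proxy` are hypothetical functions invented for illustration, not taken from any real reward model. The proxy (answer length) tracks the intrinsic target (answer quality) only up to a point, so maximizing the proxy overshoots.

```python
# Toy Goodhart's-law illustration (all numbers are made up).

def true_value(length: int) -> float:
    """Hypothetical intrinsic value: quality peaks at length 5, then declines."""
    return 25.0 - (length - 5) ** 2


def proxy(length: int) -> float:
    """Hypothetical measured proxy: longer always scores higher."""
    return float(length)


candidates = range(0, 11)
best_by_proxy = max(candidates, key=proxy)        # what a proxy optimizer picks
best_by_value = max(candidates, key=true_value)   # what we actually wanted

print("proxy optimum:", best_by_proxy, "true value:", true_value(best_by_proxy))
print("true optimum: ", best_by_value, "true value:", true_value(best_by_value))
```

Under these assumptions the proxy optimizer picks the longest answer, whose true value is far below that of the answer the intrinsic target would select.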

### Applications
1. AI alignment / reward design.
2. Cost-benefit analysis (policy).
3. Aesthetic scoring (image gen).
4. Healthcare QALY/DALY weighting.
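
For the healthcare item, a QALY total is conventionally computed as time lived multiplied by a health-utility weight, with 0 anchored at death and 1 at full health. The sketch below assumes that convention; the `qalys` helper and all durations and weights are hypothetical. The utility weights are where the axiological commitment sits: they encode how much a year in each health state is judged to be worth.

```python
def qalys(periods):
    """Sum of years * utility over a sequence of health states.

    `periods` is a list of (years, utility) pairs; utilities are assumed
    to follow the 0 = dead, 1 = full health convention.
    """
    return sum(years * utility for years, utility in periods)


# Hypothetical comparison of two treatments.
treatment_a = [(2.0, 0.9), (3.0, 0.6)]  # 2 good years, then 3 moderate years
treatment_b = [(4.0, 0.7)]              # 4 years at a steady moderate level

print(round(qalys(treatment_a), 2))
print(round(qalys(treatment_b), 2))
```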

## 💻 Patterns

### Pattern 1 — Multi-objective reward (pluralism)

```python
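# Weights are illustrative; traj is assumed to be a dict of per-feature scores.
WEIGHTS = {
    "progress": 1.0,   # instrumental
    "comfort": 0.5,    # intrinsic-ish
    "safety": 2.0,     # highest weight; still a soft trade-off, not a hard constraint
    "energy": -0.3,    # cost
}

def reward(traj):
    """Scalarize several values into one number via a weighted sum."""
    return sum(weight * traj[name] for name, weight in WEIGHTS.items())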
```

### Pattern 2 — Constitutional critique (Anthropic-style)

```python
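CONSTITUTION = [
    "Avoid harm.",
    "Be honest.",
    "Respect autonomy.",
    "Promote well-being equitably.",
]

# `llm` is a placeholder for any client object exposing complete(prompt) -> str.
def critique(llm, response, principles=CONSTITUTION):
    """Ask the model to critique a response against each principle."""
    rules = "\n".join(f"- {p}" for p in principles)
    return llm.complete(f"Critique against:\n{rules}\nResponse: {response}")

def revise(llm, response, critique_text):
    """Ask the model to rewrite the response in light of the critique."""
    return llm.complete(f"Revise: {response}\nIn light of: {critique_text}")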
```
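
The critique and revise steps are usually chained into a loop. The following is a minimal, self-contained sketch of that loop, not Anthropic's implementation: `StubLLM` is a hypothetical stand-in that only labels its replies so the control flow can be run without a model, and `critique_and_revise` is an illustrative name.

```python
class StubLLM:
    """Hypothetical stand-in for a model client; it only labels each reply."""

    def __init__(self):
        self.calls = []

    def complete(self, prompt: str) -> str:
        self.calls.append(prompt)
        kind = "critique" if prompt.startswith("Critique") else "revision"
        return f"[{kind} #{len(self.calls)}]"


def critique_and_revise(llm, response, principles, rounds=2):
    """Run `rounds` critique -> revise passes and return every draft."""
    rules = "\n".join(f"- {p}" for p in principles)
    drafts = [response]
    for _ in range(rounds):
        critique_text = llm.complete(
            f"Critique against:\n{rules}\nResponse: {drafts[-1]}"
        )
        revised = llm.complete(
            f"Revise: {drafts[-1]}\nIn light of: {critique_text}"
        )
        drafts.append(revised)
    return drafts


llm = StubLLM()
drafts = critique_and_revise(llm, "It will definitely work.", ["Be honest."])
print(drafts)
```

Keeping every draft, rather than only the last one, makes it possible to audit what each critique actually changed.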

### Pattern 3 — Preference elicitation

```python
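# Binary preference dataset → DPO / IPO (the record below is illustrative).
pairs = [
    {"prompt": "Is this plan safe?", "chosen": "answer A", "rejected": "answer B"},
    # ... more pairs
]
# Train the policy to raise the chosen/rejected likelihood ratio relative to a reference policy.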
```
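
To show what "maximize the likelihood ratio" amounts to, here is a minimal sketch of the DPO loss for a single preference pair. It assumes the four inputs are summed sequence log-probabilities that were already computed elsewhere; the function name, the example numbers, and `beta=0.1` are illustrative choices, not values from the source.

```python
import math


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one pair: -log(sigmoid(beta * margin))."""
    # Margin: how much more the policy prefers `chosen` over `rejected`
    # than the frozen reference policy does.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    x = beta * margin
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Policy identical to the reference: no preference signal, loss = log(2).
print(dpo_loss(-5.0, -5.0, -5.0, -5.0))
# Policy favors the chosen answer more than the reference does: lower loss.
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))
```

The loss depends only on the difference between policy and reference log-ratios, which is why no explicit reward model appears: the axiological commitment lives entirely in which answer each pair labels as chosen.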

### Pattern 4 — Pareto frontier (incommensurable values)

```python
def is_pareto(point, all_points):
    """True if no other point dominates `point` (tuples; higher is better on every axis)."""
    return not any(
        o != point and all(o[i] >= point[i] for i in range(len(point)))
        for o in all_points
    )
```
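
A short usage sketch for the Pareto idea: filtering a set of candidates down to its non-dominated frontier. The helper names `dominates` and `pareto_front` and the (helpfulness, safety) scores are hypothetical, and higher is assumed to be better on every axis.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates (higher is better)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (helpfulness, safety) scores for four candidate policies.
policies = [(0.9, 0.2), (0.6, 0.6), (0.3, 0.9), (0.5, 0.5)]
front = pareto_front(policies)
print(front)
```

The frontier does not pick a winner. It only removes options that are worse on every value, which is the most that can be said without committing to an exchange rate between the values.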

## Decision Criteria

| Situation | Approach |
|---|---|
| Single clear metric | Scalar reward (monism) |
| Multiple commensurable values | Weighted sum (pluralism reduced to a scalar) |
| Incommensurable values | Pareto / lexicographic |
| Norm uncertainty | Constitutional + critique loop |
| Democratic legitimacy needed | Preference aggregation + transparency |

**Default**: pluralism + transparent weights + constitutional guardrails.
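
The lexicographic approach in the table can be sketched with Python's tuple ordering, which compares element by element. This is a minimal illustration under assumed inputs: the option names, scores, and the `lexicographic_best` helper are all hypothetical.

```python
# Hypothetical options scored on two values.
options = {
    "fast_route": {"safety": 0.7, "progress": 0.9},
    "safe_route": {"safety": 0.9, "progress": 0.4},
    "safe_shortcut": {"safety": 0.9, "progress": 0.6},
}


def lexicographic_best(options, priority):
    """Pick the option that wins on the first value in `priority`;
    later values are used only to break exact ties."""
    return max(options, key=lambda name: tuple(options[name][v] for v in priority))


print(lexicographic_best(options, ["safety", "progress"]))
print(lexicographic_best(options, ["progress", "safety"]))
```

Unlike a weighted sum, no amount of progress can compensate for lower safety under the first ordering. With continuous scores exact ties are rare, so in practice the higher-priority value may need to be bucketed into tiers before comparison.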

## 🔗 Graph
- Parent: [[Philosophy]]
- Applications: [[AI_Safety_and_Alignment|AI-Alignment]]
- Adjacent: [[Aesthetic-Value]] · [[Decision-Theory]] · [[AI_Safety_and_Alignment|Constitutional-AI]]

## 🤖 LLM Usage
**When to use**: alignment policy drafting, principle articulation, value-laden decision review, ethical critique generation.
**When not to use**: pure technical optimization with no value tradeoff, or a narrow single-stakeholder domain.

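## ❌ Anti-Patterns
- **Hidden monism**: a single metric dressed up as several values; vulnerable to Goodhart's law.
- **False precision**: spurious numeric weights assigned to incommensurable values.
- **No stakeholder mapping**: it is unclear whose values are being encoded.
- **Reward hacking**: an instrumental proxy is mistaken for the intrinsic value, and the policy optimizes the proxy at the value's expense.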
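
The false-precision risk can be checked with a quick sensitivity test: if a small change in weights flips the decision, the weights are carrying more authority than they can justify. The sketch below is illustrative only; the options, scores, weights, and the `best_option` helper are hypothetical.

```python
def best_option(options, weights):
    """Return the option name with the highest weighted-sum score."""
    def score(name):
        return sum(weights[k] * v for k, v in options[name].items())
    return max(options, key=score)


# Hypothetical options scored on two values.
options = {
    "A": {"welfare": 0.80, "fairness": 0.40},
    "B": {"welfare": 0.50, "fairness": 0.75},
}

# Two weightings that differ by only 0.05 per value.
print(best_option(options, {"welfare": 0.55, "fairness": 0.45}))
print(best_option(options, {"welfare": 0.50, "fairness": 0.50}))
```

When the winner changes under perturbations this small, reporting the Pareto frontier or the range of weights under which each option wins is more honest than reporting a single score.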

## 🧪 Verification / Duplicates
- Verified against the Stanford Encyclopedia of Philosophy entry "Value Theory" and the Anthropic Constitutional AI paper.
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — FULL content (frames + AI alignment patterns) |