---
id: wiki-2026-0508-beliefs
title: Beliefs
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [신념, belief revision, Bayesian belief, knowledge, confirmation bias, doxastic logic]
duplicate_of: none
source_trust_level: B
confidence_score: 0.85
verification_status: applied
tags: [epistemology, beliefs, knowledge, bayesian, confirmation-bias, ai-belief, doxastic-logic]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: epistemology / cognitive science
  applicable_to: [Agent Beliefs, RAG Trust, Bias Mitigation]
---
# Beliefs
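## 📌 One-line insight
> **"The mind's provisional conclusions."** A belief is a degree of confidence that sits between objective evidence and subjective judgment, and it is what triggers action. In AI it appears as an agent's belief state, as trust scoring in RAG, and as the target of confirmation-bias detection.
## 📖 Core concepts
### Definitions (philosophical)
- **Belief**: mental acceptance of a proposition as true.
- **Knowledge**: Justified True Belief (JTB), the classical analysis traced to Plato.
- **Gettier problem**: cases in which a justified true belief still fails to count as knowledge (Gettier 1963).
- → Knowledge therefore needs conditions stricter than JTB (no luck / safety / sensitivity).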
### Types of belief
1. **Occurrent**: an active, conscious thought.
2. **Dispositional**: stored and ready to be retrieved.
3. **De dicto vs de re**: about what is said (the proposition) vs about the thing itself.
4. **Implicit / explicit**: whether the belief can be articulated.
### Belief revision (AGM)
- **Expansion**: add a new belief (no conflict to resolve).
- **Contraction**: remove a belief.
- **Revision**: add a new belief and remove whatever conflicts with it.
- **Postulates**: closure, success, consistency, ...
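
The three operations above can be illustrated with a toy belief base. This is a minimal sketch, assuming beliefs are plain string literals with `~` marking negation and no logical closure, so it is far weaker than full AGM. Revision is built from the other two operations through the Levi identity (contract by the negation, then expand). All function names are hypothetical.

```python
def negate(literal: str) -> str:
    """Return the complement of a literal; a leading '~' marks negation."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def expand(beliefs: frozenset, literal: str) -> frozenset:
    """Expansion: add the literal without checking for conflicts."""
    return beliefs | {literal}


def contract(beliefs: frozenset, literal: str) -> frozenset:
    """Contraction: give up the literal so it is no longer believed."""
    return beliefs - {literal}


def revise(beliefs: frozenset, literal: str) -> frozenset:
    """Revision via the Levi identity: contract by the negation, then expand."""
    return expand(contract(beliefs, negate(literal)), literal)


beliefs = frozenset({"raining", "~windy"})
revised = revise(beliefs, "windy")  # "~windy" is dropped, "windy" is added
```

Because the base holds only literals, contraction here is a simple set difference. With compound formulas, contraction has to choose which supporting beliefs to give up, and that choice is what the AGM postulates constrain.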
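### Bayesian belief
- A belief is a probability (a degree of confidence).
- Beliefs are updated with Bayes' rule; Cox's theorem motivates probability as the calculus for degrees of belief.
- Coherent: degrees of belief obey the probability axioms.
- The standard approach in modern AI.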
### Cognitive biases (belief-related)
1. **Confirmation bias**: selectively favoring evidence that confirms an existing belief.
2. **Belief perseverance**: retaining a belief after seeing disconfirming evidence.
3. **Backfire effect**: disconfirming evidence strengthens the belief instead of weakening it.
4. **Sunk cost**: maintaining a belief because of prior commitment to it.
5. **Motivated reasoning**: believing what one wants to believe.
### Applications to AI and agents
#### Belief state (POMDP)
- The environment is partially observable.
- The belief is a probability distribution over states.
- The belief is updated after each action and observation.
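
Written out, the update takes the form below. This is a sketch under assumed notation, none of which comes from the note itself: $T(s' \mid s, a)$ is the transition model, $O(o \mid s')$ is the observation model, $b$ is the current belief, $a$ is the action taken, $o$ is the observation received, and $\eta$ is a normalizing constant.

$$
b'(s') = \eta \, O(o \mid s') \sum_{s} T(s' \mid s, a)\, b(s),
\qquad
\eta^{-1} = \sum_{s'} O(o \mid s') \sum_{s} T(s' \mid s, a)\, b(s)
$$

The inner sum predicts where the agent is likely to be after taking $a$; multiplying by $O(o \mid s')$ reweights that prediction by how well each state explains the observation.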
#### RAG trust score
- Each retrieved document carries a belief about how far it can be trusted.
- confidence = recency × authority × consistency.
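
The product formula above can be sketched as follows. The note does not say how each factor is measured, so this example assumes recency is an exponential decay with a half-life, and that authority and consistency arrive as precomputed scores in [0, 1]. The function names and the 180-day half-life are illustrative assumptions.

```python
def recency_score(age_days: float, half_life_days: float = 180.0) -> float:
    """Assumed recency model: the score halves every `half_life_days`."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return 0.5 ** (age_days / half_life_days)


def trust_score(age_days: float, authority: float, consistency: float,
                half_life_days: float = 180.0) -> float:
    """confidence = recency * authority * consistency, each factor in [0, 1]."""
    for name, value in (("authority", authority), ("consistency", consistency)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    return recency_score(age_days, half_life_days) * authority * consistency


fresh = trust_score(age_days=0, authority=1.0, consistency=1.0)
stale = trust_score(age_days=180, authority=0.8, consistency=0.5)
```

A product means any single weak factor pulls the whole score down, which suits a conservative trust policy. A weighted sum would let a strong factor compensate for a weak one.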
#### Multi-agent BDI (Belief-Desire-Intention)
- Belief: the world state as the agent sees it.
- Desire: a goal.
- Intention: a committed plan.
- Examples: PRS, JADE.
#### LLM beliefs
- Training instills beliefs in the model.
- RLHF is used for alignment.
- Calibration: the stated P(true) should match the actual frequency of being correct.
### Epistemic logic
- K_a φ: agent a knows φ.
- B_a φ: agent a believes φ.
- Multi-agent setting: common knowledge.
- Aumann's agreement theorem: rational agents with a common prior whose posteriors are common knowledge cannot agree to disagree.
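
One common way to give K_a φ a concrete meaning is a Kripke model: agent a knows φ at a world if φ holds in every world a cannot distinguish from it. The sketch below assumes atomic propositions only and an accessibility relation given as a partition per agent. The worlds, agents, and the `knows` helper are made up for illustration.

```python
def knows(agent, prop, world, accessible, valuation):
    """K_a prop holds at `world` if prop is true in every world the agent considers possible."""
    return all(prop in valuation[w] for w in accessible[agent][world])


# Which atomic propositions are true in each world (illustrative data)
valuation = {"w1": {"p"}, "w2": {"p", "q"}, "w3": {"q"}}

# For each agent, the set of worlds it cannot tell apart from a given world
accessible = {
    "alice": {"w1": {"w1", "w2"}, "w2": {"w1", "w2"}, "w3": {"w3"}},
    "bob": {"w1": {"w1"}, "w2": {"w2", "w3"}, "w3": {"w2", "w3"}},
}

alice_knows_p = knows("alice", "p", "w1", accessible, valuation)  # p holds in w1 and w2
alice_knows_q = knows("alice", "q", "w1", accessible, valuation)  # q fails in w1
bob_knows_q = knows("bob", "q", "w2", accessible, valuation)      # q holds in w2 and w3
```

Here Alice knows p at w1 but not q, because she cannot rule out w1 itself, where q is false.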
## 💻 Patterns (applied)
### Bayesian belief update
```python
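def update_belief(prior, likelihood_true, likelihood_false, evidence=True):
    # likelihood_true = P(E | H), likelihood_false = P(E | not H)
    if not evidence:
        # E was not observed, so use the complementary likelihoods
        likelihood_true, likelihood_false = 1 - likelihood_true, 1 - likelihood_false
    # Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)
    posterior_unnorm = likelihood_true * prior
    evidence_prob = posterior_unnorm + likelihood_false * (1 - prior)
    return posterior_unnorm / evidence_prob

belief = 0.3  # prior
belief = update_belief(belief, 0.9, 0.2, evidence=True)  # about 0.66
belief = update_belief(belief, 0.9, 0.2, evidence=True)  # about 0.90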
```
### POMDP belief state
```python
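import numpy as np

class POMDPBelief:
    def __init__(self, n_states, prior):
        # prior: array of length n_states that sums to 1
        self.belief = np.asarray(prior, dtype=float)

    def update(self, action, observation, T, O):
        # T[s, s_next, action]: transition probabilities
        # O[s_next, observation]: observation probabilities
        n = len(self.belief)
        new_belief = np.zeros_like(self.belief)
        for s_next in range(n):
            # Predict the probability of s_next, then weight by the observation
            predicted = sum(T[s, s_next, action] * self.belief[s] for s in range(n))
            new_belief[s_next] = O[s_next, observation] * predicted
        total = new_belief.sum()
        if total == 0:
            raise ValueError("observation has zero probability under the current belief")
        self.belief = new_belief / total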
```
### BDI agent
```python
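class BDIAgent:
    def __init__(self, planner):
        self.beliefs = {}       # facts about the world
        self.desires = []       # goals, each with a .priority attribute
        self.intentions = []    # active plan (list of actions)
        self.planner = planner  # assumed to expose plan(beliefs, goal) -> list

    def perceive(self, observations):
        for obs in observations:
            self.beliefs[obs.key] = obs.value

    def is_feasible(self, desire):
        # Placeholder: replace with a real check against self.beliefs
        return True

    def deliberate(self):
        # Select the highest-priority desire that is feasible under current beliefs
        feasible = [d for d in self.desires if self.is_feasible(d)]
        return max(feasible, key=lambda d: d.priority, default=None)

    def plan(self, goal):
        # Build a plan from the current beliefs
        return self.planner.plan(self.beliefs, goal)

    def execute(self):
        if not self.intentions:
            goal = self.deliberate()
            if goal is None:
                return None  # nothing feasible to pursue
            self.intentions = list(self.plan(goal))
        # Pop the next action; an empty plan yields None
        return self.intentions.pop(0) if self.intentions else None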
```
### LLM calibration
```python
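import numpy as np

def calibration_check(model, eval_set, parse, n_bins=10):
    # Compare declared confidence against actual accuracy.
    # model.generate and parse are placeholders:
    # parse(response) must return (answer, confidence in [0, 1]).
    bins = [[] for _ in range(n_bins)]  # each entry: (confidence, correct)
    for example in eval_set:
        response = model.generate(example.prompt + ' Reply with answer and confidence (0-1).')
        ans, conf = parse(response)
        actual = (ans == example.expected)
        # Clamp the index so that conf == 1.0 lands in the last bin
        idx = min(max(int(conf * n_bins), 0), n_bins - 1)
        bins[idx].append((conf, actual))
    # ECE (Expected Calibration Error): per-bin |accuracy - mean confidence|,
    # weighted by the share of examples in the bin
    n = sum(len(b) for b in bins)
    ece = sum(abs(np.mean([c for _, c in b]) - np.mean([p for p, _ in b])) * len(b) / n
              for b in bins if b)
    return ece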
```
→ A well-calibrated model has a low ECE.
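
The decision table in this note pairs ECE with temperature scaling. As a minimal sketch of that technique: divide the logits by a single scalar T before the softmax, and pick the T that minimizes negative log-likelihood on held-out data. The example assumes access to raw logits, uses a plain grid search rather than a gradient-based optimizer, and runs on made-up toy data. All names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over one row of logits after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)


def fit_temperature(logit_rows, labels, grid=None):
    """Grid-search the temperature that minimizes held-out NLL."""
    if grid is None:
        grid = [0.5 + 0.1 * k for k in range(96)]  # 0.5 to 10.0 in steps of 0.1
    return min(grid, key=lambda t: nll(logit_rows, labels, t))


# Toy held-out set: very confident logits, but only 3 of 4 predictions are right
logit_rows = [[8.0, 0.0], [8.0, 0.0], [0.0, 8.0], [8.0, 0.0]]
labels = [0, 0, 1, 1]
best_t = fit_temperature(logit_rows, labels)
```

A fitted T above 1 softens the probabilities and signals that the raw model was over-confident. Dividing by a positive scalar does not change which class has the highest logit, so accuracy is unaffected.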
### Confirmation bias detector
```python
def detect_confirmation_bias(query, results, user_belief):
    # Does the user click only the sources that align with their belief?
    # (query is currently unused)
    aligning = [r for r in results if r.aligns_with(user_belief)]
    clicked_aligning = sum(1 for r in aligning if r.clicked)
    clicked_total = sum(1 for r in results if r.clicked)
    if clicked_total == 0:
        return None  # no clicks, so no signal
    bias_ratio = clicked_aligning / clicked_total
    return bias_ratio  # > 0.7 is treated as strong confirmation bias
```
## 🤔 Decision criteria
| Application | Approach |
|---|---|
| Agent world model | POMDP belief |
| RAG trust | Source authority + consistency |
| Multi-agent | BDI |
| LLM calibration | ECE + temperature scaling |
| User UX | Diverse perspective + bias detect |
| Knowledge graph | Justified belief (provenance) |
**Default**: Bayesian belief + ECE calibration + diverse evidence.
## 🔗 Graph
- Parent: [[Epistemology]]
- Variants: [[Knowledge]] · [[Bayesian-Belief]] · [[Doxastic-Logic]]
- Applications: [[POMDP]]
- Critique: [[Confirmation Bias]]
- Adjacent: [[Bayesian-Brain-Hypothesis]] · [[Multi-agent-System|Multi-Agent-Systems]]
## 🤖 LLM usage
**When to use**: agent design (belief state), RAG trust scoring, LLM calibration evaluation, bias detection.
**When not to use**: as a substitute for metaphysical claims; in deterministic systems built around a single fixed belief.
## ❌ Anti-patterns
- **Binary beliefs**: confidence information is lost.
- **No updates**: beliefs go stale.
- **Ignoring confirmation bias**: leads to an echo chamber.
- **Ignoring calibration**: leads to an over-confident model.
- **Assuming agents share the same beliefs**: multi-agent systems fail.
- **Hard-coded beliefs**: they cannot be updated.
## 🧪 Verification / duplicates
- Verified (Plato JTB, Gettier, AGM postulates, Bayesian).
- Trust level: B.
- Related: [[Bayesian Statistics]] · [[Bayesian-Brain-Hypothesis]] · [[Confirmation Bias]] · [[POMDP]].
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: JTB + AGM + Bayesian + POMDP / BDI + calibration code |