---
id: wiki-2026-0508-cognitive-biases
title: Cognitive Biases
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [인지 편향, cognitive biases, heuristics, Tversky-Kahneman, Thinking Fast and Slow, debiasing, nudge]
duplicate_of: none
source_trust_level: A
confidence_score: 0.93
verification_status: applied
tags: [psychology, cognitive-bias, kahneman, behavioral-economics, debiasing, nudge, decision-making, ml-bias]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: psychology / decision theory
  applicable_to: [Decision Systems, Product Design, ML Bias Mitigation]
---
# Cognitive Biases
## One-line summary
> **"The traps hidden in thinking's shortcuts."** Kahneman's System 1 (fast, heuristic) versus System 2 (slow, logical). These shortcuts were evolutionarily useful but misfire in modern contexts. They are also a source of bias in modern AI. In design they can be leveraged (nudges) or mitigated (debiasing).
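## Core concepts
### Major biases
#### Cognitive
- **Confirmation bias**: seeking and favoring only evidence that supports an existing belief.
- **Availability heuristic**: judging likelihood by what is recent or vivid in memory.
- **Anchoring**: over-relying on the first number presented.
- **Representativeness**: judging by resemblance to a stereotype.
- **Hindsight**: "I knew it all along" once the outcome is known.
- **Survivorship**: analyzing only the winners and ignoring the failures.
- **Sunk cost**: sticking with something because of what has already been invested.
#### Social
- **In-group bias**: preferring members of one's own group.
- **Authority bias**: over-trusting experts.
- **Bandwagon**: following the majority.
- **Halo effect**: letting one trait color the judgment of all the others.
#### Self
- **Dunning-Kruger**: the less competent tend to be over-confident.
- **Fundamental attribution**: explaining others' behavior by character and one's own by the situation.
- **Self-serving**: crediting success to oneself and blaming failure on the environment.
- **Optimism bias**: overly rosy expectations about the future.
#### Loss
- **Loss aversion**: a loss weighs more than an equal gain (roughly 2× weight).
- **Endowment effect**: over-valuing what one already owns.
- **Status quo bias**: sticking with the default.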
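To make the 2× loss weighting concrete, here is a minimal sketch. The names `subjective_value`, `gamble_value`, and `loss_weight` are illustrative, and the piecewise-linear shape is a simplifying assumption (prospect theory's value function is also curved).

```python
def subjective_value(outcome, loss_weight=2.0):
    """Piecewise-linear value: losses count `loss_weight` times as much as gains."""
    if outcome >= 0:
        return float(outcome)
    return loss_weight * outcome


def gamble_value(outcomes_with_probs, loss_weight=2.0):
    """Probability-weighted subjective value of a gamble given (outcome, prob) pairs."""
    return sum(p * subjective_value(x, loss_weight) for x, p in outcomes_with_probs)


# A fair coin flip: win 100 or lose 100. Expected money is 0,
# but the loss-averse valuation is negative, so the bet feels bad.
coin_flip = [(100, 0.5), (-100, 0.5)]
print(gamble_value(coin_flip))  # -50.0
```

Under this assumption the upside has to be about twice the downside before a 50/50 bet feels neutral.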
### Kahneman: System 1 vs System 2
| System 1 | System 2 |
|---|---|
| Fast | Slow |
| Automatic | Deliberate |
| Pattern | Logic |
| Cheap | Expensive |
| Bias-prone | Can correct bias |

→ System 2 does not solve everything. Both systems are needed.
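### History
- Tversky and Kahneman (1974), "Judgment under Uncertainty".
- Prospect Theory (1979); Kahneman received the Nobel Memorial Prize in Economic Sciences in 2002.
- Kahneman, "Thinking, Fast and Slow" (2011).
- Cialdini, "Influence" (1984).
- Thaler and Sunstein, "Nudge" (2008); Thaler received the Nobel Memorial Prize in Economic Sciences in 2017.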
### Applications to modern AI
#### Sources of bias in ML
- Training data reflects human biases.
- Models amplify biases that already exist.
- Representation skew.
#### Debiasing
- See [[Bias-Correction-Algorithm]].
- Fairness metrics.
- Counterfactual analysis.
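One common fairness metric is the demographic parity difference: the gap in positive-prediction rate between groups. The sketch below is only an illustration; the function name, the toy predictions, and the group labels are all assumptions, not data from this note.

```python
def demographic_parity_difference(predictions, groups):
    """Return (largest gap in positive rate between groups, per-group rates).

    `predictions` are 0/1 model outputs; `groups` gives each row's group label.
    """
    by_group = {}
    for pred, group in zip(predictions, groups):
        by_group.setdefault(group, []).append(pred)
    rates = {g: sum(p) / len(p) for g, p in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


# Toy data (made up): group "a" is approved three times as often as group "b".
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b']
gap, rates = demographic_parity_difference(preds, groups)
print(rates)  # {'a': 0.75, 'b': 0.25}
print(gap)    # 0.5
```

A gap near 0 means the groups receive positive predictions at similar rates. It says nothing about whether those predictions are accurate, so it is usually read alongside other metrics.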
#### LLM-specific biases
- **Sycophancy**: agreeing with whatever the user says.
- **Position bias**: preferring the first or last item.
- **Recency**: weighting the latest tokens more heavily.
- **Anchoring**: over-weighting the examples given in the prompt.
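Position bias can be probed by asking the same question under every ordering of the options and checking whether the pick moves. This is a sketch under assumptions: `choose` stands in for a real model call, and the two toy "models" below are hypothetical.

```python
from itertools import permutations


def position_bias_check(choose, options):
    """Ask under every ordering of `options`; report whether the pick changes."""
    picks = {choose(list(order)) for order in permutations(options)}
    return {'consistent': len(picks) == 1, 'picks': sorted(picks)}


# Hypothetical stand-ins for a model call.
def first_option_model(options):
    return options[0]  # always picks whatever is listed first


def content_model(options):
    return max(options, key=len)  # picks on content, ignoring order


opts = ['short', 'a much longer answer', 'medium one']
print(position_bias_check(first_option_model, opts)['consistent'])  # False
print(position_bias_check(content_model, opts)['consistent'])       # True
```

With many options the number of orderings grows factorially, so a real check would sample a handful of shuffles instead of enumerating them all.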
#### Mitigation through prompt engineering
- Chain-of-thought.
- Self-critique.
- Multiple perspectives.
- An explicit "consider the opposite" step.
### Nudge (Thaler-Sunstein)
- The power of defaults.
- Choice architecture.
- Controlling friction.
- Loss framing versus gain framing.
### Dark patterns (anti-nudge)
- Hidden costs.
- Confirm-shaming.
- Forced continuity.
- Misdirection.
- See [[Addiction Neuroscience]].
### Debiasing techniques
1. **Premortem** (Klein): imagine the failure before it happens.
2. **Red team / devil's advocate**.
3. **Anonymous voting**.
4. **Decision journal** (Thaler).
5. **Outside view** (base rate).
6. **Multi-perspective** (10 framework).
7. **Fermi estimation**.
8. **Evidence-based reasoning**.
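The outside view (item 5) can be sketched as pulling a gut estimate toward the base rate of a reference class. The blend and its default weight are assumptions made for illustration, not a prescribed formula, and the past cases are made up.

```python
def reference_class_base_rate(past_cases):
    """Share of comparable past cases that succeeded (a list of booleans)."""
    return sum(past_cases) / len(past_cases)


def outside_view_estimate(inside_estimate, past_cases, weight_on_base_rate=0.5):
    """Blend an inside-view probability with the reference-class base rate."""
    base = reference_class_base_rate(past_cases)
    return weight_on_base_rate * base + (1 - weight_on_base_rate) * inside_estimate


# Made-up example: 2 of 5 similar projects shipped on time,
# but the team's own gut estimate is 90%.
past = [True, False, False, False, True]
print(reference_class_base_rate(past))                # 0.4
print(round(outside_view_estimate(0.9, past), 2))     # 0.65
```

The point is the ordering: start from the base rate, then adjust for specifics, instead of starting from the inside story.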
## 💻 Patterns
### Decision journal (calibration tracking)
```python
import collections
from datetime import datetime


class DecisionJournal:
    """Log decisions with a stated confidence, then review them against outcomes."""

    def __init__(self):
        self.entries = []

    def log(self, decision, alternatives, expected_outcome, confidence, reasoning):
        self.entries.append({
            'date': datetime.now(),
            'decision': decision,
            'alternatives': alternatives,
            'expected_outcome': expected_outcome,
            'confidence': confidence,  # probability in [0, 1]
            'reasoning': reasoning,
            'actual_outcome': None,
            'review_date': None,
        })

    def review(self, idx, actual):
        e = self.entries[idx]
        e['actual_outcome'] = actual
        e['review_date'] = datetime.now()
        # Return what is needed for calibration tracking.
        return {
            'predicted': e['expected_outcome'],
            'actual': actual,
            'match': actual == e['expected_outcome'],
            'confidence_was': e['confidence'],
        }

    def calibration(self):
        """Compare stated confidence with the observed hit rate, per 0.1-wide bin."""
        bins = collections.defaultdict(list)
        for e in self.entries:
            if e['actual_outcome'] is None:
                continue  # not reviewed yet
            # Bucket by first decimal; a confidence of 1.0 goes into the top (0.9) bin.
            bucket = min(int(e['confidence'] * 10), 9) / 10
            bins[bucket].append(e['actual_outcome'] == e['expected_outcome'])
        return {b: sum(hits) / len(hits) for b, hits in sorted(bins.items())}
```
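A single-number complement to per-bin calibration is the Brier score: the mean squared gap between stated confidence and what actually happened. The sketch below assumes each reviewed journal entry has been reduced to a `(confidence, hit)` pair, where `hit` says whether the expected outcome occurred. The function name and the sample pairs are illustrative.

```python
def brier_score(forecasts):
    """Mean squared error between confidence and outcome; 0 is perfect, lower is better."""
    return sum((conf - float(hit)) ** 2 for conf, hit in forecasts) / len(forecasts)


# Made-up reviewed entries: (stated confidence, did the expected outcome happen?)
reviewed = [(0.9, True), (0.8, False), (0.6, True)]
print(round(brier_score(reviewed), 3))  # 0.27

# Baseline for comparison: always saying 50% scores 0.25.
print(brier_score([(0.5, True), (0.5, False)]))  # 0.25
```

Scoring worse than the 0.25 baseline, as in the sample above, suggests the stated confidences are not yet adding information.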
### Premortem
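```python
def premortem(plan):
    """Imagine it is one year later and the plan has failed; collect the reasons why."""
    # Prompts to brainstorm against; the team fills in the answers.
    prompts = [
        'biggest reason',
        'early warning signs',
        'binding constraint',
        'wrong assumption',
    ]
    return {
        'plan': plan,
        'imagine_state': 'plan failed catastrophically',
        'failure_modes': {prompt: [] for prompt in prompts},
        'mitigations': [],  # one mitigation plan per failure mode
    }
```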
### Anchoring counter
```python
def negotiate_without_anchor(their_offer, your_estimate, gap_threshold=0.3):
    """Avoid being anchored by the first number on the table.

    Returns the next move as a string. `their_offer` is None if they have not made one.
    """
    if their_offer is None:
        # Don't go first: ask for their offer instead.
        return 'ask for their offer'
    # Naming the anchor explicitly helps mitigate it.
    print(f'Their anchor: {their_offer}, my estimate: {your_estimate}')
    if abs(their_offer - your_estimate) > your_estimate * gap_threshold:
        # Wide gap: reset the discussion around your own estimate, with reasoning.
        return f'reset with data around {your_estimate}'
    return f'negotiate around {your_estimate}'
```
### LLM debiasing prompt
```python
def cot_with_devils_advocate(question):
    """Build a chain-of-thought prompt that forces a counter-argument pass."""
    return f"""Analyze this:
{question}
Step 1: Initial answer.
Step 2: List 3 strongest counter-arguments.
Step 3: Re-evaluate considering counter-arguments.
Step 4: Final answer with confidence (0-1).
Format: JSON only."""
```
### Sycophancy detection (LLM)
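```python
def sycophancy_check(model, prompt, assesses_x_correct):
    """Does the user's stated opinion sway the model's answer?

    `model` maps a prompt string to a response string. `assesses_x_correct`
    maps a response to True/False (whether the response judges X correct).
    """
    a = model(f"{prompt}\nWhat do you think?")
    b = model(f"I strongly believe X is correct. {prompt}\nWhat do you think?")
    c = model(f"I strongly believe X is wrong. {prompt}\nWhat do you think?")
    neutral = assesses_x_correct(a)
    # If the verdict flips with the user's stated opinion, flag it.
    if neutral != assesses_x_correct(b) or neutral != assesses_x_correct(c):
        return 'WARN: sycophantic'
    return 'OK'
```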
### Choice architecture (nudge)
```tsx
// The power of defaults: organ-donor consent is roughly 95% under opt-out
// versus roughly 15% under opt-in.
function NewsletterSignup() {
  return (
    <form>
      <label>
        <input type="checkbox" defaultChecked />
        newsletter (opt-out)
      </label>
    </form>
  );
}

// ❌ Dark pattern (avoid): confirm-shaming on the cancel button.
function CancelSubscription() {
  return (
    <button>
      Yes, cancel and lose all my benefits forever 😢
    </button>
  );
}
```
### Anti-confirmation (red team)
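```python
def red_team_review(decision):
    """Return red-team questions for `decision`, each paired with an empty answer slot.

    The questions are generic; `decision` is only the subject under review.
    """
    questions = [
        'What evidence would change your mind?',
        'What did you NOT consider?',
        'Who would disagree, and why?',
        'What is the strongest argument against?',
        'If you fail, what is the most likely cause?',
    ]
    return [(question, None) for question in questions]
```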
### Survivorship bias check
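```python
from collections import Counter


def trait_rates(items):
    """Fraction of items showing each trait; each item is an iterable of trait names."""
    counts = Counter(t for item in items for t in set(item))
    return {t: n / len(items) for t, n in counts.items()}


def survivorship_audit(success_set, full_set, ratio=1.5):
    """Compare trait rates among successes with base rates over everyone."""
    success_traits = trait_rates(success_set)
    base_rate_traits = trait_rates(full_set)  # includes the failures
    flagged = []
    for trait, success_rate in success_traits.items():
        base = base_rate_traits.get(trait, 0)
        # Flag traits at least `ratio` times more common among successes.
        if success_rate > base * ratio:
            flagged.append({
                'trait': trait,
                'success_rate': success_rate,
                'base_rate': base,
                'inflation': success_rate / base if base else float('inf'),
            })
    return flagged
```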
## 🤔 Decision criteria
| Situation | Countermeasure |
|---|---|
| Big decision | Decision journal + premortem |
| Negotiation | Don't go first + reset |
| LLM use | CoT + multiple perspective |
| Hiring | Structured interview + scorecard |
| Investing | Outside view + base rate |
| Group meeting | Anonymous voting |
| Strategy | Red team |
| Daily | Mindfulness + slow down |

**Default**: slow down explicitly, invoke System 2, and reason from evidence.
## 🔗 Graph
- Parent: [[Psychology]] · [[Decision Theory]] · [[Behavioral-Economics]]
- Variants: [[Confirmation Bias]] · [[Loss-Aversion]]
- Applications: [[Nudge]] · [[Debiasing]]
- Adjacent: [[Bounded_Rationality|Bounded-Rationality]] · [[Bias-Correction-Algorithm]] · [[Algorithmic Fairness]] · [[Beliefs]] · [[Addiction Neuroscience]] (dark patterns)
- Thinkers: [[Kahneman]]
## 🤖 LLM usage
**When to use**: decision design, product UX, negotiation prep, LLM bias mitigation, hiring.
**When not to use**: dark patterns (manipulation); specific medical or mental-health questions.
## ❌ Anti-patterns
- **Expecting to eliminate bias**: unrealistic, because bias is always present.
- **Relying on awareness alone**: knowing about a bias does little to actually reduce it.
- **Fighting every bias**: some heuristics are useful.
- **Leveraging dark patterns**: short-term gain, long-term loss.
- **No calibration**: stated confidence goes unchecked and drifts away from reality.
- **Trusting a sycophantic LLM**: false validation.
## 🧪 Verification / duplicates
- Verified (Tversky-Kahneman, Kahneman "Thinking", Cialdini "Influence", Thaler "Nudge").
- Trust level: A.
- Related: [[Bounded_Rationality|Bounded-Rationality]] · [[Beliefs]] · [[Bias-Correction-Algorithm]] · [[Algorithmic Fairness]] · [[Decision Theory]] · [[Addiction Neuroscience]].
## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: bias catalog, Kahneman, LLM-specific biases, and decision journal / premortem / CoT code |