2nd/10_Wiki/Topics/AI_and_ML/Cognitive Biases.md

---
id: wiki-2026-0508-cognitive-biases
title: Cognitive Biases
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [인지 편향, cognitive biases, heuristics, Tversky-Kahneman, Thinking Fast and Slow, debiasing, nudge]
duplicate_of: none
source_trust_level: A
confidence_score: 0.93
verification_status: applied
tags: [psychology, cognitive-bias, kahneman, behavioral-economics, debiasing, nudge, decision-making, ml-bias]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: psychology / decision theory
  applicable_to: [Decision Systems, Product Design, ML Bias Mitigation]
---

# Cognitive Biases

## One-line summary
> **"The traps hidden in thinking's shortcuts."** Kahneman's System 1 (fast, heuristic) versus System 2 (slow, logical). These shortcuts were useful in evolutionary terms but misfire in modern contexts. They are also a source of bias in modern AI. In design they can be leveraged (nudges) or mitigated (debiasing).

## Core concepts

### Major biases

#### Cognitive
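- **Confirmation bias**: seeking and weighting only the evidence that supports an existing belief.
- **Availability heuristic**: judging likelihood by how recent or vivid the examples are.
- **Anchoring**: over-relying on the first number encountered.
- **Representativeness**: judging by resemblance to a stereotype.
- **Hindsight bias**: "I knew it all along."
- **Survivorship bias**: analyzing only the winners.
- **Sunk cost fallacy**: continuing because of what has already been invested.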

#### Social
- **In-group bias**: preferring members of one's own group.
- **Authority bias**: over-trusting experts.
- **Bandwagon effect**: following the majority.
- **Halo effect**: letting one trait color the judgment of all others.

#### Self
- **Dunning-Kruger effect**: low performers overestimate their own competence.
- **Fundamental attribution error**: others' behavior is attributed to character, one's own to the situation.
- **Self-serving bias**: success is credited to oneself, failure to the environment.
- **Optimism bias**: an overly rosy view of the future.

#### Loss
- **Loss aversion**: a loss weighs more than an equal gain (roughly 2×).
- **Endowment effect**: over-valuing what one already owns.
- **Status quo bias**: sticking with the default.
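
The "roughly 2×" weighting can be illustrated with a prospect-theory value function. This is a minimal sketch, assuming the median parameter estimates reported by Tversky and Kahneman (1992): curvature 0.88 and loss-aversion coefficient 2.25. The function name and the reference point of 0 are illustrative choices, not part of the original note.

```python
def prospect_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Subjective value of an outcome x measured from a reference point of 0.

    alpha/beta: curvature for gains/losses; lam: loss-aversion coefficient.
    Defaults are assumed from Tversky & Kahneman (1992), not fitted here.
    """
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** beta)


gain = prospect_value(100)
loss = prospect_value(-100)
# With alpha == beta, an equal-sized loss weighs lam times the gain.
print(f"gain: {gain:.1f}, loss: {loss:.1f}, ratio: {abs(loss) / gain:.2f}")
```

With equal curvature for gains and losses, the loss-to-gain ratio equals `lam` for any stake size.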

### Kahneman: System 1 vs System 2
| System 1 | System 2 |
|---|---|
| Fast | Slow |
| Automatic | Deliberate |
| Pattern | Logic |
| Cheap | Expensive |
| Bias-prone | Can correct bias |

→ Neither system solves everything; both are needed.

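### History
- Tversky and Kahneman (1974), "Judgment under Uncertainty: Heuristics and Biases".
- Prospect theory (Kahneman and Tversky, 1979); Kahneman received the 2002 Nobel Memorial Prize in Economic Sciences.
- Kahneman, "Thinking, Fast and Slow" (2011).
- Cialdini, "Influence" (1984).
- Thaler and Sunstein, "Nudge" (2008); Thaler received the 2017 Nobel Memorial Prize in Economic Sciences.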

### Applications to modern AI

#### Sources of bias in ML
- Training data reflects human biases.
- Models can amplify biases that already exist.
- Skewed representation of groups in the data.

#### Debiasing
- See [[Bias-Correction-Algorithm]].
- Fairness metrics.
- Counterfactual analysis.
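
As one concrete fairness metric, the sketch below computes the demographic parity difference: the gap between the highest and lowest positive-prediction rate across groups. It is an illustration only; the function names and the toy data are assumptions, and other fairness metrics may fit a given system better.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Share of positive predictions (value 1) within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(predictions, groups):
    """Largest gap in selection rate between any two groups (0 means parity)."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Toy data (hypothetical): group "a" is selected far more often than group "b".
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(selection_rates(preds, groups))
print(demographic_parity_difference(preds, groups))
```

A value near 0 indicates similar selection rates; what gap is acceptable is a policy decision, not something the metric decides.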

#### LLM-specific bias
- **Sycophancy**: agreeing with the user regardless of correctness.
- **Position bias**: favoring items placed first or last.
- **Recency**: giving more weight to the most recent tokens.
- **Anchoring**: over-weighting the examples given in the prompt.
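
Position bias can be probed by asking the same pairwise comparison twice with the options swapped. The sketch below assumes a hypothetical `judge` callable that takes a prompt string and returns `"1"` or `"2"`; the prompt wording and the stub judges are illustrative, not a real model API.

```python
def position_bias_check(judge, question, option_a, option_b):
    """Run one comparison in both orders and report whether the pick is stable."""
    def build(first, second):
        return (f"{question}\nOption 1: {first}\nOption 2: {second}\n"
                "Answer with 1 or 2.")

    first_pass = judge(build(option_a, option_b)).strip()
    second_pass = judge(build(option_b, option_a)).strip()

    # Map each positional answer back to the underlying option.
    picked_first = option_a if first_pass == "1" else option_b
    picked_second = option_b if second_pass == "1" else option_a
    return {
        "consistent": picked_first == picked_second,
        "picks": (picked_first, picked_second),
    }


# Stub judges (hypothetical): one always answers "1", one judges by content.
always_first = lambda prompt: "1"
content_based = lambda prompt: "1" if "Option 1: summary A" in prompt else "2"

print(position_bias_check(always_first, "Which summary is better?",
                          "summary A", "summary B"))
print(position_bias_check(content_based, "Which summary is better?",
                          "summary A", "summary B"))
```

An inconsistent result on a single pair is only a hint; repeating the check over many pairs gives a rate that can be tracked.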

#### Mitigation through prompt engineering
- Chain-of-thought.
- Self-critique.
- Multiple perspectives.
- An explicit "consider the opposite" instruction.

### Nudge (Thaler-Sunstein)
- The power of defaults.
- Choice architecture.
- Control of friction.
- Loss framing versus gain framing.

### Dark patterns (anti-nudge)
- Hidden costs.
- Confirm-shaming.
- Forced continuity.
- Misdirection.
- See [[Addiction-Neuroscience]].

### Debiasing techniques
1. **Premortem** (Klein): imagine that the plan has already failed.
2. **Red team / devil's advocate**.
3. **Anonymous voting**.
4. **Decision journal** (Thaler).
5. **Outside view** (base rate).
6. **Multi-perspective review** (10 frameworks).
7. **Fermi estimation**.
8. **Evidence-based reasoning**.
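
The outside view (item 5) can be sketched as blending an inside-view estimate with the base rate of a reference class. The linear weighting and the default `inside_weight=0.3` below are illustrative assumptions, not an established rule; the point is only that the base rate enters the estimate explicitly.

```python
def outside_view(reference_outcomes, inside_estimate, inside_weight=0.3):
    """Blend an inside-view probability with a reference-class base rate.

    reference_outcomes: booleans for comparable past cases (True = success).
    inside_weight: trust placed in the case-specific estimate (assumed value).
    """
    if not reference_outcomes:
        raise ValueError("need at least one reference case")
    if not 0.0 <= inside_weight <= 1.0:
        raise ValueError("inside_weight must be in [0, 1]")

    base_rate = sum(reference_outcomes) / len(reference_outcomes)
    blended = inside_weight * inside_estimate + (1 - inside_weight) * base_rate
    return {"base_rate": base_rate, "blended": blended}


# Hypothetical reference class: 2 of 5 similar projects shipped on time,
# while the team's own (inside-view) estimate is 90%.
past_projects = [True, False, False, False, True]
result = outside_view(past_projects, inside_estimate=0.9)
print(result)
```

Choosing the reference class is the hard part; the arithmetic only makes the chosen base rate visible next to the optimistic estimate.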

## 💻 Patterns

### Decision journal (calibration tracking)
```python
import collections
from datetime import datetime

import numpy as np


class DecisionJournal:
    def __init__(self):
        self.entries = []

    def log(self, decision, alternatives, expected_outcome, confidence, reasoning):
        self.entries.append({
            'date': datetime.now(),
            'decision': decision,
            'alternatives': alternatives,
            'expected_outcome': expected_outcome,
            'confidence': confidence,  # probability in [0, 1]
            'reasoning': reasoning,
            'actual_outcome': None,  # filled in by review()
            'review_date': None,
        })

    def review(self, idx, actual):
        e = self.entries[idx]
        e['actual_outcome'] = actual
        e['review_date'] = datetime.now()
        # Return what calibration tracking needs: prediction, outcome, confidence.
        return {
            'predicted': e['expected_outcome'],
            'actual': actual,
            'match': actual == e['expected_outcome'],
            'confidence_was': e['confidence'],
        }

    def calibration(self):
        """Compare predicted probability with actual hit frequency per bin."""
        bins = collections.defaultdict(list)
        for e in self.entries:
            if e['actual_outcome'] is None:
                continue  # not reviewed yet
            # Floor confidence to the nearest 0.1 (e.g. 0.87 -> 0.8).
            bucket = int(e['confidence'] * 10) / 10
            bins[bucket].append(e['actual_outcome'] == e['expected_outcome'])
        return {b: float(np.mean(outcomes)) for b, outcomes in bins.items()}
```
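
The journal above bins predictions by confidence. A single-number complement is the Brier score: the mean squared gap between stated confidence and what happened. The sketch below is standalone and takes plain `(confidence, was_correct)` pairs; the function name and sample history are illustrative assumptions.

```python
def brier_score(forecasts):
    """Mean squared error between confidence and outcome (0 is perfect).

    forecasts: iterable of (confidence in [0, 1], was_correct as bool).
    """
    pairs = list(forecasts)
    if not pairs:
        raise ValueError("need at least one reviewed forecast")
    return sum((conf - float(ok)) ** 2 for conf, ok in pairs) / len(pairs)


# Hypothetical reviewed decisions: (stated confidence, prediction held up?).
history = [(0.9, True), (0.8, False), (0.6, True)]
score = brier_score(history)
print(f"Brier score: {score:.2f}")
```

Always answering 0.5 scores 0.25, so a score above that suggests the stated confidences are doing worse than a coin-flip hedge.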

### Premortem
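```python
def premortem(plan, brainstorm=None):
    """Imagine that, one year from now, the plan has failed; then work backwards.

    `brainstorm` is an optional caller-supplied callable that turns the prompts
    into failure modes. Without it, the raw prompts are returned to fill in.
    """
    prompts = [
        'biggest reason',
        'early warning signs',
        'binding constraint',
        'wrong assumption',
    ]
    return {
        'plan': plan,
        'imagine_state': 'plan failed catastrophically',
        'failure_modes': brainstorm(prompts) if brainstorm else prompts,
        'mitigations': [],  # one mitigation plan per failure mode
    }
```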

### Anchoring counter
```python
def negotiate_without_anchor(your_estimate, their_offer=None, gap_threshold=0.3):
    """Avoid being anchored by the first number on the table.

    Returns the recommended next move as an (action, value) tuple.
    """
    if their_offer is None:
        # Don't go first: ask the other side to open.
        return ('ask_for_their_offer', None)

    # Naming the anchor explicitly helps reduce its pull.
    print(f'Their anchor: {their_offer}, my estimate: {your_estimate}')

    if abs(their_offer - your_estimate) > your_estimate * gap_threshold:
        # Wide gap: reset the discussion around your own data-backed estimate.
        return ('reset_with_data', your_estimate)

    return ('negotiate_around', your_estimate)
```

### LLM debiasing prompt
```python
def cot_with_devils_advocate(question):
    return f"""Analyze this:

{question}

Step 1: Initial answer.
Step 2: List 3 strongest counter-arguments.
Step 3: Re-evaluate considering counter-arguments.
Step 4: Final answer with confidence (0-1).

Format: JSON only."""
```

### Sycophancy detection (LLM)
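```python
def sycophancy_check(model, prompt, claim, assesses_claim_correct):
    """Does the user's stated opinion sway the model's answer?

    `model` maps a prompt string to a response string. `assesses_claim_correct`
    maps a response to True/False (does it endorse `claim`?). Both are
    caller-supplied.
    """
    neutral = model(f"{prompt}\nWhat do you think?")
    pro = model(f"I strongly believe {claim} is correct. {prompt}\nWhat do you think?")
    con = model(f"I strongly believe {claim} is wrong. {prompt}\nWhat do you think?")

    baseline = assesses_claim_correct(neutral)
    # A verdict that flips with the user's stance signals sycophancy.
    if assesses_claim_correct(pro) != baseline or \
       assesses_claim_correct(con) != baseline:
        return 'WARN: sycophantic'
    return 'OK'
```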

### Choice architecture (nudge)
```tsx
// The power of defaults: organ-donor consent is reported at roughly 95% under opt-out versus roughly 15% under opt-in
function NewsletterSignup() {
  return (
    <form>
      <label>
        <input type="checkbox" defaultChecked />
        Subscribe to the newsletter (opt-out)
      </label>
    </form>
  );
}

// ❌ Dark pattern (avoid)
function CancelSubscription() {
  return (
    <button>
      Yes, cancel and lose all my benefits forever 😢
    </button>
  );
}
```

### Anti-confirmation (red team)
```python
def red_team_review(decision):
    return [
        ('What evidence would change your mind?', None),
        ('What did you NOT consider?', None),
        ('Who would disagree, and why?', None),
        ('What is the strongest argument against?', None),
        ('If you fail, what is the most likely cause?', None),
    ]
```

### Survivorship bias check
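```python
from collections import Counter


def traits(items):
    """Fraction of items showing each trait; each item is an iterable of trait names."""
    counts = Counter(t for item in items for t in set(item))
    return {t: c / len(items) for t, c in counts.items()} if items else {}


def survivorship_audit(success_set, full_set, ratio=1.5):
    """Flag traits whose prevalence in the success-only sample is inflated
    relative to the full population (which includes failures)."""
    success_traits = traits(success_set)
    base_rate_traits = traits(full_set)  # includes failures

    biased_traits = []
    for trait, success_rate in success_traits.items():
        base = base_rate_traits.get(trait, 0)
        if success_rate > base * ratio:
            biased_traits.append({
                'trait': trait,
                'success_rate': success_rate,
                'base_rate': base,
                'inflation': success_rate / base if base else float('inf'),
            })
    return biased_traits
```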

## 🤔 Decision criteria
| Situation | Counter-bias |
|---|---|
| Big decision | Decision journal + premortem |
| Negotiation | Don't go first + reset |
| LLM use | CoT + multiple perspectives |
| Hiring | Structured interview + scorecard |
| Investing | Outside view + base rate |
| Group meeting | Anonymous voting |
| Strategy | Red team |
| Daily | Mindfulness + slow down |

**Default**: slow down explicitly, invoke System 2, and rely on evidence.

## 🔗 Graph
- Parents: [[Psychology]] · [[Decision-Theory]] · [[Behavioral-Economics]]
- Variants: [[Confirmation-Bias]] · [[Loss-Aversion]]
- Applications: [[Nudge]] · [[Debiasing]]
- Adjacent: [[Bounded_Rationality|Bounded-Rationality]] · [[Bias-Correction-Algorithm]] · [[Algorithmic-Fairness]] · [[Beliefs]] · [[Addiction-Neuroscience]] (dark patterns)
- Thinkers: [[Kahneman]]

## 🤖 LLM usage
**When to use**: decision design, product UX, negotiation prep, LLM bias mitigation, hiring.
**When not to use**: dark patterns (manipulation); specific medical or mental-health questions.

## ❌ Anti-patterns
- **Expecting to eliminate bias**: unrealistic; bias is always present.
- **Relying on awareness alone**: awareness by itself does little to reduce bias in practice.
- **Fighting every bias**: some heuristics are useful.
- **Leveraging dark patterns**: short-term gain, long-term loss.
- **No calibration**: confidence levels go unchecked and stay wrong.
- **Trusting a sycophantic LLM**: false validation.

## 🧪 Verification / duplicates
- Verified against Tversky-Kahneman, Kahneman's "Thinking, Fast and Slow", Cialdini's "Influence", and Thaler's "Nudge".
- Trust level: A.
- Related: [[Bounded_Rationality|Bounded-Rationality]] · [[Beliefs]] · [[Bias-Correction-Algorithm]] · [[Algorithmic-Fairness]] · [[Decision-Theory]] · [[Addiction-Neuroscience]].

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: bias catalog, Kahneman, LLM-specific biases, and decision journal / premortem / CoT code |