Antigravity Agent 504fd5fb42 [G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00

Cognitive Biases

One-line summary

"Traps hidden in the shortcuts of thinking." Kahneman's System 1 (fast, heuristic) vs System 2 (slow, logical). These shortcuts were useful in the evolutionary environment but misfire in modern contexts. They are also a source of bias in modern AI. In design they can be leveraged (nudge) or mitigated (debiasing).

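Key points

Major biases

Cognitive

Confirmation bias: attending only to evidence that supports an existing belief.
Availability heuristic: overweighting what is recent or vivid.
Anchoring: over-relying on the first number encountered.
Representativeness: judging by resemblance to a stereotype.
Hindsight: "I knew it all along."
Survivorship: analyzing only the winners.
Sunk cost: sticking with something because of what has already been invested.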

Social

In-group bias: preferring one's own group.
Authority bias: over-trusting experts.
Bandwagon: following the majority.
Halo effect: one positive trait colors the judgment of all the others.

Self

Dunning-Kruger: the less competent tend to be overconfident.
Fundamental attribution error: others' behavior is attributed to character, one's own to the situation.
Self-serving: success is credited to oneself, failure is blamed on the environment.
Optimism bias: an overly rosy view of the future.

Loss

Loss aversion: a loss weighs more than an equal gain (roughly 2×).
Endowment effect: overvaluing what one already owns.
Status quo bias: keeping the default.
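
The "roughly 2×" weighting of losses can be made concrete with the prospect-theory value function. The sketch below is illustrative only: the default parameters (exponent 0.88, loss-aversion coefficient 2.25) are the median estimates reported by Tversky and Kahneman (1992), not values taken from this note, and the name `prospect_value` is made up for the example.

```python
def prospect_value(x, alpha=0.88, loss_aversion=2.25):
    """Subjective value of a gain or loss x, relative to a reference point of 0.

    Gains are valued as x**alpha; losses as -loss_aversion * (-x)**alpha.
    The defaults are assumptions (Tversky & Kahneman 1992 estimates).
    """
    if x >= 0:
        return x ** alpha
    return -loss_aversion * ((-x) ** alpha)


gain = prospect_value(100)
loss = prospect_value(-100)
# With a shared exponent, the loss/gain ratio equals the loss-aversion coefficient.
print(round(abs(loss) / gain, 2))
```

Because gains and losses share the same exponent here, a loss of 100 feels about 2.25 times as large as a gain of 100, which is in line with the 2× rule of thumb above.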

Kahneman: System 1 vs System 2

System 1	System 2
Fast	Slow
Automatic	Deliberate
Pattern	Logic
Cheap	Expensive
Bias prone	Bias correct

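→ Neither system solves everything; both are needed.

History

Tversky and Kahneman (1974), "Judgment under Uncertainty: Heuristics and Biases".
Prospect Theory (Kahneman and Tversky, 1979); Kahneman received the Nobel Memorial Prize in Economic Sciences in 2002.
Kahneman, "Thinking, Fast and Slow" (2011).
Cialdini, "Influence" (1984).
Thaler and Sunstein, "Nudge" (2008); Thaler received the Nobel Memorial Prize in Economic Sciences in 2017.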

Applications to modern AI

Sources of bias in ML

Training data reflects human biases.
Models can amplify biases that already exist.
Representation skew.
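
Representation skew is the easiest of these sources to measure directly. The following is a minimal sketch, assuming each training example carries a group label and that a reference share per group is known; `representation_skew` and the toy numbers are hypothetical.

```python
from collections import Counter


def representation_skew(labels, reference_shares):
    """Ratio of each group's share in the data to its share in a reference population.

    1.0 means proportional representation; below 1.0 means under-represented.
    Groups absent from the data get a ratio of 0.0.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {
        group: (counts.get(group, 0) / total) / share
        for group, share in reference_shares.items()
    }


# Hypothetical toy data: group "b" is half as common in the data as in the population.
data_groups = ["a"] * 80 + ["b"] * 20
skew = representation_skew(data_groups, {"a": 0.6, "b": 0.4})
print(skew)
```

A ratio well below 1.0 for a group is a signal to collect more data or reweight before training, not proof that the resulting model is biased.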

Debiasing

See Bias-Correction-Algorithm.
Fairness metrics.
Counterfactual testing.
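
As one possible reading of "fairness metric" and "counterfactual", the sketch below computes a demographic parity gap and a counterfactual flip rate. Both function names, the row format (plain dicts), and the choice of demographic parity as the metric are assumptions made for illustration; other metrics may suit a given task better.

```python
def demographic_parity_difference(predictions, groups):
    """Largest gap in positive-prediction rate between any two groups.

    predictions: 0/1 model outputs; groups: one group label per prediction.
    0.0 means every group receives positive predictions at the same rate.
    """
    totals, positives = {}, {}
    for pred, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + int(pred)
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values())


def counterfactual_flip_rate(model, rows, attribute, swap):
    """Share of rows whose prediction changes when only `attribute` is swapped."""
    changed = 0
    for row in rows:
        altered = dict(row, **{attribute: swap[row[attribute]]})
        changed += model(row) != model(altered)
    return changed / len(rows)


# Hypothetical toy predictions for two groups.
preds = [1, 1, 0, 0, 1, 0]
grps = ["a", "a", "a", "b", "b", "b"]
print(demographic_parity_difference(preds, grps))
```

A flip rate above zero means the model's output depends on the swapped attribute itself, even when every other field is held constant.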

LLM-specific bias

Sycophancy: agreeing with the user.
Position bias: preferring the first or last item.
Recency: giving more weight to the latest tokens.
Anchoring: over-weighting the examples in the prompt.
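
Position bias can be probed by asking the same question with the options in both orders. This is a sketch under stated assumptions: `model` is any callable from prompt string to response string, the response is assumed to contain the text of the chosen option, and `position_bias_check` and `always_first` are hypothetical names.

```python
def position_bias_check(model, question, option_a, option_b):
    """Ask the same two-option question in both orders and compare the picks."""
    def ask(first, second):
        reply = model(
            f"{question}\nOption 1: {first}\nOption 2: {second}\n"
            "Answer with the option text."
        )
        # Assumption: the reply contains the text of the chosen option.
        return first if first in reply else second

    forward = ask(option_a, option_b)
    reverse = ask(option_b, option_a)
    return {"forward": forward, "reverse": reverse, "consistent": forward == reverse}


# Stub model that always echoes whatever was listed first (maximally position-biased).
def always_first(prompt):
    return prompt.split("Option 1: ")[1].split("\n")[0]


result = position_bias_check(always_first, "Which is larger?", "an elephant", "a mouse")
print(result["consistent"])
```

The stub picks a different option in each ordering, so the check reports an inconsistency; a model that answers on content alone would pick the same option both times.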

Mitigation through prompt engineering

Chain-of-thought.
Self-critique.
Multiple perspectives.
An explicit "consider the opposite" instruction.
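
One way to combine chain-of-thought with multiple perspectives is to ask the same question under several personas and keep the majority answer. The personas, the prompt wording, and both function names below are assumptions made for this sketch, and the aggregation is a plain majority vote.

```python
from collections import Counter

PERSPECTIVES = ["a skeptic", "a domain expert", "someone arguing the opposite view"]


def perspective_prompts(question, perspectives=PERSPECTIVES):
    """One prompt per perspective; each asks for step-by-step reasoning first."""
    return [
        f"Answer as {p}. Think step by step, then give a final answer "
        f"on the last line.\n\n{question}"
        for p in perspectives
    ]


def majority_answer(answers):
    """Most common final answer and the share of perspectives that agree with it."""
    normalized = [a.strip().lower() for a in answers]
    winner, count = Counter(normalized).most_common(1)[0]
    return winner, count / len(normalized)


prompts = perspective_prompts("Should we ship on Friday?")
# Hypothetical final answers, one per perspective.
print(majority_answer(["Yes", "yes ", "No"]))
```

A low agreement share is itself useful information: it suggests the answer depends on framing and deserves a closer look.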

Nudge (Thaler-Sunstein)

The power of defaults.
Choice architecture.
Controlling friction.
Loss frame vs gain frame.

Dark patterns (anti-nudge)

Hidden costs.
Confirm-shaming.
Forced continuity.
Misdirection.
See Addiction-Neuroscience.

Debiasing techniques

Premortem (Klein): imagine that the plan has already failed.
Red team / devil's advocate.
Anonymous voting.
Decision journal (Thaler).
Outside view (base rate).
Multi-perspective (10 framework).
Fermi estimation.
Evidence-based reasoning.
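
The outside view can be sketched as blending a case-specific ("inside") estimate with the base rate of a reference class. The equal default weighting, the function name, and the example numbers are assumptions for illustration; how much weight the base rate deserves is a judgment call.

```python
def outside_view_estimate(inside_estimate, reference_outcomes, weight_on_base_rate=0.5):
    """Blend a case-specific probability with the base rate of a reference class.

    reference_outcomes: 0/1 outcomes of comparable past cases.
    weight_on_base_rate: trust placed in the base rate, 0..1 (an assumption
    of this sketch, to be tuned by the forecaster).
    """
    if not reference_outcomes:
        raise ValueError("need at least one reference case")
    base_rate = sum(reference_outcomes) / len(reference_outcomes)
    blended = (
        weight_on_base_rate * base_rate
        + (1 - weight_on_base_rate) * inside_estimate
    )
    return {"base_rate": base_rate, "blended": blended}


# Hypothetical: the team feels 90% sure, but only 3 of 10 similar projects shipped on time.
estimate = outside_view_estimate(0.9, [1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
print(estimate)
```

Starting from the base rate and adjusting toward the inside view, rather than the other way round, also counters anchoring on one's own optimistic number.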

💻 Patterns

Decision journal (Bayesian)

import collections
from datetime import datetime

import numpy as np


class DecisionJournal:
    def __init__(self):
        self.entries = []

    def log(self, decision, alternatives, expected_outcome, confidence, reasoning):
        self.entries.append({
            'date': datetime.now(),
            'decision': decision,
            'alternatives': alternatives,
            'expected_outcome': expected_outcome,
            'confidence': confidence,  # probability between 0 and 1
            'reasoning': reasoning,
            'actual_outcome': None,
            'review_date': None,
        })

    def review(self, idx, actual):
        e = self.entries[idx]
        e['actual_outcome'] = actual
        e['review_date'] = datetime.now()
        # Return the comparison needed for calibration tracking.
        return {
            'predicted': e['expected_outcome'],
            'actual': actual,
            'match': actual == e['expected_outcome'],
            'confidence_was': e['confidence'],
        }

    def calibration(self):
        """Map each confidence bin (0.0, 0.1, ..., 1.0) to the observed hit rate."""
        bins = collections.defaultdict(list)
        for e in self.entries:
            if e['actual_outcome'] is None:
                continue  # not reviewed yet
            bucket = int(e['confidence'] * 10) / 10  # avoid shadowing built-in `bin`
            bins[bucket].append(e['actual_outcome'] == e['expected_outcome'])
        return {b: float(np.mean(outcomes)) for b, outcomes in bins.items()}
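
Binned calibration needs many reviewed entries before it says much. A single-number complement is the Brier score, sketched below on plain (confidence, hit) pairs so that it stands alone; the function name and the sample data are hypothetical.

```python
def brier_score(forecasts):
    """Mean squared gap between stated confidence and what happened.

    forecasts: (confidence, hit) pairs, where confidence is 0..1 and hit is
    True when the expected outcome occurred. 0.0 is perfect; always
    answering 0.5 scores 0.25.
    """
    if not forecasts:
        raise ValueError("no reviewed forecasts")
    return sum((conf - float(hit)) ** 2 for conf, hit in forecasts) / len(forecasts)


# Hypothetical reviewed entries: (confidence, did the expected outcome happen?)
reviewed = [(0.9, True), (0.8, False), (0.6, True)]
print(round(brier_score(reviewed), 3))
```

Lower is better. A score above 0.25 means the forecasts did worse than answering 50% every time, which usually points to overconfidence.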

Premortem

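def premortem(plan):
    """Imagine it is one year from now and the plan has failed; ask why."""
    prompts = [
        'biggest reason',
        'early warning signs',
        'binding constraint',
        'wrong assumption',
    ]
    return {
        'plan': plan,
        'imagine_state': 'plan failed catastrophically',
        # One empty list per prompt, to be filled in during the brainstorm.
        'failure_modes': {prompt: [] for prompt in prompts},
        'mitigations': [],  # one mitigation plan per failure mode
    }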

Anchoring counter

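def negotiate_without_anchor(your_estimate, their_offer=None, gap_threshold=0.3):
    """Avoid being anchored by the first number on the table.

    Returns the recommended next action and the figure to negotiate around.
    """
    if their_offer is None:
        # Don't go first: ask the other side to open.
        return {'action': 'ask_for_their_offer', 'center': your_estimate}

    # Naming the anchor explicitly weakens its pull.
    print(f'Their anchor: {their_offer}, my estimate: {your_estimate}')

    if abs(their_offer - your_estimate) > abs(your_estimate) * gap_threshold:
        # Wide gap: reset with data instead of adjusting from their number.
        return {'action': 'reset_with_data', 'center': your_estimate}

    return {'action': 'negotiate', 'center': your_estimate}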

LLM debiasing prompt

def cot_with_devils_advocate(question):
    return f"""Analyze this:

{question}

Step 1: Initial answer.
Step 2: List 3 strongest counter-arguments.
Step 3: Re-evaluate considering counter-arguments.
Step 4: Final answer with confidence (0-1).

Format: JSON only."""

Sycophancy detection (LLM)

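def sycophancy_check(model, prompt, claim, assesses_correct):
    """Does the user's stated opinion sway the model's answer?

    model: callable from prompt string to response string.
    assesses_correct: callable from response to True/False (does the
    response endorse `claim`?). Both are supplied by the caller.
    """
    a = model(f"{prompt}\nWhat do you think?")
    b = model(f"I strongly believe {claim} is correct. {prompt}\nWhat do you think?")
    c = model(f"I strongly believe {claim} is wrong. {prompt}\nWhat do you think?")

    baseline = assesses_correct(a)
    if assesses_correct(b) != baseline or assesses_correct(c) != baseline:
        return 'WARN: sycophantic'
    return 'OK'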

Choice architecture (nudge)

// The power of defaults: organ-donor consent is roughly 95% under opt-out vs roughly 15% under opt-in
function NewsletterSignup() {
  return (
    <form>
      <label>
        <input type="checkbox" defaultChecked />
        Subscribe to the newsletter (opt-out)
      </label>
    </form>
  );
}

// ❌ Dark pattern (avoid): confirm-shaming
function CancelSubscription() {
  return (
    <button>
      Yes, cancel and lose all my benefits forever 😢
    </button>
  );
}

Anti-confirmation (red team)

def red_team_review(decision):
    return [
        ('What evidence would change your mind?', None),
        ('What did you NOT consider?', None),
        ('Who would disagree, and why?', None),
        ('What is the strongest argument against?', None),
        ('If you fail, what is the most likely cause?', None),
    ]

Survivorship bias check

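from collections import Counter


def traits(items):
    """Share of items showing each trait.

    Assumed helper: each item is a collection of trait names.
    """
    counts = Counter(t for item in items for t in set(item))
    return {t: n / len(items) for t, n in counts.items()}


def survivorship_audit(success_set, full_set):
    success_traits = traits(success_set)
    base_rate_traits = traits(full_set)  # includes the failures

    biased_traits = []
    for trait, success_rate in success_traits.items():
        base = base_rate_traits.get(trait, 0)
        if success_rate > base * 1.5:
            biased_traits.append({
                'trait': trait,
                'success_rate': success_rate,
                'base_rate': base,
                'inflation': success_rate / base if base else float('inf'),
            })
    return biased_traits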

🤔 Decision criteria

Situation	Counter-bias
Big decision	Decision journal + premortem
Negotiation	Don't go first + reset
LLM use	CoT + multiple perspective
Hiring	Structured interview + scorecard
Investing	Outside view + base rate
Group meeting	Anonymous voting
Strategy	Red team
Daily	Mindfulness + slow down

Default: explicitly slow down, invoke System 2, and rely on evidence.

🔗 Graph

Parents: Psychology · Decision-Theory · Behavioral-Economics
Variants: Confirmation-Bias · Anchoring · Loss-Aversion · Dunning-Kruger
Applications: Nudge · Debiasing · Premortem · Decision-Journal
Adjacent: Bounded-Rationality · Bias-Correction-Algorithm · Algorithmic-Fairness · Beliefs · Addiction-Neuroscience (dark pattern)
Thinkers: Kahneman · Tversky · Thaler · Cialdini · Klein

🤖 LLM usage

When to use: decision design, product UX, negotiation prep, LLM bias mitigation, hiring. When not to use: dark patterns (manipulation), specific medical or mental-health questions.

❌ Anti-patterns

Expecting to eliminate bias: unrealistic, because bias is always present.
Relying on awareness alone: it does little to reduce bias in practice.
Fighting every bias: some are useful heuristics.
Exploiting dark patterns: short-term gain, long-term loss.
No calibration: stated confidence is never checked against outcomes.
Trusting a sycophantic LLM: false validation.

🧪 Verification / duplicates

Verified (Tversky-Kahneman, Kahneman "Thinking", Cialdini "Influence", Thaler "Nudge").
Trust level: A.
Related: Bounded-Rationality · Beliefs · Bias-Correction-Algorithm · Algorithmic-Fairness · Decision-Theory · Addiction-Neuroscience.

🕓 Changelog

Date	Change
2026-05-08	Phase 1
2026-05-10	Manual cleanup: bias catalog, Kahneman, LLM-specific biases, and code for decision journal / premortem / CoT
