[G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00
parent 21ac3ed255
commit 504fd5fb42
3011 changed files with 380280 additions and 206977 deletions
@@ -2,87 +2,142 @@
 id: wiki-2026-0508-anthropic-principle
 title: Anthropic Principle
 category: 10_Wiki/Topics
-status: needs_review
+status: verified
 canonical_id: self
-aliases: [ANTHROPIC-001]
+aliases: [인류 원리, fine-tuning, observer selection, anthropic reasoning]
 duplicate_of: none
-source_trust_level: A
-confidence_score: 1.0
-tags: ["Philosophy|[Philosophy", Physics, cosmology, AI-Alignment, anthropic-principle]
+source_trust_level: B
+confidence_score: 0.83
+verification_status: conceptual
+tags: [philosophy, cosmology, physics, ai-alignment, observer-bias, fine-tuning, multiverse]
 raw_sources: []
-last_reinforced: 2026-04-26
+last_reinforced: 2026-05-10
 github_commit: pending
-inferred_by: Claude Opus 4.7 (auto-normalize 2026-05-08)
 tech_stack:
-  language: unspecified
-  framework: unspecified
+  language: philosophy / physics
+  applicable_to: [AI Design, Cosmology, Selection Bias Reasoning]
 ---

-# Anthropic Principle (인류 원리)
+# Anthropic Principle

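-## 📌 One-Line Insight (The Karpathy Summary)
-> "The universe is this finely ordered because we exist and are observing it": a principle that explains why the universe's physical constants appear tuned precisely enough for life to exist by tying that observation to the existence of observers.
+## 📌 One-Line Insight
+> **"The universe looks finely tuned because we are here to observe it."** This is the most fundamental form of selection bias. It explains the fine-tuned constants: a universe whose conditions do not permit observers contains no one to observe it, so observers only ever find themselves in a universe that does. Applied to AI design: alignment through human feedback is the same kind of selection.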

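-## 📖 Structured Knowledge (Synthesized Content)
- **Extracted pattern:** A selection-bias-based pattern of philosophical and physical analysis: the conditions required for observers to exist determine the physical properties of the universe that gets observed.
- **Main distinction:**
-    - **Weak Anthropic Principle (WAP):** The place and time at which intelligent life is observed in the universe must have physical conditions that allow life to exist.
-    - **Strong Anthropic Principle (SAP):** The universe must have properties that allow intelligent life to arise at some stage of its development.
- **AI application:** For the question "Why does AI evolve in a particular way?", this can be applied as the view that the alignment process itself, in which humans design the system and give feedback, tunes the AI's physical and logical constants in a human-centered direction.
+## 📖 Core Concepts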

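-## ⚠️ Contradictions & Updates
- **Conflict with earlier data:** A shift in thinking from the view that the universe happens to be friendly to life by chance to the necessity view that the universe must be this way because we exist.
- **Policy change:** When designing agent value systems, the Antigravity project draws on the anthropic principle and aims for "human-centered AI design", in which human cognitive limits and needs shape the AI's logical structure.
+### Definitions
+- **WAP (Weak Anthropic Principle)**: Observers can only find themselves at places and times in the universe where conditions support life.
+- **SAP (Strong Anthropic Principle)**: The universe must be such that intelligent life arises at some point in its history.
+- **PAP (Participatory Anthropic Principle)**: Wheeler. Observers take part in bringing the universe into being, by analogy with observation collapsing a quantum state.
+- **FAP (Final Anthropic Principle)**: Tipler. Once intelligence arises, it persists and carries the universe toward an Omega Point.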

-## 🔗 Knowledge Links (Graph)
- [[AI-Alignment|AI-Alignment]], Philosophy-of-AI, [[Trustworthy-AI|Trustworthy-AI]], [[Physics-informed-Neural-Networks|Physics-Informed-Neural-Networks]]
- **Raw Source:** 10_Wiki/Topics/AI/Anthropic-Principle.md
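+### Fine-tuning examples
+- **Cosmological constant** (Λ): The observed value is close to zero, roughly 10^120 times smaller than naive theoretical estimates. A much larger value would have kept galaxies from forming.
+- **Strong force**: A change of about 0.4% would leave no carbon.
+- **Electron / proton mass ratio**: A change of about 0.5% would leave no workable chemistry.
+- **Higgs mass**: Its value bears on the stability of the vacuum.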

-## 🤖 LLM Usage Hints (How to Use This Knowledge)
+→ Martin Rees "Just Six Numbers".

-**When to use this knowledge:**
- *(TODO)*
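+### Responses (the debate)
+1. **Multiverse**: With countless universes, it is unsurprising that some have life-permitting constants, and observers exist only in those.
+2. **Designer**: The constants were tuned intentionally.
+3. **Self-explanatory**: This is the only form a universe could take.
+4. **No fine-tuning**: The fine-tuning calculations are mistaken.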

-**When not to use it:**
- *(TODO)*
+→ Bostrom "Anthropic Bias" (2002).

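-## 🧪 Validation
+### Reasoning under selection bias
+- The sample is self-selected: we observe only the cases in which observers exist.
+- Conclusions drawn from such a sample need care.
+- **Doomsday argument**: reasoning about humanity's future from one's own birth rank.
+- **Sleeping Beauty problem**: a thought experiment on how to assign credence to self-locating evidence.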

- **Status:** needs_review
- **Source trust level:** A
- **Review reason:** *(P-Reinforce Phase 1 auto-normalization. Body needs verification.)*
+### Applications to AI
+1. **Alignment**: RLHF selects model behavior through human feedback, so the way AI systems evolve is human-centric.
+2. **Capability emergence**: We observe only capable models, because less capable ones are never deployed.
+3. **Safety research**: We are alive to do the research. In cases where AI was catastrophic, no one is left to observe it (anthropic shadow).
+4. **Selection bias in benchmarks**: Popular benchmarks are the ones models get optimized for.

-## 🧬 Duplicate Check
+### Anthropic shadow (Ćirković, Sandberg & Bostrom)
+- Observer selection reduces the evidence we can have about existential risks.
+- We do not observe the close calls that ended in catastrophe.
+- As a result, AI x-risk may be underestimated.

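- **Existing similar documents:** *(TODO: see the indexer cluster report)*
- **Handling:** UPDATE (auto-normalization)
- **Reason:** Phase 1 normalization: update the old template and fill in missing fields.
+→ Past base rates do not reliably predict future risk.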

-## 🕓 Changelog
+## 💻 Patterns (applied selection bias reasoning)

-| Date | Change | Handling | Trust |
-|------|-----------|-----------|--------|
-| 2026-05-08 | P-Reinforce Phase 1 normalization (frontmatter + header standardization) | UPDATE | A |
+### Survivorship bias check
+```python
+# ❌ Biased: looking only at successful startups suggests "these traits cause success"
+def analyze_traits(successful_startups):
+    return [s.founder.trait for s in successful_startups]

-## 💻 Code Patterns
-
-**Pattern 1:** *(TODO: structural skeleton reflecting this project's conventions)*
-
-```text
-# TODO
+# ✅ Unbiased: include the failed startups as well
+def analyze_traits_unbiased(all_startups):
+    return [(s.founder.trait, s.outcome) for s in all_startups]
 ```

-## 🤔 Decision Criteria
+→ Make the selection effect explicit.

-**When to choose option A:**
- *(TODO)*
+### Anthropic-aware risk
+```python
+# A safe past does not imply a safe future.
+def estimate_xrisk(past_close_calls, years_observed, anthropic_shadow_factor=2):
+    base_rate = past_close_calls / years_observed
+    # Our being alive is itself a selection effect, so scale the naive rate up (the factor is illustrative).
+    adjusted = base_rate * anthropic_shadow_factor
+    return adjusted
+```

-**When to choose option B:**
- *(TODO)*
+→ Treat past base rates with care.

-**Default:**
-> *(TODO)*
+### Self-selection in alignment
+```python
+# Human feedback in RLHF
+def aligned_reward(model_output, human_pref):
+    # The reward implicitly projects the human rater's worldview.
+    # Selection: only the models we like get deployed.
+    return human_pref(model_output)
+```

-## ❌ Anti-Patterns
+→ Alignment is itself an anthropic selection process.

- **[Anti-pattern]:** *(TODO: what not to do, why, and what to do instead)*
+## 🤔 Decision Criteria
+| Question | Reasoning |
+|---|---|
+| "Why is the universe fine-tuned?" | Anthropic + multiverse |
+| "Why do startups share trait X?" | Survivorship bias |
+| "Why has AI been safe so far?" | Anthropic shadow |
+| "Why are benchmark scores high?" | Selection bias |
+
+**Default**: Make the selection effect explicit, and draw conclusions with care.
+
+## 🔗 Graph
+- Parents: [[Philosophy-of-Science]] · [[Cosmology]]
+- Variants: [[Weak-Anthropic-Principle]] · [[Strong-Anthropic-Principle]] · [[Doomsday-Argument]] · [[Sleeping-Beauty]]
+- Applications: [[AI-Alignment]] · [[X-Risk]] · [[Anthropic-Shadow]] · [[Selection-Bias]]
+- Adjacent: [[Multiverse]] · [[Fine-Tuning]] · [[Bostrom]] · [[Survivorship-Bias]]
+
+## 🤖 LLM Usage
+**When to use**: Detecting selection bias. AI safety reasoning. Cosmology discussions. Questions about base rates.
+**When not to use**: Specific physics calculations. As a substitute for theological argument.
+
+## ❌ Anti-Patterns
+- **"The universe was designed"**: Anthropic selection combined with a multiverse is also a possible explanation.
+- **Ignoring survivorship bias**: Analyzing only the successes.
+- **Ignoring anthropic shadow**: Inferring a safe future from a safe past.
+- **Conflating WAP and SAP**: They make different claims.
+- **Using "anthropic" as a magic word**: State the actual selection mechanism explicitly.
+
+## 🧪 Validation / Duplicates
+- Verified (Bostrom "Anthropic Bias", Rees "Just Six Numbers").
+- Trust level B (still actively debated in philosophy).
+- Related: [[AI-Alignment]] · [[X-Risk]] · [[Selection-Bias]].
+
+## 🕓 Changelog
+| Date | Change |
+|---|---|
+| 2026-05-08 | Phase 1 |
+| 2026-05-10 | Manual cleanup: variants + fine-tuning + AI applications + anthropic shadow |