id: wiki-2026-0508-ad-hoc-hypotheses
title: Ad-hoc Hypotheses
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Ad Hoc Rescue, Auxiliary Hypothesis, Epicycle, Post-hoc Rationalization
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: philosophy-of-science, epistemology, falsifiability, popper, ml-debugging
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: N/A; framework: scientific method

Ad-hoc Hypotheses

In one line

"An unprincipled patch added to rescue a falsified theory." An ad-hoc hypothesis is an auxiliary assumption added after a prediction fails, purely to save the theory: it cannot be tested independently and it adds no explanatory power. For Popper (1934, 1963) this is part of the demarcation line that separates science from pseudoscience. The modern equivalent in ML/agent debugging is the "magic constant + retry" fix.

Key points

Popper's criterion

  • Bad ad-hoc: added to the theory only to block a refutation; it yields no new prediction.
  • Acceptable auxiliary: generates an independently testable consequence.
  • Example: the prediction of Neptune (Le Verrier 1846) was acceptable, because the planet was then independently observed.
  • Example: the prediction of Vulcan (to explain Mercury's orbit) ended up as an ad-hoc rescue; the planet was never observed, and GR was what actually fixed the anomaly.

Lakatos's refinement

  • Progressive program: the auxiliary predicts a novel fact, and that prediction is corroborated.
  • Degenerating program: the auxiliaries only protect the core theory; abandon it.

Examples from science

  • Phlogiston + the negative-mass rescue (to explain the calx's weight gain).
  • Geocentrism + a growing stack of epicycles, up until Copernicus.
  • Cold fusion (Fleischmann-Pons 1989) + excuses for unreproducibility.
  • Bem's psi + publication bias correction (Wiseman's critique).
  • String theory landscape (debated) + the anthropic multiverse as a rescue.

Modern parallels in ML / agents

  • Magic constant: temperature=0.7, which works when it works.
  • Retry-on-fail: the root cause is never identified.
  • Prompt patching: stacking "you MUST X" rules.
  • Eval cherry-pick: carving out the failing cases.
  • Benchmark contamination excuse: always blaming a leak.
  • Hyperparameter stew: a dataset-specific tweak for every new result.

Applications (red flag detection)

  1. Code review: sleep(n) workarounds.
  2. ML eval: failure modes are selectively excluded.
  3. Theory paper: the rebuttal only adds auxiliaries.
  4. Agent debugging: mystery instructions keep accumulating in the prompt.
  5. Postmortem: no root cause, only a monitoring patch.
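
The first red flag above, a hard-coded sleep(n), can be surfaced mechanically during code review. The following is a minimal sketch using Python's standard `ast` module; the function name `find_constant_sleeps` and the sample source are hypothetical, and a flagged line is only a prompt to ask "why this number?", not proof of an ad-hoc fix.

```python
import ast

def find_constant_sleeps(source: str) -> list[tuple[int, float]]:
    """Return (line, seconds) for each sleep(<numeric literal>) call in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call) or len(node.args) != 1:
            continue
        func = node.func
        # Matches both `sleep(...)` and `<module>.sleep(...)`.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        arg = node.args[0]
        if name == "sleep" and isinstance(arg, ast.Constant) and isinstance(arg.value, (int, float)):
            hits.append((node.lineno, float(arg.value)))
    return sorted(hits)

sample = """import time
def poll():
    time.sleep(0.5)
    time.sleep(backoff)
"""
print(find_constant_sleeps(sample))  # [(3, 0.5)]
```

Only the literal `0.5` is reported; the call that sleeps for a named `backoff` value is left alone, on the assumption that a named value has at least been given a reason.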

💻 Patterns

Refactor pattern: ad-hoc → principled

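import time

import httpx
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential_jitter

# Ad-hoc: magic retry (`api` is the underlying client call, defined elsewhere)
def call_api(x):
    for _ in range(3):  # why 3?
        try:
            return api(x)
        except:  # swallows every error, not only transient ones
            time.sleep(0.5)  # why 0.5?
    # falls through and silently returns None once all attempts fail

# Principled: explicit failure model
@retry(
    retry=retry_if_exception_type(httpx.TimeoutException),  # retry only this specific failure
    stop=stop_after_attempt(3),
    wait=wait_exponential_jitter(initial=0.5, max=8),
    reraise=True,  # surface the original exception when attempts run out
)
def call_api(x):
    return api(x)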

Eval pattern: pre-register failure modes

# Ad-hoc anti-pattern: "we exclude the cases where it fails"
# Principled: pre-register exclusion criteria BEFORE running the eval
exclusion:
  - reason: "image >10MB (out of context window)"
    expected_count: "~3%"
  - reason: "non-English prompt (model was trained English-only)"
    expected_count: "~5%"
# No exclusions may be added post hoc; doing so is a protocol violation
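
A pre-registered list like the one above is only useful if the eval run is checked against it. Here is a minimal sketch of such a check; the reason keys, the `check_exclusions` name, and the `tolerance` factor (how far an observed share may exceed the expected one) are all assumptions made for illustration.

```python
# Reason key -> expected share of the eval set, fixed before the run.
PREREGISTERED = {
    "image_over_10mb": 0.03,
    "non_english_prompt": 0.05,
}

def check_exclusions(excluded: dict[str, int], total: int, tolerance: float = 2.0) -> list[str]:
    """Return protocol violations; an empty list means the exclusions match the plan."""
    violations = []
    for reason, count in excluded.items():
        if reason not in PREREGISTERED:
            # A reason invented after seeing the results.
            violations.append(f"post-hoc exclusion: {reason}")
        elif count / total > tolerance * PREREGISTERED[reason]:
            # A registered reason used far more often than planned.
            violations.append(f"over-excluded: {reason}")
    return violations

print(check_exclusions({"image_over_10mb": 28, "model_got_it_wrong": 41}, total=1000))
# ['post-hoc exclusion: model_got_it_wrong']
```

The second branch matters as much as the first: a registered reason that suddenly covers several times its expected share is often a post-hoc carve-out hiding under an approved label.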

Hypothesis-driven debug

1. Hypothesis: "X causes Y because Z"
2. Independent prediction: "if H true, then we'd see W"
3. Run test that COULD falsify H
4. If H survives + W observed → progressive
5. If H survives only by adding "...except in case Q" → ad-hoc, drop H
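
The five steps above can be kept honest by writing the hypothesis down before running the test. A minimal sketch, assuming a hypothetical `Hypothesis` record and `verdict` helper; the example claim is invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    claim: str        # "X causes Y because Z"
    prediction: str   # "if H is true, then we would see W"
    exceptions: list[str] = field(default_factory=list)  # "...except in case Q"

def verdict(h: Hypothesis, prediction_observed: bool) -> str:
    """Classify the outcome of a test that could have falsified h."""
    if h.exceptions:
        return "ad-hoc: drop H"   # step 5: survives only through carve-outs
    if prediction_observed:
        return "progressive"      # step 4: survived and W was observed
    return "falsified"            # the test refuted H

h = Hypothesis("cache stampede causes the latency spikes",
               "spikes vanish once requests are coalesced")
print(verdict(h, prediction_observed=True))   # progressive
h.exceptions.append("except on Tuesdays")
print(verdict(h, prediction_observed=True))   # ad-hoc: drop H
```

Checking `exceptions` first is a deliberate choice in this sketch: once an exception clause has been bolted on, a later confirmation no longer counts in the hypothesis's favor.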

Detecting ad-hoc accumulation in prompts

# Warning signs: the prompt keeps growing and no rule carries a justification
SYSTEM = """You are an assistant. ...
- DO NOT use bullet points       # added 2024-03: why?
- ALWAYS confirm before deleting # added 2024-05: which specific incident?
- NEVER mention OpenAI           # added 2024-08: still relevant?
- output JSON ONLY               # added 2024-12: conflicts with line 1?
"""
# Audit quarterly: verify each rule's origin and whether it is still needed.
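
The quarterly audit is easier when each rule carries its own provenance rather than living as an anonymous line in a string. A minimal sketch under that assumption; the `Rule` fields, the 90-day window, the exact dates, and the sample origin text are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Rule:
    text: str
    added: date
    origin: str = ""                       # incident or ticket that motivated the rule
    last_verified: Optional[date] = None   # last time someone confirmed it is still needed

def needs_audit(rule: Rule, today: date, max_age_days: int = 90) -> bool:
    """Flag rules with no recorded origin or no recent re-verification."""
    if not rule.origin:
        return True
    checked = rule.last_verified or rule.added
    return (today - checked).days > max_age_days

rules = [
    Rule("DO NOT use bullet points", date(2024, 3, 1)),
    Rule("ALWAYS confirm before deleting", date(2024, 5, 1),
         origin="accidental delete incident", last_verified=date(2024, 12, 1)),
]
print([r.text for r in rules if needs_audit(r, today=date(2025, 1, 1))])
# ['DO NOT use bullet points']
```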

Falsifiability test (theory health check)

def falsifiability_score(theory: str) -> dict:
    # Template of the fields to fill in for a theory; the values are placeholders
    return {
        "predictions": [...],          # list them explicitly
        "what_would_falsify": [...],   # must be non-empty
        "novel_predictions_made": int, # progressive if > 0
        "rescues_added": int,          # degenerating if >> novel predictions
    }
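
The template above can be turned into a concrete check along Lakatos's progressive/degenerating split. This is a minimal sketch: the `TheoryRecord` and `program_health` names are hypothetical, and the threshold (at least as many corroborated novel predictions as rescues) is an assumed stand-in for the looser "rescues >> novel" rule.

```python
from dataclasses import dataclass, field

@dataclass
class TheoryRecord:
    predictions: list[str] = field(default_factory=list)
    what_would_falsify: list[str] = field(default_factory=list)
    novel_predictions_made: int = 0   # corroborated novel predictions
    rescues_added: int = 0            # auxiliaries added only to absorb a failure

def program_health(t: TheoryRecord) -> str:
    """Classify a theory from its bookkeeping; the cutoff is an assumption."""
    if not t.what_would_falsify:
        return "unfalsifiable"        # nothing could count against it
    if t.novel_predictions_made > 0 and t.novel_predictions_made >= t.rescues_added:
        return "progressive"
    return "degenerating"

patched = TheoryRecord(what_would_falsify=["observation contradicts forecast"],
                       novel_predictions_made=0, rescues_added=4)
print(program_health(patched))  # degenerating
```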

Decision criteria

| Situation | Action |
| --- | --- |
| Theory survives only by adding an excuse | Drop the theory or restructure it |
| Auxiliary generates an independent test | Acceptable, test it |
| Patches for an ML model's failure cases keep escalating | Rebuild the architecture |
| Prompt has 100+ rules | Audit + collapse + redesign |
| Postmortem says only "we'll add monitoring" | Insufficient; require a root cause |
| Reviewer asks a tough question | Answer with a new prediction, not a new excuse |

Default: for each auxiliary, ask "what NEW would this predict?" If the answer is nothing, it is ad-hoc.

🔗 Graph

🤖 Using LLMs

When to use: prompt audits, detecting ad-hoc rescues as a paper reviewer, retrospectives on a debugging journal. When not to use: do not trust an LLM's ad-hoc judgment on its own; human pre-registration and protocol are still required.

Anti-patterns

  • Save-the-theory-at-all-cost: stacking auxiliaries, i.e. the epicycle pattern.
  • Selective failure exclusion: carving out the failing cases post hoc.
  • Magic-constant patching: the root cause is never identified.
  • Promise-then-defer: "we'll explain Q in future work", which becomes an indefinite ad-hoc deferral.
  • Conspiracy-style rescue: attributing every piece of counter-evidence to "the establishment suppressing it".

🧪 Verification / Duplicates

  • Verified against: Popper, The Logic of Scientific Discovery (1934) and Conjectures and Refutations (1963); Lakatos, The Methodology of Scientific Research Programmes (1978); Sober, Core Questions in Philosophy.
  • Trust level: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: Popper/Lakatos + ML/agent modern parallel |