Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization

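10_Wiki/Topics large-scale cleanup:
- Removed 227 error-capture / unfinished stub documents
- Merged 43 cross-folder duplicate clusters (63 files → redirect)
- Link-name normalization: ~2,400 fixes (broken links repaired, redirects linked directly, concepts mapped)
- Created 6 new category MOCs
- Removed 10,058 unresolved related-keyword links from Graph sections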

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-20 23:52:15 +09:00


id, title, category, status, canonical_id, aliases, duplicate_of, source_trust_level, confidence_score, verification_status, tags, raw_sources, last_reinforced, github_commit, tech_stack

title

Turing Test

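One-line summary

"If human judges misidentify a machine as human often enough in conversation, the machine is treated as indistinguishable from a thinking being." This is the imitation game from Alan Turing's 1950 paper "Computing Machinery and Intelligence"; the familiar 30% threshold comes from Turing's prediction for the year 2000, not from a pass criterion he defined. In 2024-25, controlled studies reported human-level pass rates for GPT-4 / Claude (e.g., Jones & Bergen 2024, UCSD, for GPT-4). As of 2026 the Turing Test is obsolete as a capability-measurement tool: it is undercut by the Chinese Room critique and has been replaced in practice by behavioral benchmarks and capability evaluations.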

Core

Original imitation game (Turing 1950)

3 players: man (A), woman (B), interrogator (C).
C asks questions in writing and must determine which is which.
A's task: deceive C. B's task: help C.
Turing's substitution: replace A with a machine. Does C's error rate stay the same?

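Misconception (common pop interpretation)

Pop version: "machine fools human into thinking it's human."
Original: comparison of the machine's deception rate against the rate at which a man succeeds in passing as a woman.
Turing's prediction: by about 2000, an average interrogator would have no more than a 70% chance of making the right identification after 5 minutes of questioning, i.e. machines would fool judges ~30% of the time.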

Critiques

Chinese Room (Searle 1980): passing the test is not evidence of understanding. Symbol manipulation ≠ semantics.
Imitation ≠ intelligence: deceiving a human is a narrow task. Mathematical reasoning, embodiment, and learning go unmeasured.
Anthropocentric: assumes human-likeness is the sole criterion of intelligence.
Gameable: tricks (typos, refusing to answer, emotion mimicry) can produce a pass.
Judge calibration: results differ wildly between naive and expert judges.

Modern empirical results

2014 "Eugene Goostman": 33% pass at the Royal Society. The pass was controversial because the 13-year-old Ukrainian persona lowered judges' expectations.
2023 Jannai et al. (AI21, "Human or Not?"): in 2-minute chats, participants correctly identified AI partners only about 60% of the time, so the bots were taken for human in roughly 40% of those chats.
2024 Jones & Bergen (UCSD): GPT-4 was judged human 54% of the time (vs. 67% for actual humans, 22% for ELIZA). Often cited as the first rigorously controlled pass.
2025 replications: reported routine human-level performance for Claude / GPT-5.
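One way to read the percentages above is to ask whether a "judged human" rate is distinguishable from a 50% coin flip. The sketch below runs an exact two-sided binomial test using only the standard library. The sample size `N = 100` is a hypothetical placeholder, not the sample size of any study cited here, so the p-values are illustrative only.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p0: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed one under the null rate p0.
    """
    pmf = [
        comb(trials, k) * p0**k * (1 - p0) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # Small relative tolerance so float noise does not drop tied outcomes.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


# ASSUMPTION: 100 verdicts per witness type (hypothetical, not the real n).
N = 100
for label, rate in [("GPT-4", 0.54), ("human", 0.67), ("ELIZA", 0.22)]:
    k = round(rate * N)
    p = binomial_two_sided_p(k, N)
    print(f"{label}: {k}/{N} judged human, p vs. chance = {p:.4f}")
```

Under this assumed sample size, a 54% rate is not separable from chance, while 67% and 22% are. Whether "indistinguishable from chance" should count as a pass is a design decision, not something the test settles.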

Alternatives (post-Turing era)

Capability benchmarks: MMLU, HumanEval, GPQA, ARC-AGI, SWE-bench.
Coffee test (Wozniak): make coffee in an unfamiliar kitchen → embodiment.
Robot college student (Goertzel): enroll in college courses and earn a degree.
Lovelace Test 2.0 (Riedl 2014): create an artifact of a given type that meets constraints chosen by a human evaluator, who then verifies it.
Winograd Schema (Levesque 2011): commonsense reasoning, originally proposed as resistant to Turing-Test-style tricks.

Applications

AI history teaching.
Philosophy of mind discussion (consciousness, understanding).
Public communication of AI capability ("does AI think?").
Capability evaluation pre-2020 (now obsolete).

💻 Patterns (eval design lessons)

Pattern 1: Modern adversarial Turing protocol

1. Recruit N judges (calibrate by demographic, expertise).
2. Each judge: 5-min interrogation, 50% human / 50% AI random.
3. Force binary verdict (no "unsure").
4. Pass criterion: the AI is judged "human" at a rate ≥ the control humans' "judged human" rate − ε.
5. Pre-register hypotheses, blind judges to study purpose.
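The pass criterion in step 4 can be sketched as a small function. The `Verdict` record, the function names, and the default `epsilon = 0.05` are assumptions made for illustration; the protocol above does not fix a tolerance value.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    witness_is_ai: bool   # ground truth, hidden from the judge
    judged_human: bool    # forced binary verdict (no "unsure")


def judged_human_rate(verdicts: list[Verdict], witness_is_ai: bool) -> float:
    """Share of verdicts for one witness type that came back "human"."""
    subset = [v for v in verdicts if v.witness_is_ai == witness_is_ai]
    if not subset:
        raise ValueError("no verdicts for this witness type")
    return sum(v.judged_human for v in subset) / len(subset)


def passes(verdicts: list[Verdict], epsilon: float = 0.05) -> bool:
    """Step 4: AI rate must reach the human control rate minus epsilon."""
    ai_rate = judged_human_rate(verdicts, witness_is_ai=True)
    control_rate = judged_human_rate(verdicts, witness_is_ai=False)
    return ai_rate >= control_rate - epsilon


# Illustrative counts mirroring a 54% AI rate and a 67% human control rate.
verdicts = (
    [Verdict(True, True)] * 54 + [Verdict(True, False)] * 46
    + [Verdict(False, True)] * 67 + [Verdict(False, False)] * 33
)
print(passes(verdicts))                # False: 0.54 < 0.67 - 0.05
print(passes(verdicts, epsilon=0.15))  # True:  0.54 >= 0.67 - 0.15
```

Comparing against the human control rate is stricter than comparing against 50% chance: the same 54% figure can fail here while still being indistinguishable from a coin flip.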

Pattern 2: Why public Turing demos mislead

- Cherry-picked transcripts.
- Naive judges (not interrogating adversarially).
- Persona tricks (child, non-native speaker, tired, distracted).
- Self-selection bias (only impressive runs shown).

Pattern 3: Capability-first eval (modern replacement)

benchmarks = [
    "MMLU",        # broad knowledge
    "HumanEval",   # code generation
    "GPQA",        # graduate-level science
    "ARC-AGI",     # abstract reasoning
    "SWE-bench",   # real software engineering
    "HLE",         # Humanity's Last Exam (2025)
]
# Pass = top-percentile human expert performance per task.
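A minimal sketch of how the per-task pass rule might be applied. Every number below is a hypothetical placeholder on a 0-1 scale, not a reported result, and `capability_report` is an illustrative helper rather than part of any benchmark's tooling.

```python
from typing import Optional

# ASSUMPTION: hypothetical expert-level thresholds, not published baselines.
expert_baselines = {"MMLU": 0.90, "HumanEval": 0.85, "GPQA": 0.65}

# ASSUMPTION: hypothetical model scores; GPQA is deliberately missing.
model_scores = {"MMLU": 0.88, "HumanEval": 0.91}


def capability_report(
    scores: dict[str, float], baselines: dict[str, float]
) -> dict[str, Optional[bool]]:
    """Per-benchmark verdict: True/False if scored, None if not evaluated."""
    report: dict[str, Optional[bool]] = {}
    for name, baseline in baselines.items():
        score = scores.get(name)
        report[name] = None if score is None else score >= baseline
    return report


report = capability_report(model_scores, expert_baselines)
print(report)  # {'MMLU': False, 'HumanEval': True, 'GPQA': None}
```

Keeping the verdicts per benchmark, rather than averaging them, avoids hiding a weak area behind a strong one.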

Pattern 4: Behavioral safety eval (orthogonal to Turing)

- Refusal rate on harmful prompts.
- Calibration (uncertainty matches accuracy).
- Sycophancy (agree-with-user metric).
- Honesty (TruthfulQA, FactScore).
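Of the metrics above, calibration is the easiest to make concrete. The following is a minimal sketch of expected calibration error (ECE) with equal-width confidence bins; the bin count and the toy inputs are assumptions, and production evals typically rely on an established library instead.

```python
def expected_calibration_error(
    confidences: list[float], correct: list[bool], n_bins: int = 10
) -> float:
    """Weighted mean gap between stated confidence and observed accuracy."""
    bins: list[list[tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Confidence 1.0 falls into the last bin rather than overflowing.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(conf for conf, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Toy data: a model that always claims 90% confidence.
well_calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False])
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
print(round(well_calibrated, 3), round(overconfident, 3))
```

A value near 0 means stated confidence tracks accuracy; the overconfident toy case yields a gap of about 0.4.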

Pattern 5: Lovelace Test 2.0 framework

1. Specify class C of artifacts (e.g., novel valid mathematical proof).
2. AI produces artifact a ∈ C.
3. Human expert verifies a is valid AND novel.
4. (Criterion from the original Lovelace Test, Bringsjord et al. 2001, not from 2.0) The AI's architect cannot explain how a was produced.
→ Tests creativity, not imitation.
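Steps 1 to 3 can be sketched as a check that an artifact satisfies evaluator-chosen constraints and is not already known. This is a toy stand-in: in the real framework a human expert judges validity and novelty, and the set-membership novelty check, the function name, and the example constraints below are all assumptions made for illustration.

```python
from typing import Callable, Iterable


def lovelace_check(
    artifact: str,
    constraints: Iterable[Callable[[str], bool]],
    known_artifacts: Iterable[str],
) -> bool:
    """True if the artifact meets every constraint and is not already known."""
    valid = all(check(artifact) for check in constraints)
    # ASSUMPTION: exact-match lookup as a crude proxy for expert novelty review.
    novel = artifact not in set(known_artifacts)
    return valid and novel


# Hypothetical class C: five-word sentences that mention the moon.
constraints = [
    lambda s: len(s.split()) == 5,
    lambda s: "moon" in s.lower(),
]
known = {"the moon is very bright"}

print(lovelace_check("A quiet moon watches rivers", constraints, known))  # True
print(lovelace_check("the moon is very bright", constraints, known))      # False
```

Step 4 has no code equivalent here, since it is a judgment about the system's designers rather than a property of the artifact.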

Decision criteria

Purpose	Eval
Historical / philosophical context	Turing Test
Capability measurement	MMLU, GPQA, HumanEval, ARC-AGI
Reasoning / novelty	Lovelace 2.0, ARC-AGI
Embodiment / general intelligence	Coffee test, robot college
Safety / alignment	RealToxicityPrompts, MLCommons AILuminate

Default: capability + safety multi-benchmark. The Turing Test serves as a historical reference only.

🔗 Graph

Parent: Philosophy of AI
Variant: Imitation Game

🤖 LLM usage

When to use: AI history, philosophy-of-mind discussion, public communication. When not to use: actual capability measurement (use modern benchmarks).

❌ Anti-patterns

"GPT passed Turing → AGI": imitation ≠ general intelligence. Capability gaps remain.
Naive judge eval: verdicts from untrained users carry systematic bias.
Single-conversation pass: a 5-minute snapshot. Long-horizon coherence goes unmeasured.
Persona escape hatch: excusing weaknesses with "I'm a tired teenager".
Conflating with consciousness: the Turing Test measures behavior. It is not evidence of consciousness.

🧪 Verification / duplicates

Verified (Turing 1950, "Computing Machinery and Intelligence", Mind 59; Searle 1980, "Minds, Brains, and Programs"; Jones & Bergen 2024, arXiv 2405.08007; Riedl 2014, Lovelace 2.0).
Confidence: A.

🕓 Changelog

Date	Change
2026-05-08	Phase 1
2026-05-10	Manual cleanup: Turing Test history + 2024 Jones-Bergen pass + modern alternatives
