--- id: [[P-Reinforce|P-Reinforce]]-AUTO-HALL-001 category: "10_Wiki/πŸ’‘ Topics/AI" confidence_score: 0.98 tags: [auto-reinforced, hallucination, llm-issue, ai-safety, fact-checking, [[Alignment|Alignment]]] last_reinforced: 2026-04-20 --- # [[Hallucination (α„’α…ͺᆫᄀᅑᆨ)|Hallucination (ν™˜κ°)]] ## πŸ“Œ ν•œ 쀄 톡찰 (The Karpathy Summary) > "ν™•λ₯ μ΄ λ§Œλ“  κ·ΈλŸ΄μ‹Έν•œ 거짓말: AIκ°€ λ°©λŒ€ν•œ λ°μ΄ν„°μ˜ 톡계적 νŒ¨ν„΄μ—λ§Œ λ§€λͺ°λ˜μ–΄, 사싀 여뢀와 상관없이 λ¬Έλ²•μ μœΌλ‘œ μ™„λ²½ν•˜κ³  섀득λ ₯ μžˆλŠ” κ°€μ§œ 정보λ₯Ό μƒμ„±ν•˜μ—¬ μ‚¬μš©μžμ—κ²Œ μ‹¬κ°ν•œ ν˜Όλž€μ„ μ£ΌλŠ” μ§€λŠ₯의 λΆ€μž‘μš©." ## πŸ“– κ΅¬μ‘°ν™”λœ 지식 (Synthesized Content) ν™˜κ°(Hallucination)은 κ±°λŒ€ μ–Έμ–΄ λͺ¨λΈ(LLM)이 μ‘΄μž¬ν•˜μ§€ μ•ŠλŠ” μ‚¬μ‹€μ΄λ‚˜ 비논리적인 닡변을 μƒμ„±ν•˜λŠ” ν˜„μƒμž…λ‹ˆλ‹€. 1. **μ™œ λ°œμƒν•˜λŠ”κ°€?**: * **Probabilistic Nature**: λͺ¨λΈμ€ μ‹€μ œ 진리λ₯Ό μ•„λŠ” 게 μ•„λ‹ˆλΌ λ‹€μŒ 단어가 올 ν™•λ₯ λ§Œ 계산함. * **Confabulation**: λΆ€μ‘±ν•œ 정보λ₯Ό λ©”μš°κΈ° μœ„ν•΄ λ‡Œκ°€ 이야기λ₯Ό μ§€μ–΄λ‚΄λŠ” μΈκ°„μ˜ 심리 κΈ°μ œμ™€ μœ μ‚¬ν•¨. ([[Gestalt Psychology|Gestalt Psychology]]와 μ—°κ²°) * **Data [[Noise|Noise]]**: ν•™μŠ΅ 데이터 자체의 였λ₯˜λ‚˜ λͺ¨μˆœμ΄ 반영됨. ([[Data Cleaning Algorithms|Data Cleaning Algorithms]]의 ν•„μš”μ„±) 2. **ν•΄κ²° λ…Έλ ₯**: * **RAG (Retrieval-Augmented Generation)**: λ‹΅λ³€ μ „ μ™ΈλΆ€ 지식을 κ²€μƒ‰ν•˜μ—¬ κ·Όκ±°λ₯Ό μ œμ‹œ. * **Constitutional AI**: "λͺ¨λ₯΄λ©΄ λͺ¨λ₯Έλ‹€κ³  λ§ν•˜λΌ"λŠ” 원칙 μ£Όμž…. (Constitutional AI와 μ—°κ²°) ## ⚠️ λͺ¨μˆœ 및 μ—…λ°μ΄νŠΈ (Contradictions & RL Update) - **κ³Όκ±° λ°μ΄ν„°μ™€μ˜ 좩돌**: 초기 μœ μ € 정책은 AIλ₯Ό '검색 μ—”μ§„'처럼 λ―Ώμ—ˆμœΌλ‚˜, ν™˜κ° μ •μ±…μ˜ 싀체λ₯Ό μ•Œκ²Œ 된 ν˜„λŒ€ 정책은 AI의 닡변을 λ°˜λ“œμ‹œ μž¬κ²€μ¦ν•˜λŠ” 'λΉ„νŒμ  수용 μ •μ±…'으둜 변화함(RL Update). - **μ •μ±… λ³€ν™”(RL Update)**: ν™˜κ°μ„ λ‹¨μˆœν•œ '제거 λŒ€μƒ μ •μ±…'이 μ•„λ‹Œ, μ†Œμ„€ μ°½μž‘μ΄λ‚˜ 아이디어 발꡴ 같은 '창의적 μ˜μ—­ μ •μ±…'μ—μ„œλŠ” μœ μš©ν•œ 동λ ₯으둜 ν™œμš©ν•˜λŠ” μ—­λ°œμƒμ  μ ‘κ·Ό 정책도 λŒ€λ‘λ¨. ([[Computational Creativity|Computational Creativity]]와 μ—°κ²°) ## πŸ”— 지식 μ—°κ²° (Graph) - [[Constitutional AI (α„’α…₯ᆫᄇα…₯α†Έ AI)|Constitutional AI (ν—Œλ²• AI)]], [[Computational Creativity|Computational Creativity]], [[Ethics & AI|Ethics & AI]], Self-Correction, [[Data Cleaning Algorithms|Data Cleaning Algorithms]] - **Modern Tech/Tools**: RAG, Fact-checkers, Hallucination detection [[Benchmarks|Benchmarks]] (HaluEval). ---