---
id: P-REINFORCE-AI-TEST-TIME-COMPUTE
category: "10_Wiki/💡 Topics/AI"
confidence_score: 0.97
tags: [LLM, Inference, Scale, OpenAI-o1]
last_reinforced: 2026-04-20
---

# [[Test-Time Compute Scaling (추론 시간 계산 스케일링)]]

## 📌 One-Line Insight (The Karpathy Summary)

> "A model does not have to be bigger; let it think longer and it gets smarter." Beyond scaling at training time, this is a newer paradigm that raises accuracy by spending more compute (more reasoning steps) at inference time.

## 📖 Structured Knowledge (Synthesized Content)

- **The Concept**:
  - The earlier belief was that model size (parameter count) determines capability. Recent models such as OpenAI o1 showed that lengthening 'Self-Correction' and the reasoning process before answering can, on its own, let a model outperform much larger ones.
- **Methods**:
  - **Chain-of-Thought (CoT)**: Generate a long chain of intermediate steps.
  - **Search (MCTS)**: Explore and evaluate several alternative answers, then select the best path.
  - **Verification**: Have the model check its own output and retry if it is wrong.
- **Inference Law**: Even when training resources are limited, increasing inference-time compute can push performance past its previous ceiling.

## ⚠️ Contradictions and Updates (RL Update)

- As inference-time compute grows, cost (latency) climbs steeply. That can make the approach a poor fit for real-time chat, so the key challenge is efficiency: distinguish 'fast intuition (System 1)' from 'deliberate thinking (System 2)' and allocate compute according to task difficulty.

## 🔗 Knowledge Links (Graph)

- Related: [[Chain-of-Thought (CoT 사고 사슬)]], Monte Carlo Tree Search (MCTS)
- Origin: OpenAI-o1
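
## 🧪 Illustrative Sketch

The verify-and-retry method and the System 1 / System 2 budgeting idea in this note can be sketched on a toy task. This is a minimal sketch under stated assumptions, not how o1 or any production system works: `propose` is a hypothetical stand-in for one model sample (it guesses a factor of `n` at random), `verify` is a cheap checker, and `pick_budget` routes by input size as a crude, assumed proxy for difficulty. All names and thresholds are invented for illustration.

```python
import random
from typing import Optional, Tuple


def propose(n: int, rng: random.Random) -> int:
    """Hypothetical stand-in for one model sample: guess a factor of n (n > 3)."""
    return rng.randint(2, n - 1)


def verify(n: int, candidate: int) -> bool:
    """Cheap verifier: checking a factor is far easier than finding one."""
    return n % candidate == 0


def solve_with_budget(
    n: int, budget: int, rng: random.Random
) -> Tuple[Optional[int], int]:
    """Sample until the verifier accepts or the budget is exhausted.

    Returns (verified answer or None, number of samples spent).
    """
    for used in range(1, budget + 1):
        candidate = propose(n, rng)
        if verify(n, candidate):
            return candidate, used
    return None, budget


def pick_budget(n: int, fast: int = 1, slow: int = 64, threshold: int = 100) -> int:
    """Toy System 1 / System 2 router (assumed rule): small inputs get one
    sample, larger inputs get a bigger sampling budget."""
    return fast if n < threshold else slow


def success_rate(n: int, budget: int, trials: int = 2000, seed: int = 0) -> float:
    """Fraction of trials in which a verified answer is found within budget."""
    rng = random.Random(seed)
    hits = sum(
        solve_with_budget(n, budget, rng)[0] is not None for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    n = 91  # 7 * 13, so only two of the 89 possible guesses are correct
    for budget in (1, 64):
        print(f"budget={budget:3d}  success_rate={success_rate(n, budget):.3f}")
```

With independent samples that each succeed with probability p, at least one verified answer appears with probability 1 - (1 - p)^N. Accuracy therefore climbs with the sample budget N while the number of model calls grows with N as well, which is the trade-off a difficulty-based router is meant to manage.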