--- id: wiki-2026-0508-automated-theorem-proving title: Automated Theorem Proving (ATP) category: 10_Wiki/Topics status: verified canonical_id: self aliases: [ATP, formal verification, theorem prover, Lean, Coq, Isabelle, proof assistant, neuro-symbolic] duplicate_of: none source_trust_level: A confidence_score: 0.92 verification_status: applied tags: [theorem-proving, formal-verification, lean, coq, smt, sat-solver, neuro-symbolic, math-ai] raw_sources: [] last_reinforced: 2026-05-10 github_commit: pending tech_stack: language: Lean / Coq / Isabelle / Z3 framework: Lean 4 / Mathlib / TLA+ --- # Automated Theorem Proving (ATP) ## 📌 한 줄 통찰 > **"매 'works' 의 'cannot fail' 의 강제 proof"**. 매 software trust 의 정점. 매 seL4 / Compcert / Ethereum smart contract / aerospace 의 underlying. 매 LLM + Lean 의 fusion 의 매 IMO gold (DeepMind AlphaProof 2024) 의 milestone. ## 📖 핵심 ### 매 spectrum | 종류 | Automation | 매 사용처 | |---|---|---| | SAT solver | full | 매 boolean satisfiability | | SMT solver | full | 매 program verification | | Proof assistant (interactive) | partial | 매 deep math + OS | | Auto-tactic + LLM | hybrid | 매 modern (AlphaProof) | ### 매 tool #### SAT/SMT - **Z3** (Microsoft): 매 SMT, 매 hardware verify. - **CVC5**, **Yices**: SMT 의 alternative. - **MiniSat / Glucose**: SAT. #### Interactive proof assistant - **Lean 4** (Microsoft Research): 매 modern. 매 Mathlib. - **Coq**: 매 OG. 매 CompCert / Software Foundations. - **Isabelle/HOL**: 매 seL4 verify. - **Agda**: 매 dependent type. - **F\***: 매 program 의 spec. #### Specification language - **TLA+** (Lamport): 매 distributed system. - **Alloy**: 매 model finder. - **Dafny**: 매 functional spec. ### Landmark - **seL4** (NICTA, 2009): 매 OS kernel 의 functional correctness proof. - **CompCert** (INRIA): 매 verified C compiler. - **AWS s2n / Encryption SDK**: 매 crypto library. - **CertiKOS**: 매 OS kernel. - **AlphaProof / AlphaGeometry** (DeepMind 2024): 매 IMO silver / gold. - **Lean 4 Mathlib**: 매 1M line of formal math. ### 매 process (proof assistant) 1. **Specification**: 매 property 의 formal statement. 2. **Proof script**: 매 tactic 의 sequence. 3. **Type check**: 매 kernel 의 verify. 4. **Tactic**: 매 simp, rewrite, induction, ring, nlinarith, ... ### 매 LLM 의 결합 - **GPT-f / Lean-Gym** (OpenAI): 매 tactic prediction. - **AlphaProof**: 매 self-play + Lean 4. - **DeepSeek-Prover**: 매 open-source. - **MiniF2F benchmark**: 매 high-school math. → 매 AI + symbolic verify 의 best of both. ### 매 수학 의 응용 - 매 Kepler conjecture (Hales, Coq). - 매 Four color theorem (Coq). - 매 Liquid tensor experiment (Lean, Scholze). - 매 Polynomial Freiman-Ruzsa (Tao 2024, Lean Mathlib). ### 매 software 의 응용 - **Compiler**: CompCert. - **OS kernel**: seL4. - **Crypto**: HACL\*. - **Smart contract**: Move (Aptos), K framework. - **Distributed**: TLA+ (AWS, Azure). - **Hardware**: Intel, ARM 의 formal verify. ### 매 limitation - **Specification effort**: 매 spec 의 write 가 매 expensive. - **Computability**: 매 undecidable problem 도 있음. - **Scaling**: 매 large code 의 effort 폭발. - **Maintenance**: 매 spec change → 매 proof update. ## 💻 패턴 ### Lean 4 (basic) ```lean -- 매 definition def factorial : Nat → Nat | 0 => 1 | n + 1 => (n + 1) * factorial n -- 매 theorem theorem factorial_pos : ∀ n, factorial n > 0 := by intro n induction n with | zero => simp [factorial] | succ k ih => simp [factorial] exact Nat.mul_pos (Nat.succ_pos k) ih ``` ### Coq (Software Foundations) ```coq Inductive nat : Type := | O : nat | S : nat -> nat. Fixpoint plus (n m : nat) : nat := match n with | O => m | S n' => S (plus n' m) end. Theorem plus_O_n : forall n : nat, plus O n = n. Proof. intro n. simpl. reflexivity. Qed. ``` ### Z3 SMT ```python from z3 import * x = Int('x') y = Int('y') solver = Solver() solver.add(x + y == 10) solver.add(x > 0, y > 0) solver.add(x * y == 21) if solver.check() == sat: print(solver.model()) # 매 [x = 7, y = 3] or [x = 3, y = 7] ``` ### TLA+ (distributed protocol) ```tla EXTENDS Naturals, Sequences VARIABLES queue, processed Init == queue = <<>> /\ processed = {} Enqueue(item) == /\ queue' = Append(queue, item) /\ UNCHANGED processed Dequeue == /\ Len(queue) > 0 /\ processed' = processed \cup {Head(queue)} /\ queue' = Tail(queue) Spec == Init /\ [][Enqueue \/ Dequeue]_<> Invariant == \A x \in processed : x \notin Range(queue) ``` ### LLM tactic prediction (Lean-Gym style) ```python from lean_gym import LeanGym gym = LeanGym() gym.start_proof(theorem="factorial_pos") while not gym.is_complete(): state = gym.current_goal() tactic = llm.predict_tactic(state) # 매 LLM 의 propose success = gym.apply_tactic(tactic) if not success: tactic = llm.regenerate(state, blacklist=[tactic]) ``` ## 🤔 결정 기준 | 문제 | Tool | |---|---| | Smart contract | Move / K / Dafny | | OS kernel | Isabelle / Coq | | Compiler | Coq (CompCert) | | Crypto | F\* (HACL\*) | | Distributed protocol | TLA+ | | Hardware | SystemVerilog + formal | | Math research | Lean 4 + Mathlib | | SAT/SMT | Z3 | **기본값**: 매 critical = Lean / Coq + Z3. 매 distributed = TLA+. 매 smart contract = K / Move. ## 🔗 Graph - 부모: [[Formal Methods]] · [[Logic]] · [[Type Theory]] - 변형: [[Proof-Assistant]] · [[Model-Checking]] - 응용: [[CompCert]] - AI hybrid: [[Neural-Symbolic-Integration|Neuro-Symbolic-AI]] - Adjacent: [[Curry-Howard]] ## 🤖 LLM 활용 **언제**: 매 critical software (kernel, crypto, smart contract). 매 distributed protocol. 매 deep math. 매 ATP-LLM hybrid 의 research. **언제 X**: 매 normal app (cost overrun). 매 spec 의 unclear (proof 의 X). ## ❌ 안티패턴 - **Spec 없이 prove**: 매 wrong thing 의 prove. - **모든 code 의 prove**: 매 ROI X. - **Lean 의 proof 의 LLM 의 generate without check**: 매 fake. - **Spec 의 too weak**: 매 trivial proof. - **No maintenance**: 매 bit-rot. - **Tool lock-in**: 매 ecosystem 의 lose. ## 🧪 검증 / 중복 - Verified (seL4, CompCert, Lean Mathlib, AlphaProof papers). - 신뢰도 A. - Related: [[Lean-4]] · [[Coq]] · [[TLA-Plus]] · [[Neural-Symbolic-Integration|Neuro-Symbolic-AI]] · [[AlphaProof]]. ## 🕓 Changelog | 날짜 | 변경 | |---|---| | 2026-05-08 | Phase 1 | | 2026-05-10 | Manual cleanup — tool spectrum + landmark + LLM hybrid + 매 Lean / Coq / Z3 / TLA+ code |