Files
2nd/10_Wiki/Topics/AI_and_ML/Automated-Theorem-Proving.md
T
Antigravity Agent f8b21af4be Wiki cleanup: error-doc removal, dedup merge, link normalization
10_Wiki/Topics 대규모 정리:
- 오류 캡처/미완성 stub 문서 227개 제거
- 교차폴더 중복 43클러스터 병합 (63파일 → redirect)
- 링크명 정규화: 깨진 링크 수정·redirect 직결·개념 매핑 ~2,400건
- 카테고리 MOC 6개 신규 생성
- Graph 섹션 미해결 related-keyword 링크 10,058건 제거

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 23:52:15 +09:00

6.4 KiB

id, title, category, status, canonical_id, aliases, duplicate_of, source_trust_level, confidence_score, verification_status, tags, raw_sources, last_reinforced, github_commit, tech_stack
id title category status canonical_id aliases duplicate_of source_trust_level confidence_score verification_status tags raw_sources last_reinforced github_commit tech_stack
wiki-2026-0508-automated-theorem-proving Automated Theorem Proving (ATP) 10_Wiki/Topics verified self
ATP
formal verification
theorem prover
Lean
Coq
Isabelle
proof assistant
neuro-symbolic
none A 0.92 applied
theorem-proving
formal-verification
lean
coq
smt
sat-solver
neuro-symbolic
math-ai
2026-05-10 pending
language framework
Lean / Coq / Isabelle / Z3 Lean 4 / Mathlib / TLA+

Automated Theorem Proving (ATP)

📌 한 줄 통찰

"매 'works' 의 'cannot fail' 의 강제 proof". 매 software trust 의 정점. 매 seL4 / Compcert / Ethereum smart contract / aerospace 의 underlying. 매 LLM + Lean 의 fusion 의 매 IMO gold (DeepMind AlphaProof 2024) 의 milestone.

📖 핵심

매 spectrum

종류 Automation 매 사용처
SAT solver full 매 boolean satisfiability
SMT solver full 매 program verification
Proof assistant (interactive) partial 매 deep math + OS
Auto-tactic + LLM hybrid 매 modern (AlphaProof)

매 tool

SAT/SMT

  • Z3 (Microsoft): 매 SMT, 매 hardware verify.
  • CVC5, Yices: SMT 의 alternative.
  • MiniSat / Glucose: SAT.

Interactive proof assistant

  • Lean 4 (Microsoft Research): 매 modern. 매 Mathlib.
  • Coq: 매 OG. 매 CompCert / Software Foundations.
  • Isabelle/HOL: 매 seL4 verify.
  • Agda: 매 dependent type.
  • F*: 매 program 의 spec.

Specification language

  • TLA+ (Lamport): 매 distributed system.
  • Alloy: 매 model finder.
  • Dafny: 매 functional spec.

Landmark

  • seL4 (NICTA, 2009): 매 OS kernel 의 functional correctness proof.
  • CompCert (INRIA): 매 verified C compiler.
  • AWS s2n / Encryption SDK: 매 crypto library.
  • CertiKOS: 매 OS kernel.
  • AlphaProof / AlphaGeometry (DeepMind 2024): 매 IMO silver / gold.
  • Lean 4 Mathlib: 매 1M line of formal math.

매 process (proof assistant)

  1. Specification: 매 property 의 formal statement.
  2. Proof script: 매 tactic 의 sequence.
  3. Type check: 매 kernel 의 verify.
  4. Tactic: 매 simp, rewrite, induction, ring, nlinarith, ...

매 LLM 의 결합

  • GPT-f / Lean-Gym (OpenAI): 매 tactic prediction.
  • AlphaProof: 매 self-play + Lean 4.
  • DeepSeek-Prover: 매 open-source.
  • MiniF2F benchmark: 매 high-school math.

→ 매 AI + symbolic verify 의 best of both.

매 수학 의 응용

  • 매 Kepler conjecture (Hales, Coq).
  • 매 Four color theorem (Coq).
  • 매 Liquid tensor experiment (Lean, Scholze).
  • 매 Polynomial Freiman-Ruzsa (Tao 2024, Lean Mathlib).

매 software 의 응용

  • Compiler: CompCert.
  • OS kernel: seL4.
  • Crypto: HACL*.
  • Smart contract: Move (Aptos), K framework.
  • Distributed: TLA+ (AWS, Azure).
  • Hardware: Intel, ARM 의 formal verify.

매 limitation

  • Specification effort: 매 spec 의 write 가 매 expensive.
  • Computability: 매 undecidable problem 도 있음.
  • Scaling: 매 large code 의 effort 폭발.
  • Maintenance: 매 spec change → 매 proof update.

💻 패턴

Lean 4 (basic)

-- 매 definition
def factorial : Nat  Nat
  | 0 => 1
  | n + 1 => (n + 1) * factorial n

-- 매 theorem
theorem factorial_pos :  n, factorial n > 0 := by
  intro n
  induction n with
  | zero => simp [factorial]
  | succ k ih => 
    simp [factorial]
    exact Nat.mul_pos (Nat.succ_pos k) ih

Coq (Software Foundations)

Inductive nat : Type :=
  | O : nat
  | S : nat -> nat.

Fixpoint plus (n m : nat) : nat :=
  match n with
  | O => m
  | S n' => S (plus n' m)
  end.

Theorem plus_O_n : forall n : nat, plus O n = n.
Proof. intro n. simpl. reflexivity. Qed.

Z3 SMT

from z3 import *

x = Int('x')
y = Int('y')
solver = Solver()
solver.add(x + y == 10)
solver.add(x > 0, y > 0)
solver.add(x * y == 21)

if solver.check() == sat:
    print(solver.model())  # 매 [x = 7, y = 3] or [x = 3, y = 7]

TLA+ (distributed protocol)

EXTENDS Naturals, Sequences
VARIABLES queue, processed

Init == queue = <<>> /\ processed = {}

Enqueue(item) == 
  /\ queue' = Append(queue, item)
  /\ UNCHANGED processed

Dequeue ==
  /\ Len(queue) > 0
  /\ processed' = processed \cup {Head(queue)}
  /\ queue' = Tail(queue)

Spec == Init /\ [][Enqueue \/ Dequeue]_<<queue, processed>>

Invariant == \A x \in processed : x \notin Range(queue)

LLM tactic prediction (Lean-Gym style)

from lean_gym import LeanGym

gym = LeanGym()
gym.start_proof(theorem="factorial_pos")

while not gym.is_complete():
    state = gym.current_goal()
    tactic = llm.predict_tactic(state)  # 매 LLM 의 propose
    success = gym.apply_tactic(tactic)
    if not success: tactic = llm.regenerate(state, blacklist=[tactic])

🤔 결정 기준

문제 Tool
Smart contract Move / K / Dafny
OS kernel Isabelle / Coq
Compiler Coq (CompCert)
Crypto F* (HACL*)
Distributed protocol TLA+
Hardware SystemVerilog + formal
Math research Lean 4 + Mathlib
SAT/SMT Z3

기본값: 매 critical = Lean / Coq + Z3. 매 distributed = TLA+. 매 smart contract = K / Move.

🔗 Graph

🤖 LLM 활용

언제: 매 critical software (kernel, crypto, smart contract). 매 distributed protocol. 매 deep math. 매 ATP-LLM hybrid 의 research. 언제 X: 매 normal app (cost overrun). 매 spec 의 unclear (proof 의 X).

안티패턴

  • Spec 없이 prove: 매 wrong thing 의 prove.
  • 모든 code 의 prove: 매 ROI X.
  • Lean 의 proof 의 LLM 의 generate without check: 매 fake.
  • Spec 의 too weak: 매 trivial proof.
  • No maintenance: 매 bit-rot.
  • Tool lock-in: 매 ecosystem 의 lose.

🧪 검증 / 중복

🕓 Changelog

날짜 변경
2026-05-08 Phase 1
2026-05-10 Manual cleanup — tool spectrum + landmark + LLM hybrid + 매 Lean / Coq / Z3 / TLA+ code