---
id: wiki-2026-0508-probability-and-logic-fusion
title: Probability and Logic Fusion
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Probabilistic Logic, StaR-AI, Statistical Relational Learning, Neuro-Symbolic AI]
duplicate_of: none
source_trust_level: A
confidence_score: 0.85
verification_status: applied
tags: [neuro-symbolic, probabilistic-programming, knowledge-representation, reasoning]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: Pyro / PyMC / DeepProbLog / Scallop
---

# Probability and Logic Fusion

## One-line summary

> **"Unify symbolic logic (rules, KGs) with probability (uncertainty), and now with neural networks."** The lineage runs through the Statistical Relational Learning (SRL) work of the 1990s-2010s: PRMs, MLNs, ProbLog, and Bayesian networks combined with first-order logic (FOL). In the 2020s it was reborn as neuro-symbolic AI (DeepProbLog, Scallop, Logical Neural Networks, differentiable theorem provers). As of 2026, the same ideas feed into work on verifiable LLM reasoning.

## Core

### Problem statement
- **Logic alone**: brittle under noise, uncertainty, and exceptions.
- **Probability alone**: no compositional or relational structure.
- **Neural alone**: opaque, with no symbolic guarantees.
- **Goal**: a model that is compositional, handles uncertainty, and is learnable.

### Historical landmarks
- **Bayesian networks** (Pearl 1988): a DAG of conditional distributions.
- **PRMs** (Friedman et al. 1999): Bayesian networks over relational schemas.
- **Markov Logic Networks** (Richardson & Domingos 2006): FOL formulas with weights.
- **ProbLog** (De Raedt et al. 2007): probabilistic Prolog.
- **PSL** (Bach et al. 2017): soft logic with hinge-loss inference.
- **DeepProbLog** (Manhaeve et al. 2018): neural predicates inside ProbLog.
- **Logical Neural Networks** (Riegel et al. 2020, IBM).
- **Scallop** (Li et al. 2023): differentiable Datalog for ML.

### Representations
- **MLN**: weighted FOL formulas → ground Markov network.
  - P(world) ∝ exp(Σ_i w_i × n_i(world)), where n_i(world) is the number of true groundings of formula F_i in that world.
- **ProbLog**: Prolog clauses annotated with probabilities, e.g. `0.7::burglary.`
- **PSL**: soft truth values in [0,1]; conjunction is the Łukasiewicz t-norm.
- **DeepProbLog**: neural predicates, e.g. `nn(mnist_net, [X], Y, [0,1,2,3,4,5,6,7,8,9]) :: digit(X, Y).`

### Modern (2024-2026) directions
- **LLM + verifier** (Lean, Coq, Z3): generate → check → repair, in the style of AlphaProof and AlphaGeometry.
- **Differentiable logic**: gradients flow through soft-logic operators for end-to-end training.
- **Neuro-symbolic agents**: the LLM generates programs and a symbolic engine executes them.

## 💻 Patterns

### Markov Logic Network (PRACMLN-style)

```python
# Weighted first-order formulas as (weight, formula) pairs.
# Illustrative data only; this is not the pracmln API.
formulas = [
    (1.5, "Smokes(x) => Cancer(x)"),
    (1.1, "Friends(x,y) ^ Smokes(x) => Smokes(y)"),
]
# P(world) ∝ exp(Σ w * count_true_groundings)
# Inference: MC-SAT or Gibbs sampling over ground atoms.
```

### ProbLog example

```prolog
0.1 :: burglary.
0.2 :: earthquake.
alarm :- burglary.
alarm :- earthquake.
0.7 :: john_calls :- alarm.
query(burglary).
evidence(john_calls, true).
```

### DeepProbLog (neural predicate)

```prolog
% Recognize MNIST digits and add them.
% Python side (not shown): an MNIST_Net() instance is registered as mnist_net.
nn(mnist_net, [X], Y, [0,1,2,3,4,5,6,7,8,9]) :: digit(X, Y).
addition(X, Y, Z) :- digit(X, A), digit(Y, B), Z is A + B.
% Train: end-to-end gradient flows through the neural digit predicate
% from supervision on (image1, image2, sum_label).
```

### Pyro probabilistic program (Bayesian + structure)

```python
import pyro
import pyro.distributions as dist
import torch

def model(data):
    # Latent disease prevalence shared by all patients.
    p_disease = pyro.sample("p_disease", dist.Beta(1., 9.))
    for i, (test, _outcome) in enumerate(data):
        # Per-patient disease status: a discrete latent, so inference
        # needs enumeration or a sampler that handles discrete sites.
        d = pyro.sample(f"d_{i}", dist.Bernoulli(p_disease))
        # Logical rule: P(test+ | disease) = 0.95, P(test+ | not disease) = 0.1
        p_test = 0.95 * d + 0.1 * (1 - d)
        obs = torch.as_tensor(test, dtype=torch.float)
        pyro.sample(f"t_{i}", dist.Bernoulli(p_test), obs=obs)
```

### LLM + Z3 verifier (2024-2026 pattern)

```python
from z3 import Solver, Int, And, sat

def llm_solve_with_check(problem, llm, max_repairs=2):
    # `llm` is a placeholder: any callable mapping a prompt string to
    # Z3 Python code that adds constraints to a solver named `s`.
    code = llm(f"Translate to Z3 Python (add constraints to `s`): {problem}")
    for _ in range(max_repairs + 1):
        s = Solver()
        # exec runs untrusted generated code; sandbox it in real use.
        exec(code, {"s": s, "Int": Int, "And": And})
        if s.check() == sat:
            return s.model()
        code = llm(f"Z3 returned UNSAT. Repair: {code}")
    return None  # no satisfiable encoding found within the repair budget
```

### PSL soft-logic rule

```python
# Łukasiewicz t-norm: A ^ B = max(0, A + B - 1); A => B = min(1, 1 - A + B).
# Rule: similar(p,q) ^ likes(p, x) => likes(q, x)   [weight 5]
# Inference: minimize Σ w_i * max(0, body - head) over continuous truth values.
```

### Scallop differentiable Datalog

```python
import scallopy

# Schematic only: relation types and input_mapping are abbreviated here;
# check the scallopy documentation for the exact signatures.
ctx = scallopy.ScallopContext(provenance="difftopkproofs")
ctx.add_relation("digit", (int, float), input_mapping=[(0,), (1,), ...])
ctx.add_rule("sum(a + b) = digit(_, a), digit(_, b)")
# Plug into PyTorch; gradients flow through the proof structure.
```

## Decision criteria

| Situation | Approach |
|---|---|
| Discrete random variables, known structure | Bayesian network (pgmpy) |
| First-order rules + data | MLN / ProbLog |
| Soft constraints, large scale | PSL |
| Neural perception + symbolic reasoning | DeepProbLog / Scallop |
| LLM reasoning correctness | LLM + Z3/Lean verifier |
| Complex generative model | Pyro / PyMC |

**Default**: for neuro-symbolic work in 2026, Scallop or LLM + verifier; for pure SRL, ProbLog.
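
When weighing these approaches, it helps to see what exact inference over a probabilistic logic program actually computes. The following is a minimal sketch, in plain Python, of the possible-world semantics behind the burglary/alarm ProbLog program shown earlier. It is not how ProbLog itself works (ProbLog uses knowledge compilation rather than enumeration). The names `FACTS`, `derive`, `posterior`, and `calls_if_alarm` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Probabilistic facts from the ProbLog example. `calls_if_alarm` is a
# hypothetical name for the 0.7 annotation on the john_calls clause.
FACTS = {"burglary": 0.1, "earthquake": 0.2, "calls_if_alarm": 0.7}


def derive(world):
    """Apply the deterministic rules to one truth assignment of the facts."""
    alarm = world["burglary"] or world["earthquake"]
    john_calls = alarm and world["calls_if_alarm"]
    return {**world, "alarm": alarm, "john_calls": john_calls}


def posterior(query, evidence):
    """P(query | evidence) by summing the weights of all possible worlds."""
    names = list(FACTS)
    p_evidence = 0.0
    p_joint = 0.0
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        # Weight of a world: product over the independent probabilistic facts.
        weight = 1.0
        for name, value in world.items():
            weight *= FACTS[name] if value else 1.0 - FACTS[name]
        atoms = derive(world)
        # Keep only worlds consistent with the evidence.
        if all(atoms[a] == v for a, v in evidence.items()):
            p_evidence += weight
            if atoms[query]:
                p_joint += weight
    return p_joint / p_evidence


p = posterior("burglary", {"john_calls": True})
print(round(p, 4))  # 0.3571, i.e. 0.07 / 0.196
```

Enumeration visits 2^n worlds for n probabilistic facts, so this is only a sanity check for tiny programs; that exponential blow-up is the reason practical systems rely on knowledge compilation, lifted inference, or sampling.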

## 🔗 Graph
- Parent: [[Knowledge-Representation]] · [[Logic]]
- Variants: [[Bayesian-Network]]
- Applications: [[Neural-Symbolic-Integration|Neuro-Symbolic-AI]] · [[Knowledge-Graphs]]

## 🤖 LLM usage
**When**: domains with both structured rules and uncertainty, verifiable LLM reasoning, knowledge-graph completion with noisy data.
**When not**: pure pattern recognition (use a neural network), purely deterministic logic (use Prolog/Datalog).

## ❌ Anti-patterns
- **MLN at scale**: grounding explodes combinatorially; use lifted inference or PSL.
- **Treating arbitrary confidence scores as probabilities**: the numbers must reflect actual frequencies or coherent priors.
- **Mixing neural and symbolic parts without a gradient story**: end-to-end training requires a differentiable bridge between them.
- **Ignoring computational cost**: exact inference in many SRL models is #P-hard.

## 🧪 Verification / duplicates
- Verified against Pearl 1988; Richardson & Domingos 2006 (Machine Learning); De Raedt et al. 2007 (IJCAI); DeepProbLog (NeurIPS 2018); Scallop (PLDI 2023).
- Trust level A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full SRL → neuro-symbolic timeline + 2026 LLM+verifier |