Use Prolog to improve LLM's reasoning

Symbolic AI Revival & Historical Context

  • Several commenters note the similarity to 1980s “Fifth Generation” / expert-systems efforts and see a broader “renaissance of programming languages” and GOFAI techniques.
  • Others recall that symbolic AI previously hit hard limits and warn that an “AI winter” could repeat if expectations are unrealistic.

Why Combine LLMs with Prolog / Logic

  • Many see logic programming as a natural extension of Chain-of-Thought: externalizing reasoning steps into a formal, auditable program (“program-as-thought”).
  • Prolog is praised as both a logical formalism and computational language, good for expressing constraints, rules, and world models.
  • Declarative specs (Prolog, SQL, Datalog, Z3, etc.) plus LLMs are seen as especially promising for planning, verification, and complex querying.
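The “program-as-thought” idea above can be sketched with a toy Datalog-style forward chainer: facts and a rule are stated declaratively, and new facts are derived to a fixpoint rather than via free-form LLM text. The family-tree facts and the single `grandparent` rule are invented for illustration; a real pipeline would have the LLM emit the facts/rules and a proper engine (Prolog, Datalog, Z3) do the reasoning.

```python
# Minimal sketch of declarative reasoning, Datalog-style:
# derive all consequences of a rule set to a fixpoint, then query.
# Facts and the rule are illustrative only.
from itertools import product

FACTS = {("parent", "tom", "bob"), ("parent", "bob", "ann")}

def derive(facts):
    """Apply grandparent(X, Z) :- parent(X, Y), parent(Y, Z)."""
    derived = set()
    for (p1, x, y1), (p2, y2, z) in product(facts, facts):
        if p1 == p2 == "parent" and y1 == y2:
            derived.add(("grandparent", x, z))
    return derived

def fixpoint(facts):
    """Keep applying the rule until no new facts appear."""
    while True:
        new = derive(facts) - facts
        if not new:
            return facts
        facts = facts | new
```

Querying `("grandparent", "tom", "ann") in fixpoint(FACTS)` yields `True`; the key point is that the derivation is auditable, unlike an opaque chain-of-thought transcript.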

Skepticism & Limits

  • Strong pushback that Prolog is “not magic”: if the LLM misformalizes the problem, Prolog cannot fix it (“garbage in – Prolog out”).
  • Some argue success is likely cherry‑picked for puzzle‑like domains already well represented in training data.
  • Chain-of-Thought is criticized as often only helping when the prompter already knows the solution; cited work suggests fragile, domain‑specific gains.
  • Others call the belief that a non‑reasoning LLM can reliably write good Prolog “magical thinking,” noting Prolog is hard even for humans.

Practical Experiences & Tools

  • Reports of mixed results: some find GPT‑4 “doesn’t know Prolog well enough” for complex code, others show working pipelines (Prolog or Z3) for logic puzzles and logistics/planning tasks.
  • One real‑world case (clinical trial constraints → Prolog predicates → queries) claims dramatic accuracy improvements over pure LLM prompts.
  • There is interest in synthetic NL→Prolog datasets and better Prolog‑aware models.
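The clinical-trial pattern described above (free text → structured predicates → deterministic queries) can be sketched as follows. All field names and thresholds here are invented for illustration; in the reported setup, an LLM extracts the structured facts and the eligibility logic runs as reviewed Prolog predicates rather than as another LLM prompt.

```python
# Hedged sketch of the "LLM extracts facts, rules decide" pattern.
# Field names and cutoffs are hypothetical, not from any real trial.
from dataclasses import dataclass

@dataclass
class Patient:
    age: int
    egfr: float            # kidney function, mL/min (illustrative)
    on_anticoagulants: bool

def eligible(p: Patient) -> tuple[bool, list[str]]:
    """Deterministic eligibility check; returns a verdict plus reasons,
    so every rejection is auditable (unlike a bare LLM yes/no)."""
    reasons = []
    if not (18 <= p.age <= 75):
        reasons.append("age out of range 18-75")
    if p.egfr < 60:
        reasons.append("eGFR below 60")
    if p.on_anticoagulants:
        reasons.append("on anticoagulants")
    return (not reasons, reasons)
```

The division of labor is the point: the LLM handles the fuzzy language-to-structure step, while the constraint check itself is exact and explainable.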

Alternatives, Ecosystem & Adoption Issues

  • Related tech discussed: Datalog (e.g., CodeQL), constraint solvers (CLPFD, MiniZinc, Conjure), SMT (Z3), rules engines (Drools/RETE), theorem provers (Coq), and law‑oriented languages (Catala).
  • Prolog’s steep learning curve, tricky backtracking/termination, and historical baggage (expert systems hype, legal domain disappointments) are cited as reasons it never became mainstream.
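To make the CLP(FD)/MiniZinc style mentioned above concrete, here is a toy finite-domain constraint problem: 3-color a small map so neighboring regions differ. The map is invented; real solvers propagate and prune domains instead of enumerating, but the declarative shape (variables, domains, constraints) is the same.

```python
# Toy finite-domain constraint problem in the CLP(FD) spirit,
# solved by naive enumeration. The map is hypothetical.
from itertools import product

REGIONS = ["A", "B", "C", "D"]
NEIGHBORS = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
COLORS = ["red", "green", "blue"]

def solutions():
    """Yield every coloring where no two neighbors share a color."""
    for assignment in product(COLORS, repeat=len(REGIONS)):
        coloring = dict(zip(REGIONS, assignment))
        if all(coloring[x] != coloring[y] for x, y in NEIGHBORS):
            yield coloring
```

In Prolog with CLP(FD), or in MiniZinc, the same problem is stated as domains plus disequality constraints and the solver handles search; that separation of model from search strategy is what these bullets are pointing at.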