Use Prolog to improve LLM's reasoning

Symbolic AI Revival & Historical Context

  • Several commenters note the similarity to 1980s “Fifth Generation” / expert-systems efforts and see a broader “renaissance of programming languages” and GOFAI techniques.
  • Others recall that symbolic AI previously hit hard limits and warn that an “AI winter” could repeat if expectations are unrealistic.

Why Combine LLMs with Prolog / Logic

  • Many see logic programming as a natural extension of Chain-of-Thought: externalizing reasoning steps into a formal, auditable program (“program-as-thought”).
  • Prolog is praised as both a logical formalism and computational language, good for expressing constraints, rules, and world models.
  • Declarative specs (Prolog, SQL, Datalog, Z3, etc.) plus LLMs are seen as especially promising for planning, verification, and complex querying.
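The “program-as-thought” idea above can be sketched with a toy Datalog-style forward chainer: facts and a rule are stated declaratively, and new facts are derived to a fixpoint rather than via free-form LLM text. The family-tree facts and the single `grandparent` rule are invented for illustration; a real pipeline would have the LLM emit the facts/rules and a proper engine (Prolog, Datalog, Z3) do the reasoning.

```python
# Minimal sketch of declarative reasoning, Datalog-style:
# derive all consequences of a rule set to a fixpoint, then query.
# Facts and the rule are illustrative only.
from itertools import product

FACTS = {("parent", "tom", "bob"), ("parent", "bob", "ann")}

def derive(facts):
    """Apply grandparent(X, Z) :- parent(X, Y), parent(Y, Z)."""
    derived = set()
    for (p1, x, y1), (p2, y2, z) in product(facts, facts):
        if p1 == p2 == "parent" and y1 == y2:
            derived.add(("grandparent", x, z))
    return derived

def fixpoint(facts):
    """Keep applying the rule until no new facts appear."""
    while True:
        new = derive(facts) - facts
        if not new:
            return facts
        facts = facts | new
```

Querying `("grandparent", "tom", "ann") in fixpoint(FACTS)` yields `True`; the key point is that the derivation is auditable, unlike an opaque chain-of-thought transcript.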

Skepticism & Limits

  • Strong pushback that Prolog is “not magic”: if the LLM misformalizes the problem, Prolog cannot fix it (“garbage in – Prolog out”).
  • Some argue success is likely cherry‑picked for puzzle‑like domains already well represented in training data.
  • Chain-of-Thought is criticized as often only helping when the prompter already knows the solution; cited work suggests fragile, domain‑specific gains.
  • Others call the belief that a non‑reasoning LLM can reliably write good Prolog “magical thinking,” noting Prolog is hard even for humans.

Practical Experiences & Tools

  • Reports of mixed results: some find GPT‑4 “doesn’t know Prolog well enough” for complex code, others show working pipelines (Prolog or Z3) for logic puzzles and logistics/planning tasks.
  • One real‑world case (clinical trial constraints → Prolog predicates → queries) claims dramatic accuracy improvements over pure LLM prompts.
  • There is interest in synthetic NL→Prolog datasets and better Prolog‑aware models.
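The clinical-trial pattern described above (free text → structured predicates → deterministic queries) can be sketched as follows. All field names and thresholds here are invented for illustration; in the reported setup, an LLM extracts the structured facts and the eligibility logic runs as reviewed Prolog predicates rather than as another LLM prompt.

```python
# Hedged sketch of the "LLM extracts facts, rules decide" pattern.
# Field names and cutoffs are hypothetical, not from any real trial.
from dataclasses import dataclass

@dataclass
class Patient:
    age: int
    egfr: float            # kidney function, mL/min (illustrative)
    on_anticoagulants: bool

def eligible(p: Patient) -> tuple[bool, list[str]]:
    """Deterministic eligibility check; returns a verdict plus reasons,
    so every rejection is auditable (unlike a bare LLM yes/no)."""
    reasons = []
    if not (18 <= p.age <= 75):
        reasons.append("age out of range 18-75")
    if p.egfr < 60:
        reasons.append("eGFR below 60")
    if p.on_anticoagulants:
        reasons.append("on anticoagulants")
    return (not reasons, reasons)
```

The division of labor is the point: the LLM handles the fuzzy language-to-structure step, while the constraint check itself is exact and explainable.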

Alternatives, Ecosystem & Adoption Issues

  • Related tech discussed: Datalog (e.g., CodeQL), constraint solvers (CLPFD, MiniZinc, Conjure), SMT (Z3), rules engines (Drools/RETE), theorem provers (Coq), and law‑oriented languages (Catala).
  • Prolog’s steep learning curve, tricky backtracking/termination, and historical baggage (expert systems hype, legal domain disappointments) are cited as reasons it never became mainstream.
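To make the CLP(FD)/MiniZinc style mentioned above concrete, here is a toy finite-domain constraint problem: 3-color a small map so neighboring regions differ. The map is invented; real solvers propagate and prune domains instead of enumerating, but the declarative shape (variables, domains, constraints) is the same.

```python
# Toy finite-domain constraint problem in the CLP(FD) spirit,
# solved by naive enumeration. The map is hypothetical.
from itertools import product

REGIONS = ["A", "B", "C", "D"]
NEIGHBORS = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
COLORS = ["red", "green", "blue"]

def solutions():
    """Yield every coloring where no two neighbors share a color."""
    for assignment in product(COLORS, repeat=len(REGIONS)):
        coloring = dict(zip(REGIONS, assignment))
        if all(coloring[x] != coloring[y] for x, y in NEIGHBORS):
            yield coloring
```

In Prolog with CLP(FD), or in MiniZinc, the same problem is stated as domains plus disequality constraints and the solver handles search; that separation of model from search strategy is what these bullets are pointing at.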