pg_durable: Microsoft open sources in-database durable execution

Overview

  • Commenters see pg_durable as part of a broader 2026 trend of “Postgres as queue/orchestrator,” alongside other Postgres-based job and queue systems.
  • Discussion splits between those excited about “bring compute to the data” and those who strongly prefer orchestration and business logic in application code.

In-DB Durable Execution vs External Orchestrators

  • Supporters argue: if Postgres is the main stateful component, keeping workflows in-DB simplifies the stack, avoids extra infra (e.g., Temporal, Airflow), reduces latency/round-trips, and lets DB snapshots capture both data and workflow state.
  • Critics argue: for workflows that span multiple systems, external orchestrators and DAG schedulers (Airflow, Temporal-like systems) are a better fit, because they keep control flow in familiar languages and toolchains.

Use Cases and Benefits Mentioned

  • Long-running / resumable workflows, ETL/AI pipelines, cron-like jobs with exactly-once semantics and checkpoint replay.
  • Point-in-time restore that includes workflow state along with the data, useful for ETL and AI pipelines tightly coupled to DB state.
  • In-DB execution helps most when the HTTP calls and processing are data-centric and already target the database.
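
To make "checkpoint replay" concrete, here is a minimal Python sketch of the general idea. It is not pg_durable's implementation: the journal is a plain dict standing in for durable storage, and run_workflow, extract, and flaky_transform are hypothetical names invented for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical journal: step name -> saved result. In a real system this would
# live in durable storage (for example a database table), not a Python dict.
Journal = Dict[str, Any]


def run_workflow(steps: List[Tuple[str, Callable[[], Any]]], journal: Journal) -> List[Any]:
    """Run steps in order, skipping any step whose result is already journaled."""
    results = []
    for name, fn in steps:
        if name not in journal:
            # First successful execution of this step: run it and checkpoint the result.
            journal[name] = fn()
        # On replay, the saved result is reused and fn is not called again.
        results.append(journal[name])
    return results


calls: List[str] = []  # records which step bodies actually executed


def extract() -> int:
    calls.append("extract")
    return 10


def flaky_transform() -> int:
    calls.append("transform")
    if calls.count("transform") == 1:
        raise RuntimeError("simulated crash on the first attempt")
    return 20


steps = [("extract", extract), ("transform", flaky_transform)]
journal: Journal = {}

try:
    run_workflow(steps, journal)  # fails inside the second step
except RuntimeError:
    pass

final = run_workflow(steps, journal)  # resumes: extract is not re-run
print(final, calls)
```

In this sketch "exactly once" applies to the recorded result of each step. A side effect could still run more than once if a crash lands between the effect and the checkpoint write, which is why such systems usually ask for idempotent steps.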

Concerns and Criticisms

  • “Smells like stored procedures”: fear of hidden business logic, poor observability, scaling pressure centralized on the DB, limited IO and external API integration.
  • Developer experience issues: awkward SQL/DSL syntax, hard debugging, limited tooling versus mainstream languages and CI/CD flows.
  • Worry about loading already-hard-to-scale databases with long-running jobs.

Tooling, Versioning, and Testing

  • Multiple comments ask how to version, test, debug, and release durable functions.
  • Some argue all DB objects (functions, triggers) should already be in source control and migrations; others say most teams lack mature DB workflows, so keeping logic in code is safer.

How pg_durable Behaves (from thread)

  • Workflows are defined and started via df.start(...), returning an instance ID.
  • df.wait_for_signal is per-instance and “exactly once” within that instance; repeated df.start calls create new instances.
  • Timeouts and error handling surface as workflow state and failed instances, but detailed error semantics are still somewhat unclear to readers.
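
The instance and signal behavior described above can be modeled with a small in-memory toy in Python. This is one reading of the thread, not the extension's API: ToyEngine, send_signal, and every signature here are assumptions, and only the names start and wait_for_signal echo the df.start and df.wait_for_signal calls mentioned by commenters.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Instance:
    workflow: str
    pending: Dict[str, str] = field(default_factory=dict)  # signal name -> payload
    consumed: Set[str] = field(default_factory=set)        # signals already delivered


class ToyEngine:
    """In-memory stand-in for illustration only; NOT the pg_durable API."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.instances: Dict[int, Instance] = {}

    def start(self, workflow: str) -> int:
        # Every call creates a fresh instance, even for the same workflow name.
        instance_id = next(self._ids)
        self.instances[instance_id] = Instance(workflow)
        return instance_id

    def send_signal(self, instance_id: int, name: str, payload: str) -> bool:
        # Hypothetical helper: duplicate deliveries of a signal are ignored.
        inst = self.instances[instance_id]
        if name in inst.consumed or name in inst.pending:
            return False
        inst.pending[name] = payload
        return True

    def wait_for_signal(self, instance_id: int, name: str) -> Optional[str]:
        # Returns the payload if the signal has arrived, else None.
        # A real engine would suspend the workflow instead of returning None.
        inst = self.instances[instance_id]
        if name in inst.pending:
            inst.consumed.add(name)
            return inst.pending.pop(name)
        return None


engine = ToyEngine()
a = engine.start("approve_order")
b = engine.start("approve_order")           # same workflow, new instance

engine.send_signal(a, "approved", "yes")
engine.send_signal(a, "approved", "yes")    # duplicate, ignored

first = engine.wait_for_signal(a, "approved")
second = engine.wait_for_signal(a, "approved")
other = engine.wait_for_signal(b, "approved")
print(a, b, first, second, other)
```

The toy keeps signal state on the instance, so a signal sent to one instance is never visible to another instance of the same workflow, and a signal consumed once is not delivered again.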

Meta: Postgres, Extensions, and Rust

  • Side debate on whether Postgres is sufficiently extensible (many cite the rich extension ecosystem) versus calls for a Rust rewrite with a more modular architecture.
  • Some see in-DB orchestration as an instance of the inner-platform effect; others see it as a pragmatic reuse of a mature, extensible system.