Distributed systems programming has stalled

Burnout, culture, and observability

  • Several commenters echo the opening anecdote: modern distributed work often means chasing missing requests across many components, with poor observability and little organizational appetite to invest in better tools.
  • Some teams resist automation or tooling because manual debugging is seen as the “real work,” or as job security; others are simply burned out and defensive.
  • Where organizations do invest in structured logging, tracing, and APM, people report dramatic uptime improvements and lower stress, but also very high SaaS costs and frequent misconfiguration.
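A minimal sketch of the structured-logging approach those commenters describe: generate (or accept) a trace ID at the edge and attach it to every log line so a request can be followed across components. The field names and logger setup here are illustrative assumptions, not any specific APM product's API.

```python
import json
import logging
import uuid
from contextvars import ContextVar

# Trace ID for the current request; ContextVar survives async boundaries.
trace_id: ContextVar[str] = ContextVar("trace_id", default="-")

class TraceFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        # One JSON object per line so log aggregators can index the fields.
        return json.dumps({
            "trace_id": trace_id.get(),
            "level": record.levelname,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(TraceFormatter())
log = logging.getLogger("svc")
log.addHandler(handler)
log.setLevel(logging.INFO)

def handle_request() -> None:
    # At the edge: mint a trace ID (or take one from an inbound header).
    trace_id.set(uuid.uuid4().hex)
    log.info("request received")            # carries the same trace_id...
    log.info("calling downstream service")  # ...as every later line

handle_request()
```

The point of the exercise is that every log line for one request shares a `trace_id`, so "chasing a missing request" becomes a single indexed query rather than manual correlation.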

Embedded vs distributed work

  • Multiple people who switched from cloud/distributed back to embedded (often in Rust or C/C++) report higher satisfaction and a sense of control.
  • Others counter that embedded is also ugly: weak tooling, poor datasheets, non-existent remote observability, low pay, and heavy domain-specific math.
  • Several point out that modern embedded systems (cars, IoT platforms, power and battery controllers) are themselves complex distributed systems, just with different failure modes and buses.

Overuse of distributed systems & cloud-native

  • Strong sentiment that many companies adopt microservices, serverless, and Kubernetes without needing them, trading simple monoliths on powerful hardware for slower, costlier, more fragile systems.
  • Some argue distributed architectures are justified mainly by availability and organizational boundaries, not throughput, but admit that teams often underestimate the complexity and operational burden.

What “distributed system” really means

  • One camp says almost everything with a network connection is already a distributed system; the “rush” is just people finally recognizing that.
  • Another camp uses “distributed” to mean “multi-node, strongly coordinated architecture” and insists most businesses will never truly need that level of sophistication.

Difficulty, theory, and formal methods

  • Consensus that distributed systems are inherently hard, often compared to cryptography or judged even harder, because of the explosion of state, timing, and failure modes.
  • Some say the deep theory (Lamport, Paxos, clocks, Byzantine faults) exists and “solved” the fundamentals decades ago; the real gap is practical programming models and verification tools that ordinary engineers can apply.
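The gap those commenters describe is visible even in the simplest piece of that theory: Lamport's logical clocks fit in a few lines, yet threading them correctly through a real codebase is where teams struggle. A textbook sketch:

```python
class LamportClock:
    """Lamport logical clock: orders events without synchronized wall time."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # Local event: advance the clock.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Stamp an outgoing message with the sender's current time.
        return self.tick()

    def receive(self, msg_time: int) -> int:
        # On receive: jump past the sender's timestamp, then tick,
        # so the receive event is ordered after the send event.
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
stamp = a.send()    # a.time is now 1
b.receive(stamp)    # b.time is now 2: receive > send in logical order
```

This guarantees only a partial order (causally related events are ordered); it says nothing about concurrent events, which is exactly the kind of subtlety the "practical programming models" camp wants tooling to carry for ordinary engineers.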

Existing models and stalled innovation

  • Erlang/Elixir, actor models, Unison, X10 “places,” Bloom, choreographic programming, and projects like Hydro are cited as promising or existing answers, but none have gone mainstream.
  • Commenters debate “static-location” (actors/microservices) vs “external-distribution” (databases, queues) vs “arbitrary-location/durable execution” (workflows, Temporal). Each trades control, performance, and cognitive load differently.
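A rough illustration of the "static-location" end of that spectrum: an actor owns its state at a fixed place and processes mailbox messages one at a time, so no locks are needed. This is a queue-and-thread sketch of the idea, not the API of Erlang, Elixir, or any real actor framework.

```python
import queue
import threading

class Actor:
    """Minimal actor: private state, a mailbox, one message at a time."""

    def __init__(self) -> None:
        self.mailbox: queue.Queue = queue.Queue()
        self.count = 0  # example of state only this actor may touch
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        while True:
            msg = self.mailbox.get()
            if msg is None:      # poison pill: shut the actor down
                break
            self._handle(msg)

    def _handle(self, msg: int) -> None:
        # Messages are handled sequentially, so this needs no locking.
        self.count += msg

    def send(self, msg: int) -> None:
        self.mailbox.put(msg)

    def stop(self) -> None:
        self.mailbox.put(None)
        self._thread.join()

counter = Actor()
for _ in range(5):
    counter.send(1)
counter.stop()
# counter.count == 5
```

The trade-off the thread debates is visible here: the caller gains a simple mental model (send and forget), but the actor's location is fixed and its mailbox becomes the unit you must monitor, back-pressure, and make durable; "external-distribution" and durable-execution systems move those concerns elsewhere at the cost of control and performance.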

Education and skills gap

  • Many engineers never had a distributed systems course; several argue it should replace less broadly useful topics (like compilers) in standard curricula.
  • Others say most real expertise comes from on-the-job learning anyway; working through classic papers, books, and courses (e.g., DDIA, MIT's distributed systems course) is still rare.

LLMs and rising complexity

  • Some see distributed systems complexity about to spike further as LLMs become central components and even generate dynamic, non-repeatable code.
  • A few speculate LLMs might eventually help reason about or verify such systems, but for now they struggle even more with non-local, cross-component behavior.