Linux Pipes Are Slow

Role and Usefulness of Pipes

  • Commenters split sharply between “pipes are archaic, avoid them” and “pipes are core to Unix composability.”
  • Many value pipes for shell scripting: large scripts can be replaced with one-liners built from pipes and xargs, and modular Unix tools can be connected cheaply.
  • Others note pipes are not always appropriate, especially when latency, async I/O, or complex control are involved.

Performance, “Slowness,” and When It Matters

  • Linux pipes can push tens of GB/s; many commenters say that for most workloads they are not the bottleneck and are “fast enough,” the way a Corolla is fast enough next to a race car.
  • Some real-world cases (filesystems, storage frontends, high-throughput video pipelines) have hit pipe throughput or copying limits and moved to shared memory or other mechanisms.
  • Pipes are criticized for needless copying (data is copied into the kernel's pipe buffer on write and out again on read) compared with zero-copy designs, and for being slower than “long-distance” function calls or modern socket optimizations.

Nonblocking Semantics and Fragility

  • Confusion and correction around O_NONBLOCK: on Linux pipes, nonblocking is per file description; setting it on one end doesn’t alter semantics of the other.
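  • Common bug: a process flips nonblocking mode on a file description it shares with others (for example an inherited or dup'ed stdin/stdout), unexpectedly changing behavior for every process that holds it.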
  • Using nonblocking pipes with stdio (e.g., printf) is generally unsafe because stdio and most of its callers don’t handle EAGAIN or partial writes.

Kernel / Copy Details

  • Discussion of rep movsb vs SIMD: modern CPUs often accelerate rep movsb, but thresholds and best choices depend on CPU and data size.
  • Linux’s memcpy/memmove and thresholds are tuned and regularly updated; trade-off between peak speed and keeping code small and branch-light.
  • Some puzzling details in the kernel disassembly are explained by retpoline/CONFIG_RETHUNK mitigations and by SMAP (CLAC/STAC) patching.

Proposed Improvements and Alternatives

  • Proposal: a syscall that exposes ring buffers for file descriptors, including pipes, possibly mapped on both ends for zero-copy and usable with poll or futexes.
  • Concerns: more complex user-space semantics, potential for brittle behavior if not carefully designed, but others say it’s similar to shared memory + eventfd.
  • Suggestions to benchmark io_uring-based designs and domain sockets, especially for high-throughput video workflows.

Economics and Philosophy of Optimization

  • Debate over whether shaving a few percent off ubiquitous primitives is “worth it.”
  • One side: micro-optimizing pipes is premature for most users and adds complexity.
  • Other side: small, widespread gains compound globally (time, energy, emissions); optimizing core primitives is justified even if individual benefits are invisible.