Async hazard: MMAP is blocking IO

What “blocking” means here

  • Major subthread on terminology: some equate “blocking” with any synchronous operation; others reserve it for when a thread is descheduled (e.g., waiting on I/O or kernel events).
  • In async/cooperative runtimes, “blocking” is mainly about when control returns to the executor; a memory access that may trigger disk I/O is problematic if it cannot yield.
  • Some commenters call every memory read “blocking,” which causes confusion; others argue that reads served from cache are synchronous but not meaningfully “blocking” for async design.

mmap and async runtimes

  • mmap makes file data look like memory, but a page fault can incur disk-level latency that is invisible to the async scheduler.
  • With cooperative scheduling, a page fault stalls every task queued on that executor thread, even though the kernel blocks only the one faulting OS thread.
  • This is framed by some as a limitation of mmap, by others as a fundamental tradeoff of cooperative user‑space scheduling.
  • Some suggest OS or language features (userfaultfd, async “prefetch” or async memcpy, madvise/io_uring patterns) to separate “schedule I/O” from “consume data,” but these are nontrivial.
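
One way to picture the "schedule I/O" versus "consume data" split is to skip the mapping and issue an explicit read on a worker thread, so the wait happens off the event loop. The sketch below is a minimal illustration in Python's asyncio, not something proposed in the thread; `read_range` is a hypothetical helper name, and it relies on `asyncio.to_thread` (Python 3.9+) and `os.pread` (POSIX only).

```python
import asyncio
import os


async def read_range(path: str, offset: int, length: int) -> bytes:
    """Hypothetical helper: fetch bytes without stalling the event loop."""
    # os.open can itself block briefly; a stricter version would offload it too.
    fd = os.open(path, os.O_RDONLY)
    try:
        # "Schedule I/O": the pread runs on a worker thread, so any disk wait
        # happens there while other coroutines keep running on the loop.
        data = await asyncio.to_thread(os.pread, fd, length, offset)
    finally:
        os.close(fd)
    # "Consume data": the caller gets ordinary bytes that cannot page-fault
    # against the file later.
    return data
```

Usage would look like `asyncio.run(read_range("data.bin", 0, 4096))`, where the file name is a placeholder. The cost is an extra copy and a thread hop per read, which is the tradeoff mmap was meant to avoid.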

Performance characteristics & tuning

  • mmap is powerful but has a wide gap between best and worst case; this complicates benchmarking and tail‑latency control.
  • Sequential preloading or MAP_POPULATE, MADV_* hints, and mlock/MAP_LOCKED can help, but are not guarantees.
  • Random access patterns could yield even worse behavior than shown in the article.
  • For some workloads (e.g., preloading at startup when memory is ample), mmap can be effectively non‑blocking in practice.
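
The preloading idea can be sketched as follows, assuming a Unix-like system and Python 3.8+ (where `mmap.madvise` exists on platforms that support it). `map_and_prefault` is a hypothetical name. The hint and the page-touching loop only raise the odds that later accesses hit resident pages; the kernel may still evict them under memory pressure.

```python
import mmap
import os


def map_and_prefault(path):
    """Hypothetical helper: map a file read-only and touch every page once."""
    fd = os.open(path, os.O_RDONLY)
    try:
        mm = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
    finally:
        os.close(fd)  # the mapping keeps its own reference to the file

    # Advisory hint; guarded because not every platform exposes it.
    if hasattr(mm, "madvise") and hasattr(mmap, "MADV_WILLNEED"):
        mm.madvise(mmap.MADV_WILLNEED)

    # Touch one byte per page so faults happen now, at startup, rather than
    # later inside latency-sensitive code.
    pages = 0
    for offset in range(0, len(mm), mmap.PAGESIZE):
        mm[offset]
        pages += 1
    return mm, pages
```

Running this once at startup, before the event loop begins serving requests, matches the "preload when memory is ample" case described above.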

Error handling and reliability

  • mmap can turn latent I/O or media errors into SIGBUS at arbitrary code locations, not just explicit read/write calls.
  • On POSIX, SIGBUS is the main failure mode for a mapped access; commenters contrast it with the -EIO/-ENOMEM error codes that explicit syscalls return.
  • Debate over practicality: some say most programs would treat either as fatal; others argue explicit errors from read/write are easier to contextualize and handle gracefully than signal‑based failures.
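
To make the "easier to contextualize" argument concrete, here is a minimal sketch of an explicit read that reports what it was reading and where. `read_exact` and its `what` label are hypothetical names, and `os.pread` is POSIX only. A failure surfaces as an exception at the call site, whereas the same condition on a mapped access would arrive as SIGBUS wherever the memory happened to be touched.

```python
import os


def read_exact(fd: int, offset: int, length: int, what: str) -> bytes:
    """Hypothetical helper: read exactly `length` bytes or fail with context."""
    try:
        data = os.pread(fd, length, offset)
    except OSError as exc:
        # An I/O error (for example EIO) arrives here, next to the code that
        # knows which logical object was being read.
        raise RuntimeError(
            f"reading {what} at offset {offset} failed: {exc.strerror}"
        ) from exc
    if len(data) != length:
        # Sketch only: any short read is treated as truncation. A fuller
        # version would loop until EOF before giving up.
        raise EOFError(
            f"{what}: wanted {length} bytes at offset {offset}, got {len(data)}"
        )
    return data
```

A caller can catch `EOFError` or `RuntimeError` per request and fail only that request, which is hard to arrange from inside a signal handler.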

Threads vs async

  • One camp sees this as an argument for “just use threads”: preemptive scheduling naturally hides page faults.
  • Others counter that async brings substantial benefits for I/O‑bound servers and can be made resilient if runtimes detect blocked workers and inject more threads (examples discussed from other ecosystems).
  • General advice: don’t casually mix mmap with cooperative async unless you understand the interaction; mmap is viewed as an expert‑level tool.
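
The "detect blocked workers" idea can be approximated with a watchdog thread that notices when the event loop stops ticking. The sketch below covers detection only and does not inject replacement threads as the runtimes mentioned in the discussion reportedly do. `LoopWatchdog` and `demo` are hypothetical names, and `time.sleep` stands in for a slow page fault.

```python
import asyncio
import threading
import time


class LoopWatchdog:
    """Hypothetical helper: counts periods where the loop stopped ticking."""

    def __init__(self, threshold: float = 0.1) -> None:
        self.threshold = threshold
        self.last_tick = time.monotonic()
        self.stalls = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._watch, daemon=True)

    def start(self) -> None:
        self.last_tick = time.monotonic()
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    async def heartbeat(self, interval: float = 0.01) -> None:
        # Runs on the event loop; it only advances while the loop is responsive.
        while not self._stop.is_set():
            self.last_tick = time.monotonic()
            await asyncio.sleep(interval)

    def _watch(self) -> None:
        # Runs on its own OS thread, so it keeps going while the loop is stuck.
        stalled = False
        while not self._stop.wait(self.threshold / 4):
            late = time.monotonic() - self.last_tick > self.threshold
            if late and not stalled:
                self.stalls += 1  # count each stall once, when it begins
            stalled = late


async def demo(block_for: float = 0.3) -> int:
    dog = LoopWatchdog(threshold=0.1)
    dog.start()
    beat = asyncio.create_task(dog.heartbeat())
    await asyncio.sleep(0.05)   # loop is healthy, heartbeat ticks
    time.sleep(block_for)       # stand-in for a page fault stalling the loop
    await asyncio.sleep(0.05)
    dog.stop()
    await beat
    return dog.stalls
```

Calling `asyncio.run(demo())` should report at least one stall. Note that `time.sleep` releases CPython's global interpreter lock; a real fault taken while the lock is held may stall the watchdog thread as well, so this is an illustration of the idea rather than a dependable detector.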