Too Many Open Files

Debate: Are file descriptor limits still justified?

  • One camp argues that per‑process FD caps are arbitrary, are poor proxies for the real resources at stake (memory, CPU), and distort program design; they would rather limit kernel memory or other real resources directly.
  • Others insist limits are essential to contain buggy or runaway programs, especially on multi‑user systems, and prefer “too many open files” to a frozen machine.
  • There’s philosophical tension between “everything is a file” and “you can only have N files open,” with some seeing limits as legacy relics and others as necessary quotas.

Historical and kernel-level reasons

  • Early UNIX likely used fixed‑size FD tables; simple arrays are easier to implement and reason about.
  • Kernel memory for FD state isn’t swappable, so unconstrained growth can have nastier OOM behavior than userland leaks.
  • FD limits also act as a guardrail against FD leaks: hitting the cap surfaces the bug as an explicit error instead of slow, invisible kernel‑memory growth (see the sketch after this list).
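
A minimal sketch of that guardrail in action (Rust, Unix‑only; /dev/null as the leaked file is just an illustrative choice): deliberately leaking descriptors hits the cap quickly and turns a silent leak into an explicit "Too many open files" error.

    use std::fs::File;

    fn main() {
        // Hold every handle so nothing is dropped and closed: a deliberate FD leak.
        let mut leaked: Vec<File> = Vec::new();
        loop {
            match File::open("/dev/null") {
                Ok(f) => leaked.push(f),
                Err(e) => {
                    // Once RLIMIT_NOFILE is exhausted, open() fails with EMFILE
                    // ("Too many open files"), surfacing the leak immediately.
                    eprintln!("open failed after {} leaked FDs: {}", leaked.len(), e);
                    break;
                }
            }
        }
    }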

Real-world needs and bugs

  • Many modern workloads legitimately need tens or hundreds of thousands of FDs: high‑connection frontends, Postgres, nginx, Docker daemons, IDEs, recursive file watchers, big test suites.
  • People share war stories of FD leaks (e.g., a missing fclose, leaked sockets) causing random failures, empty save files, or failures that only appear on large inputs; a minimal leak‑check sketch follows this list.
  • VSCode’s higher internal FD limit masked leaks that only surfaced when the same code ran under a normal shell’s lower limit.
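
One low‑tech way to catch such leaks before they cause failures, sketched below under the assumption of a Linux system (where /proc/self/fd lists one symlink per open descriptor), is to compare FD counts around code that should not retain descriptors. The check sites and the message are illustrative, not a fixed recipe.

    use std::fs;

    /// Count this process's open descriptors by listing /proc/self/fd (Linux-specific).
    fn open_fd_count() -> std::io::Result<usize> {
        // The count includes the descriptor read_dir itself uses, which is fine
        // as long as the before/after measurements are taken the same way.
        Ok(fs::read_dir("/proc/self/fd")?.count())
    }

    fn main() -> std::io::Result<()> {
        let before = open_fd_count()?;
        // ... run work that should not retain any descriptors ...
        let after = open_fd_count()?;
        if after > before {
            eprintln!("possible FD leak: {} -> {} open descriptors", before, after);
        }
        Ok(())
    }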

APIs, select(), and FD_SETSIZE

  • A major practical constraint is the classic select() API: glibc fixes FD_SETSIZE (typically 1024) for fd_set, and using a descriptor numbered at or above it with the FD_* macros is undefined behavior, so code still relying on select() breaks once FDs climb that high.
  • Man pages now explicitly recommend poll, epoll, or platform‑specific multiplexers instead of select; a minimal poll‑based sketch follows this list.
  • People describe hacks to avoid third‑party libraries’ select() limits (e.g., pre‑opening dummy FDs so “real” FDs stay below 1024).
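
A minimal poll‑based sketch (Rust with the libc crate, Unix‑only; watching stdin with a 5‑second timeout is just an example): poll() takes a caller‑sized array of pollfd, so unlike select() it has no FD_SETSIZE ceiling and works for descriptors numbered 1024 and above.

    // Wait for stdin (fd 0) to become readable using poll() instead of select().
    fn main() {
        let mut fds = [libc::pollfd {
            fd: 0,                // any descriptor number works, however large
            events: libc::POLLIN, // interested in readability
            revents: 0,
        }];
        let n = unsafe { libc::poll(fds.as_mut_ptr(), fds.len() as libc::nfds_t, 5_000) };
        match n {
            -1 => eprintln!("poll failed: {}", std::io::Error::last_os_error()),
            0 => println!("timed out with nothing to read"),
            _ => println!("fd {} is readable (revents = {:#x})", fds[0].fd, fds[0].revents),
        }
    }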

OS-specific behavior

  • macOS is criticized for very low defaults and undocumented extra limits for sandboxed apps; raising kernel sysctls has caused instability for some.
  • Linux’s default soft limit of 1024 is widely considered too low for modern machines; values like 128k or 1M are seen as reasonable on servers.
  • Windows handles are contrasted: they cover more object types and are effectively limited by available memory rather than by a small hard per‑process cap.

Proposed practices and tooling

  • Common advice: raise the soft limit to the hard limit at program startup (the Go runtime does this automatically since Go 1.19, and a similar Rust snippet was shared; see the sketch after this list), and configure higher hard limits on servers.
  • Others caution this is a band‑aid unless you first understand why so many FDs are needed.
  • Tools like lsof, fstat, and htop help inspect FD usage, though lsof’s noisy output is criticized.
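
A minimal sketch of the "raise the soft limit at startup" step, assuming a Linux target and the libc crate (crates such as rlimit wrap the same calls; macOS can report an "infinite" hard limit while enforcing a lower kernel cap, so it would need extra clamping). As noted above, this is a band‑aid unless the FD usage itself is understood.

    /// Raise the soft RLIMIT_NOFILE up to whatever hard limit the administrator
    /// configured, and return the new soft limit.
    fn raise_nofile_soft_limit() -> std::io::Result<libc::rlim_t> {
        unsafe {
            let mut lim = libc::rlimit { rlim_cur: 0, rlim_max: 0 };
            if libc::getrlimit(libc::RLIMIT_NOFILE, &mut lim) != 0 {
                return Err(std::io::Error::last_os_error());
            }
            // Raising the *hard* limit needs privileges; lifting the soft limit
            // up to rlim_max is always allowed for an unprivileged process.
            lim.rlim_cur = lim.rlim_max;
            if libc::setrlimit(libc::RLIMIT_NOFILE, &lim) != 0 {
                return Err(std::io::Error::last_os_error());
            }
            Ok(lim.rlim_cur)
        }
    }

    fn main() {
        match raise_nofile_soft_limit() {
            Ok(n) => eprintln!("RLIMIT_NOFILE soft limit is now {}", n),
            Err(e) => eprintln!("could not raise RLIMIT_NOFILE: {}", e),
        }
    }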