Pnut: A C to POSIX shell compiler you can trust

Project goals and motivation

  • Pnut is a C-to-POSIX-shell compiler that produces human-readable scripts.
  • Main stated goal: help with “trusting trust” concerns and bootstrappable build chains (e.g., bootstrap Pnut itself, then a native backend, then TCC, then GCC) using only a POSIX shell and source.
  • Some commenters see it as clever but pointed in the opposite direction from what they want: they would rather compile C to portable binaries (e.g., with other toolchains).
  • Others value it as an exploration of Unix “shell as glue” and as a conceptual demo of what POSIX sh can do.

Implementation approach and language subset

  • Uses only POSIX shell builtins (primarily read and printf), no external utilities, to maximize portability.
  • Memory is modeled via many numbered variables (_0, _1, …) and arithmetic expansion, since POSIX sh lacks arrays.
  • All compiler-generated variables hold numbers only, so code often omits quoting; this conflicts with common shell best practices and tools like ShellCheck.
  • C support is a restricted subset: unsigned types, static variables, and arrays are missing or limited, as are glob.h and many libc and POSIX APIs (e.g., open modes, socket, lseek, mmap, pthread, setjmp, dlopen).
  • Some constructs “compile” to calls like _glob or _socket that are not implemented.
  • Pointers are mapped onto the same underlying representation as integers; parameter types like int vs int* are not distinguished.
  • Wrapping arithmetic and precise C undefined behavior are not modeled.
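
The numbered-variable memory model described above can be sketched in a few lines of POSIX sh. This is only an illustration under stated assumptions, not Pnut's actual runtime: the helper names store and load, the result variable __res, and the base address 100 are all hypothetical. It emulates a three-element C array by treating cell N as the shell variable _N.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not taken from Pnut's runtime.
# Memory cell N lives in the shell variable _N.

# store ADDR VALUE: write VALUE into cell ADDR.
store() {
  : $((_$1 = $2))
}

# load ADDR: copy cell ADDR into the global __res (no subshell needed).
load() {
  : $((__res = _$1))
}

# Rough equivalent of: int a[3]; a[0] = 10; a[1] = 20; a[2] = a[0] + a[1];
base=100                      # assumed address of a[0]
store $((base + 0)) 10
store $((base + 1)) 20
load $((base + 0)); x=$__res
load $((base + 1)); y=$__res
store $((base + 2)) $((x + y))

load $((base + 2))
printf '%s\n' "$__res"
```

Because an "address" here is just an integer used to build a variable name, a pointer and an int are indistinguishable at runtime, which is consistent with the point above about int and int* parameters not being told apart.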

I/O and binary data

  • Examples include base64 and SHA-256 implemented within the constraints.
  • Input is read via read -r, which cannot handle NUL bytes; the authors acknowledge that the base64 example does not support full binary input.
  • Output of arbitrary bytes is possible using printf, enabling an x86 backend, but robust binary I/O in shell remains a limitation.
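
The output side can be illustrated with a small sketch. It assumes a hypothetical helper named put_byte (not a name from Pnut) that converts a byte value to a three-digit octal escape using arithmetic expansion only, then hands it to the printf builtin.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not taken from Pnut's runtime.
# put_byte N: write the single byte with value N (0-255) to stdout.
# The value is split into three octal digits so printf sees e.g. \110.
put_byte() {
  printf "\\$(($1 / 64))$(($1 / 8 % 8))$(($1 % 8))"
}

# Writes the bytes 0x48 0x69 0x0A ("Hi" followed by a newline).
put_byte 72
put_byte 105
put_byte 10
```

The input side has no equally simple counterpart: a shell variable cannot hold a NUL byte, so whatever read -r stores has already lost that information.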

Performance and shell differences

  • Using a very large number of variables can be slow in some shells (e.g., dash does a linear lookup over many variables), but the authors report acceptable times for bootstrapping Pnut itself.
  • Benchmarks shared: for compiling pnut.c with pnut.sh, ksh is fastest, dash somewhat slower, bash slower still, and zsh much slower.
  • Subshells are noted as a major bottleneck; the runtime library tries to avoid them.
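
One common way to avoid subshells, sketched here as an assumption about the general technique rather than a description of Pnut's runtime, is to return function results through a global variable instead of capturing them with $(...). The names add_sub, add_var, and __res are hypothetical.

```shell
#!/bin/sh
# Hypothetical functions for illustration; not taken from Pnut's runtime.

# Subshell style: the caller captures the result with $(...), which in
# many shells forks a new process for every call.
add_sub() {
  printf '%s\n' $(($1 + $2))
}

# Global-variable style: the result is left in __res, so no fork occurs.
add_var() {
  : $((__res = $1 + $2))
}

# Sum 0..4 using the subshell-free variant.
total=0
i=0
while [ "$i" -lt 5 ]; do
  add_var "$total" "$i"
  total=$__res
  : $((i = i + 1))
done
printf '%s\n' "$total"
```

Both variants compute the same value; the difference is only in how the result travels back to the caller, which is what matters inside a hot loop.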

Usefulness vs. practicality

  • Enthusiasts like that it stretches what POSIX sh can do and fits into bootstrapping/Stage0/bootstrappable-builds efforts.
  • Skeptics question why anyone would want to write C for shell-like tasks, or produce slower, less capable shell code instead of portable binaries.
  • Several argue most nontrivial shell scripts should instead be written in higher-level languages (Python, Rust, etc.) for maintainability and debuggability.
  • Debates spill into build systems: whether having a dedicated DSL (make, CMake, Meson) is better than using C itself; opinions are strongly split.

Trust, security, and messaging

  • The “you can trust” tagline is widely criticized as marketing; readers say being told to trust something makes them more suspicious.
  • Others connect the “trust” framing explicitly to Ken Thompson’s “Reflections on Trusting Trust” and double-diverse compiling, arguing that human-auditable shell output from multiple independent shells can improve confidence.
  • Some note that trust still ultimately rests on the shell implementation and environment.

Critiques and open issues

  • Generated scripts trigger many ShellCheck warnings and errors; some are acknowledged as analyzer limitations, others less clearly so.
  • Error handling is often missing or oversimplified: in examples like cp and cat, writes are not checked for errors or partial writes.
  • Some operations emit explicit runtime “unknown mode” errors instead of implementing full behavior.
  • Code with undeclared identifiers reportedly compiles without diagnostics, suggesting weak or absent semantic checking.
  • Several commenters argue the project should more clearly document its C subset and limitations; others see the current state as an impressive but incomplete prototype.