Linux eliminates the strncpy API after six years of work, 360 patches

Null-Terminated Strings vs Length-Prefixed / “Real” String Types

  • Many argue NUL-terminated C strings are one of computing’s worst design choices: easy to overflow, hard to reason about, and prone to performance traps (e.g., calling strlen inside a loop turns an O(N) pass into O(N²)).
  • Others note they were a pragmatic trade-off on tiny 1970s machines; sentinel-terminated sequences (strings, pointer arrays, linked lists) were cheap and matched assembly practice.
  • Pascal-style or length-prefixed strings avoid missing-terminator bugs but bring their own issues: fixed-size length fields (a one-byte prefix historically capped strings at 255 characters), ambiguity over whether the length counts bytes, code points, or glyphs, and difficulty taking substrings without copying unless you move to “fat pointers” (pointer + length).
  • Modern variants discussed: D’s struct { size_t length; T* ptr; }, BSTRs and Pascal/Delphi/Free Pascal headers, variable-length encodings of the length field, and “span”/slice types. The trade-offs are memory overhead versus speed versus zero-copy slicing.
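
The pointer + length idea can be sketched in C as follows. This is a minimal illustration, not code from the thread: the names str_slice, slice_from_cstr, slice_sub, and slice_eq are hypothetical, and the length is assumed to count bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical "fat pointer": a pointer plus a byte length, no terminator needed. */
struct str_slice {
    const char *ptr;
    size_t len;
};

/* Wrap a NUL-terminated string; the O(N) strlen scan happens once, here. */
static struct str_slice slice_from_cstr(const char *s)
{
    struct str_slice out = { s, strlen(s) };
    return out;
}

/* Zero-copy substring [start, end): only the pointer and length change. */
static struct str_slice slice_sub(struct str_slice s, size_t start, size_t end)
{
    struct str_slice out;

    assert(start <= end && end <= s.len);
    out.ptr = s.ptr + start;
    out.len = end - start;
    return out;
}

/* The length is known up front, so comparison never searches for a terminator. */
static int slice_eq(struct str_slice a, struct str_slice b)
{
    return a.len == b.len && memcmp(a.ptr, b.ptr, a.len) == 0;
}
```

With these helpers, slice_sub(slice_from_cstr("hello, world"), 7, 12) refers to the bytes of "world" inside the original buffer without copying them. The cost is that a slice is generally not NUL-terminated, so it cannot be passed directly to APIs that expect a C string.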

NULL, NUL, and Option Types

  • The thread repeatedly distinguishes NUL (the zero byte used as a string terminator) from NULL (the null pointer).
  • Some see NULL as fine and necessary; in their view the real problem is that C offers no way to declare a pointer as non-null.
  • Others emphasize modern “option”/sum types (e.g., Option<T>) as the right way to represent “unset” values, enforced by the type system at compile time.
  • There is debate over whether low-level environments can or should replace NULL with such constructs, or should instead treat them as a higher-level abstraction that compiles down to nullable representations.
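
To make the “option” idea concrete, here is a rough C sketch of the shape such a type takes. The names opt_size, find_char, and opt_size_or are hypothetical. C has no sum types, so nothing here forces the caller to check has_value before reading value; that compile-time enforcement is exactly what languages with a real Option<T> add.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option type: a value plus a flag saying whether it is set. */
struct opt_size {
    bool has_value;
    size_t value;
};

/* Index of the first occurrence of c in s. "Not found" is an explicit empty
 * option rather than a NULL pointer or a magic index such as (size_t)-1. */
static struct opt_size find_char(const char *s, char c)
{
    struct opt_size result = { false, 0 };
    const char *p;

    assert(s != NULL);
    p = strchr(s, c);
    if (p != NULL) {
        result.has_value = true;
        result.value = (size_t)(p - s);
    }
    return result;
}

/* Unwrap with a fallback, so callers must say what "unset" means to them. */
static size_t opt_size_or(struct opt_size o, size_t fallback)
{
    return o.has_value ? o.value : fallback;
}
```

For example, find_char("kernel", 'r') yields a set option holding index 2, while find_char("kernel", 'z') yields an empty one.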

Historical and Standards Context

  • Several comments stress C’s origins: tiny RAM (tens of KiB), single-pass compilers, PDP-11 addressing limits, and reuse of existing idioms. In that context, NUL-terminated strings and pointer/array decay were seen as clever compromises, not mistakes.
  • Later proposals like fat pointers/slices and safer APIs (e.g., strlcpy) are cited as missed opportunities; committees are criticized for adding complex features (VLAs, _Generic) while leaving the core C string model and stdlib essentially frozen.

strncpy, Kernel APIs, and Safety

  • strncpy is described as widely misused and counterintuitive: it pads the rest of the destination with zeros, does not NUL-terminate when it truncates, and was originally meant for fixed-size, padded fields (e.g., old directory entries), not as a “safe strcpy”.
  • The kernel’s replacement functions (strscpy, strscpy_pad, strtomem_pad, memcpy_and_pad, memcpy) each encode a clearer intent (terminated vs non-terminated, padding vs raw copy). The cost is more APIs; the benefit is that every call site states what it needs, which helps both safety and performance.
  • Some find this proliferation “convoluted”; others argue a single “Swiss army knife” would be slower and blur intent.
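
The difference in intent can be sketched in user-space C. The function below, sketch_strscpy, is a hypothetical stand-in written for illustration and is not the kernel's implementation. It assumes strscpy-like semantics: always NUL-terminate when the buffer is non-empty and report truncation. The kernel function returns -E2BIG on truncation; returning -1 here is an assumption of this sketch. has_terminator is likewise a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a strscpy-style copy. Returns the number of
 * characters copied (excluding the NUL), or -1 if size is 0 or src did not
 * fit. Unlike strncpy, dst is always NUL-terminated when size > 0, and the
 * tail of dst is not zero-padded. */
static long sketch_strscpy(char *dst, const char *src, size_t size)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    if (size == 0)
        return -1;
    /* Copy at most size - 1 characters, leaving room for the terminator. */
    for (i = 0; i < size - 1 && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';
    /* If src has more characters left, the copy was truncated. */
    return src[i] == '\0' ? (long)i : -1;
}

/* Returns 1 if buf[0..size) contains a NUL byte, 0 otherwise. */
static int has_terminator(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}
```

Copying "abcdef" into a 4-byte buffer shows the contrast: strncpy fills the buffer with "abcd" and leaves no terminator, so has_terminator reports 0, while sketch_strscpy stores "abc" plus a NUL and returns -1 to signal truncation. Copying "a" with strncpy into the same buffer zero-fills the remaining three bytes, which is the padding behavior the fixed-size-field use case relied on.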

AI and Automated Refactoring

  • One line of discussion asks whether LLM-based coders could have automated most of the strncpy removal work.
  • Supporters report good experiences using LLMs to refactor many C programs and argue six years is too long for essentially mechanical changes.
  • Skeptics counter that in a kernel-scale project, the main bottlenecks are review, coordination, and testing, not just editing; subtle downstream behavior and reliance on old semantics must be carefully audited.
  • There is visible fatigue with AI being injected into every programming discussion, alongside curiosity about its potential for large-scale refactors.