Why LLMs Can't Write Q/Kdb+: Writing Code Right-to-Left

Right-to-left evaluation and why LLMs struggle

  • Core claim: q/kdb+/APL’s “right-to-left, no-precedence” (RL-NOP) evaluation clashes with LLMs’ left-to-right token generation, which favors building expressions forward from the start (see the short q session after this list).
  • Humans can cope by moving the cursor around and reasoning non-linearly; LLMs effectively have an append-only interface, so they don’t “go back” to fix earlier parts unless explicitly instructed to revise.
  • Some argue the deeper issue is that LLMs don’t know when to switch from intuitive token prediction to deliberate reasoning, even if they “know” the rules when asked.
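
  For concreteness, a minimal q session showing RL-NOP in action; the expressions are standard q, and the comments spell out the evaluation order:

      q)2*3+4           / right to left, no precedence: 2*(3+4) = 14, not (2*3)+4 = 10
      14
      q)10-4-2          / 10-(4-2) = 8, where conventional notation gives 4
      8
      q)neg sum 1 2 3   / unary application chains the same way: neg[sum[1 2 3]]
      -6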

Training data and niche paradigms

  • Many commenters think corpus size is the main factor: LLMs excel at Python/Java but hallucinate APIs and syntax in Rust, F#, Nix, q, etc., simply because there’s less high-quality public code.
  • Array and concatenative idioms (q and APL on the array side, Forth-like languages on the concatenative side) differ radically from the imperative patterns the models have mostly seen, so they tend to generate imperative-flavored “pseudo-q” rather than idiomatic code (see the sketch after this list).
  • Some report that newer large models (e.g. recent Claude versions) now handle custom or exotic languages surprisingly well, suggesting data + scale can overcome nonstandard syntax.
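
  A sketch of the contrast (the example is illustrative, not taken from the thread): both lines below sum the squares of a vector, but the first is idiomatic array style, while the second is the imperative-flavored q that models tend to produce; it runs, but it fights the language:

      q)x:1 2 3 4
      q)sum x*x                                   / idiomatic: whole-array primitives
      30
      q)t:0;i:0;do[count x;t+:x[i]*x[i];i+:1];t   / element-by-element loop, "pseudo-q" flavor
      30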

Directionality, notation, and human usability

  • Discussion compares q/APL to natural RTL languages (Hebrew, Arabic) but notes that Unicode stores text in logical order; RTL is mostly a display-layer concern, unlike q’s semantic RL evaluation.
  • There’s debate over whether RL-NOP is objectively harder to use or just unfamiliar:
    • Critics liken array languages to “one giant regex” that becomes unreadable in long-lived codebases.
    • Fans argue RL-NOP supports “top-down” reading: the leftmost token names the final operation, so you can often stop reading partway through the line (illustrated after this list).
  • Several note that popularity and familiarity, not inherent usability, largely determine which notations win.
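
  The “top-down” claim made concrete (examples are illustrative): scanning left to right, each word names the outermost remaining operation, so a reader who only needs the gist can stop early:

      q)avg 1+til 10                  / "the average of 1 plus the first ten integers"
      5.5
      q)count distinct "mississippi"  / "the count of the distinct characters of ..."
      4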

Alternative architectures and workarounds

  • Multiple commenters suggest diffusion-based text models or encoder–decoder setups might better support non-sequential or bidirectional reasoning.
  • Ideas floated:
    • Train code models to generate/operate on ASTs instead of plain text.
    • Let models output edits or patches, not just linear sequences.
    • Use transpilers (e.g. Python-like “Qython” → q) as an interface layer.
    • Pre/post-process to translate dense notations (APL, RL-NOP) into more verbose, LLM-friendly forms (sketched below).
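
  A sketch of the last two ideas combined; the verbose intermediate form is hypothetical (not an actual Qython feature), but the q line it lowers to is standard qSQL:

      / hypothetical verbose form an LLM might emit more reliably:
      /   result = select_avg(price, from=trade, by=sym)
      / mechanically lowered to idiomatic q:
      q)trade:([]sym:`a`b`a;price:1 2 3f)
      q)select avg price by sym from trade
      sym| price
      ---| -----
      a  | 2
      b  | 2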

Cognitive load and language design

  • Complex or unfamiliar syntaxes (deep parentheses in Lisp, J/APL glyphs, RL-NOP) appear to increase “cognitive load” for LLMs, analogous to human overload with distractions or tricky notation.
  • Some argue future language designers should intentionally make languages LLM-friendly, claiming that what’s hard for LLMs is often hard for most people.
  • Others strongly resist this, seeing it as optimizing for mediocrity and against powerful but niche notations.