2024-05-21

Storing knowledge in a single long plain text file

Perceived Idea & Goals

Proposal: store all tabular/knowledge data in one (or logically one) plain-text file, parsed via an indentation-based “ScrollSet”/Tree Notation grammar.
Data model: “concepts” ≈ records, “measures/measurements” ≈ fields/cells; strongly typed, hierarchical, git-backed, compiled to CSV/TSV/JSON.
Claim: system scales to large datasets and enables fully auditable, schema‑rich knowledge bases using only spaces/newlines plus a small syntax.
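To make the data model above concrete, here is a minimal Python sketch of an indentation-based file being parsed into typed records and compiled to CSV. It does not reproduce the actual ScrollSet/Tree Notation grammar: the line layout (an unindented `id` line opens a concept, indented `measure value` lines fill it), the `SCHEMA` table, and the function names `parse_concepts` and `to_csv` are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical schema: measure name -> Python type used to validate its value.
SCHEMA = {"id": str, "name": str, "appeared": int}

# Hypothetical file content: one concept per unindented "id" line,
# one measure per indented line beneath it.
SAMPLE = """\
id python
 name Python
 appeared 1991
id c
 name C
 appeared 1972
"""


def parse_concepts(text, schema=SCHEMA):
    """Parse indented 'measure value' lines into a list of typed records."""
    records = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            continue  # blank lines carry no data
        indented = raw[:1].isspace()
        measure, _, value = raw.strip().partition(" ")
        if measure not in schema:
            raise ValueError(f"line {lineno}: unknown measure {measure!r}")
        try:
            typed = schema[measure](value)  # type check happens at parse time
        except ValueError as exc:
            raise ValueError(
                f"line {lineno}: bad value {value!r} for {measure!r}"
            ) from exc
        if not indented:
            if measure != "id":
                raise ValueError(f"line {lineno}: a concept must start with 'id'")
            records.append({"id": typed})
        else:
            if not records:
                raise ValueError(f"line {lineno}: measure appears before any concept")
            records[-1][measure] = typed
    return records


def to_csv(records, schema=SCHEMA):
    """Compile parsed records to CSV text, one column per schema measure."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(schema), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(parse_concepts(SAMPLE)), end="")
```

Because every value passes through the schema at parse time, a line such as ` appeared soon` is rejected with its line number rather than silently ending up in the CSV, which is roughly the kind of integrity the proposal attributes to its typed measures.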

Novelty vs Prior Art

Many readers see this as a reinvention of old ideas: plain-text storage, Unix “everything is text,” and semantic/structured data formats.
Specific predecessors cited: GNU Recutils, Plan 9’s ndb, RDF/semantic web, CSV/JSON/YAML/TOML, TiddlyWiki, Wikidata.
Some argue the article underplays prior work and should foreground comparisons more explicitly.
The article was updated to add Recutils and to list its claimed advantages: easier hierarchies, less encoding overhead, first‑class comments, stronger integrity via “parsers.”

Tone, Presentation, and Reception

Title and style are widely read as tongue‑in‑cheek; some interpret it as satire or even “TimeCube‑like” grandiosity.
Others think the core idea is sincere but oversold, and that the ambitious framing distracts from the technical merits.
There’s debate over whether the provocative tone attracts useful critique or just emotional backlash.

Alternative Tools and Related Approaches

Comparisons drawn to: Org mode, Emacs workflows, Obsidian (Markdown graph of files), “one big text file” note systems, Canon Cat, Notion “vault” tables, and conventional databases with views.
Several participants mention that high‑earning bug bounty hunters reportedly use a single large stuff.txt plus grep.

Technical Questions and Critiques

Concerns about namespace and indexing complexity; risk of “namespace hell.”
Questions about write safety, corruption, and reliance on git; suggestions to add hashing/copy‑on‑write ideas.
Confusion over terminology (“parsers,” “concepts,” “measurements”) and how definitions vs. data are distinguished.
Some feel the focus on the indentation trick and minimal syntax harms readability and visual salience compared to formats like Markdown.
Others argue this overcomplicates what databases already solve while losing features like robust querying and integrity constraints.
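The write-safety suggestions above (hashing plus a copy-on-write style of update) can be sketched in Python as follows. This is one possible reading of those suggestions, not something the article or the thread specifies: the function names `sha256_of` and `safe_write` and the `expected_hash` parameter are hypothetical. The new content goes to a temporary file in the same directory and is then renamed over the original, so a crash mid-write leaves the old file intact, and the caller can pass the hash it saw at read time to detect a concurrent edit.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_write(path, new_text, expected_hash=None):
    """Replace `path` with `new_text` without ever exposing a partial file.

    If `expected_hash` is given and the file on disk no longer matches it,
    someone else has changed the file since it was read, so refuse to write.
    Returns the hash of the newly written content.
    """
    if expected_hash is not None and os.path.exists(path):
        if sha256_of(path) != expected_hash:
            raise RuntimeError("file changed since it was read; refusing to overwrite")

    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".txt")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as f:
            f.write(new_text)
            f.flush()
            os.fsync(f.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic swap of old file for new
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return hashlib.sha256(new_text.encode("utf-8")).hexdigest()
```

A typical round trip would keep the hash returned by one `safe_write` call and pass it as `expected_hash` to the next. Git still provides history and audit, while this layer only guards the single working file against torn or conflicting writes.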

Enthusiasm and Potential

A subset of commenters are excited by text‑centric, programmable knowledge bases, especially combined with AI and Emacs‑like environments.
The extreme simplicity of plain text as the base abstraction is praised, even by some skeptics.
