We replaced RAG with a virtual filesystem for our AI documentation assistant
RAG vs. Retrieval More Broadly
- Several commenters stress that “RAG” originally meant any retrieval mechanism, not “vector DB + embeddings”; vector search is just one option.
- People argue the community over-identified RAG with embeddings because semantic vector search became trendy and blogged about heavily.
- Others caution against the new pendulum swing: embeddings are still powerful for some problems; they’re just one tool alongside keyword, SQL, Lucene, etc.
Filesystem / Virtual Filesystem Approach
- Many like the idea of using a human-organized hierarchy (directories, docs, TOCs) as a natural knowledge graph that agents can traverse.
- The Mintlify approach is seen as swapping the interface, not the underlying retrieval: Chroma is still used; the agent “sees” a filesystem illusion instead of a vector API.
- Some note similar ideas: FUSE-based virtual FS, SQLite-backed FS, PageIndex-style hierarchical TOCs, or virtual FS over existing DBs.
- Others point out that plain full-text search (Postgres, Lucene, SQLite) or RAM disks could provide similar benefits without the complexity (a minimal full-text-search sketch follows this list).
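To make the full-text-search point concrete: plain FTS really is a one-dependency affair. A minimal sketch using SQLite's built-in FTS5 extension (the `docs` table name and sample rows are invented for illustration; most Python builds ship with FTS5 compiled in):

```python
import sqlite3

# In-memory database; FTS5 gives tokenization, matching, and BM25 ranking.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE docs USING fts5(path, body)")
con.executemany(
    "INSERT INTO docs (path, body) VALUES (?, ?)",
    [
        ("guides/auth.md", "API keys are passed in the Authorization header."),
        ("guides/rate-limits.md", "Requests are limited to 100 per minute per key."),
    ],
)

# ORDER BY rank sorts best matches first; snippet() returns a highlighted
# excerpt (column index 1 = body, with up to 8 tokens of context).
for path, excerpt in con.execute(
    """SELECT path, snippet(docs, 1, '[', ']', '…', 8)
       FROM docs WHERE docs MATCH ? ORDER BY rank""",
    ("authorization",),
):
    print(path, "->", excerpt)
```

Postgres (`tsvector`/`tsquery`) and Lucene offer the same capability with richer query syntax, which is the commenters' point: ranked keyword retrieval is a solved problem before any embeddings enter the picture.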
Agents, CLI Tools, and Bash Emulation
- Commenters observe that LLM agents are strongly tuned to use POSIX-like tools (`ls`, `grep`, `cat`, `find`), so a fake shell is often easier to build than custom tool APIs (see the fake-shell sketch after this list).
- There's interest in FUSE or NFS mounts, but concerns about performance and infra overhead; read-only docs need only a small subset of POSIX anyway.
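A minimal sketch of that pattern, which also illustrates the "filesystem illusion" idea from the previous section: a read-only dispatcher over an in-memory path-to-content map, exposing just the `ls`/`cat`/`grep` subset an agent tends to emit. The function names and toy corpus are hypothetical, not Mintlify's implementation:

```python
import re
import shlex

# Hypothetical toy corpus: the "virtual filesystem" is just a dict.
FILES = {
    "docs/getting-started.md": "Install the CLI, then run `init`.",
    "docs/auth/api-keys.md": "Keys are scoped per project.",
    "docs/auth/oauth.md": "OAuth tokens expire after one hour.",
}

def fake_shell(command: str) -> str:
    """Handle a small read-only POSIX subset; assumes well-formed commands."""
    argv = shlex.split(command)
    if not argv:
        return ""
    cmd, args = argv[0], argv[1:]
    if cmd == "ls":
        prefix = (args[0].rstrip("/") + "/") if args else ""
        return "\n".join(p for p in sorted(FILES) if p.startswith(prefix))
    if cmd == "cat":
        return FILES.get(args[0], f"cat: {args[0]}: No such file")
    if cmd == "grep":
        # Simplified: an optional second arg acts as a path prefix filter.
        pattern = re.compile(args[0], re.IGNORECASE)
        scope = (args[1].rstrip("/") + "/") if len(args) > 1 else ""
        hits = [f"{p}: {body}" for p, body in sorted(FILES.items())
                if p.startswith(scope) and pattern.search(body)]
        return "\n".join(hits) or "grep: no matches"
    return f"{cmd}: command not found"

print(fake_shell("ls docs/auth"))
print(fake_shell("grep oauth docs"))
```

The appeal commenters cite is that frontier models already know these commands from training data, so no custom tool schema needs to be documented in the prompt.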
Skepticism and Tradeoffs
- Critics see the approach as overengineered: multiple LLM steps (`ls`/`grep` loops) may hurt latency versus a well-designed RAG/database pipeline.
- Some argue the article understates what databases and search engines can already do (hierarchy, boolean queries, BM25, hybrid search), calling "just files + grep" a regression.
- Others note that success may stem less from "no vectors" and more from better chunking, larger logical units (whole files/sections), and preserved structure (a section-level chunking sketch follows this list).
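The "larger logical units" point is easy to demonstrate: instead of fixed-size windows, split Markdown on headings so each retrieved chunk is a self-contained section. A rough sketch; the heading regex and size cap are arbitrary choices, not from the article:

```python
import re

def split_by_sections(markdown: str, max_chars: int = 4000) -> list[str]:
    """Split on H1/H2 headings so each chunk is a whole logical section,
    falling back to paragraph splits only when a section is oversized."""
    sections = re.split(r"(?m)^(?=#{1,2} )", markdown)
    chunks = []
    for section in filter(str.strip, sections):
        if len(section) <= max_chars:
            chunks.append(section.strip())
            continue
        # Oversized section: fall back to paragraph boundaries.
        buf = ""
        for para in section.split("\n\n"):
            if buf and len(buf) + len(para) > max_chars:
                chunks.append(buf.strip())
                buf = ""
            buf += para + "\n\n"
        if buf.strip():
            chunks.append(buf.strip())
    return chunks
```

On this reading, much of the reported win could be reproduced inside a conventional vector pipeline simply by chunking at these boundaries.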
Use Cases and Explanations
- A side discussion explains RAG in plain terms for a fanfiction search use case: embed docs and queries, retrieve nearest neighbors, then feed both question and retrieved texts to the LLM (sketched below).
- Several emphasize that optimal retrieval architecture is domain-dependent; no single pattern (RAG, FS, graph, DB) will dominate all information-access problems.
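For readers landing on that subthread, the classic pipeline fits in a few lines. A sketch using sentence-transformers for embeddings; the model name is just a common default, the chapter snippets are invented, and the final LLM call is stubbed since any chat API works:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small, widely used default

docs = [
    "Chapter 3: the crew repairs the engine after the asteroid strike.",
    "Chapter 7: a flashback reveals the captain's first command.",
    "Chapter 12: the rivals call a truce during the storm.",
]

# 1. Embed the corpus once; normalized vectors make dot product = cosine.
doc_vecs = model.encode(docs, normalize_embeddings=True)

# 2. Embed the query the same way and take the nearest neighbors.
query = "When do the rivals stop fighting?"
query_vec = model.encode([query], normalize_embeddings=True)[0]
top = np.argsort(doc_vecs @ query_vec)[::-1][:2]

# 3. Feed question + retrieved passages to the LLM (stubbed here).
context = "\n".join(docs[i] for i in top)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # pass `prompt` to whatever chat-completion API you use
```

Which is also why the last point stands: everything above the LLM call (vector search here, FTS or a virtual filesystem elsewhere) is an interchangeable retrieval layer, and the right choice depends on the domain.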