We replaced RAG with a virtual filesystem for our AI documentation assistant
RAG vs. Retrieval More Broadly
- Several commenters stress that “RAG” originally meant any retrieval mechanism, not “vector DB + embeddings”; vector search is just one option.
- People argue the community over-identified RAG with embeddings because semantic vector search became trendy and blogged about heavily.
- Others caution against the new pendulum swing: embeddings are still powerful for some problems; they’re just one tool alongside keyword, SQL, Lucene, etc.
Filesystem / Virtual Filesystem Approach
- Many like the idea of using a human-organized hierarchy (directories, docs, TOCs) as a natural knowledge graph that agents can traverse.
- The Mintlify approach is seen as swapping the interface, not the underlying retrieval: Chroma is still used; the agent “sees” a filesystem illusion instead of a vector API.
- Some note similar ideas: FUSE-based virtual FS, SQLite-backed FS, PageIndex-style hierarchical TOCs, or virtual FS over existing DBs.
- Others point out that plain full-text search (Postgres, Lucene, SQLite) or RAM disks could provide similar benefits without the complexity (a minimal full-text-search sketch follows this list).
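To make the full-text-search point concrete: plain FTS really is a one-dependency affair. A minimal sketch using SQLite's built-in FTS5 extension (the `docs` table name and sample rows are invented for illustration; most Python builds ship with FTS5 compiled in):

```python
import sqlite3

# In-memory database; FTS5 gives tokenization, matching, and BM25 ranking.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE docs USING fts5(path, body)")
con.executemany(
    "INSERT INTO docs (path, body) VALUES (?, ?)",
    [
        ("guides/auth.md", "API keys are passed in the Authorization header."),
        ("guides/rate-limits.md", "Requests are limited to 100 per minute per key."),
    ],
)

# ORDER BY rank sorts best matches first; snippet() returns a highlighted
# excerpt (column index 1 = body, with up to 8 tokens of context).
for path, excerpt in con.execute(
    """SELECT path, snippet(docs, 1, '[', ']', '…', 8)
       FROM docs WHERE docs MATCH ? ORDER BY rank""",
    ("authorization",),
):
    print(path, "->", excerpt)
```

Postgres (`tsvector`/`tsquery`) and Lucene offer the same capability with richer query syntax, which is the commenters' point: ranked keyword retrieval is a solved problem before any embeddings enter the picture.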
Agents, CLI Tools, and Bash Emulation
- Commenters observe that LLM agents are strongly tuned to use POSIX-like tools (`ls`, `grep`, `cat`, `find`), so a fake shell is often easier to build than custom tool APIs (see the fake-shell sketch after this list).
- There's interest in FUSE or NFS mounts, but concerns about performance and infra overhead; read-only docs need only a small subset of POSIX anyway.
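A minimal sketch of that pattern, which also illustrates the "filesystem illusion" idea from the previous section: a read-only dispatcher over an in-memory path-to-content map, exposing just the `ls`/`cat`/`grep` subset an agent tends to emit. The function names and toy corpus are hypothetical, not Mintlify's implementation:

```python
import re
import shlex

# Hypothetical toy corpus: the "virtual filesystem" is just a dict.
FILES = {
    "docs/getting-started.md": "Install the CLI, then run `init`.",
    "docs/auth/api-keys.md": "Keys are scoped per project.",
    "docs/auth/oauth.md": "OAuth tokens expire after one hour.",
}

def fake_shell(command: str) -> str:
    """Handle a small read-only POSIX subset; assumes well-formed commands."""
    argv = shlex.split(command)
    if not argv:
        return ""
    cmd, args = argv[0], argv[1:]
    if cmd == "ls":
        prefix = (args[0].rstrip("/") + "/") if args else ""
        return "\n".join(p for p in sorted(FILES) if p.startswith(prefix))
    if cmd == "cat":
        return FILES.get(args[0], f"cat: {args[0]}: No such file")
    if cmd == "grep":
        # Simplified: an optional second arg acts as a path prefix filter.
        pattern = re.compile(args[0], re.IGNORECASE)
        scope = (args[1].rstrip("/") + "/") if len(args) > 1 else ""
        hits = [f"{p}: {body}" for p, body in sorted(FILES.items())
                if p.startswith(scope) and pattern.search(body)]
        return "\n".join(hits) or "grep: no matches"
    return f"{cmd}: command not found"

print(fake_shell("ls docs/auth"))
print(fake_shell("grep oauth docs"))
```

The appeal commenters cite is that frontier models already know these commands from training data, so no custom tool schema needs to be documented in the prompt.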
Skepticism and Tradeoffs
- Critics see the approach as overengineered: multiple LLM steps (`ls`/`grep` loops) may hurt latency versus a well-designed RAG/database pipeline.
- Some argue the article understates what databases and search engines can already do (hierarchy, boolean queries, BM25, hybrid search), calling "just files + grep" a regression.
- Others note that success may stem less from "no vectors" and more from better chunking, larger logical units (whole files/sections), and preserved structure (a section-level chunking sketch follows this list).
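The "larger logical units" point is easy to demonstrate: instead of fixed-size windows, split Markdown on headings so each retrieved chunk is a self-contained section. A rough sketch; the heading regex and size cap are arbitrary choices, not from the article:

```python
import re

def split_by_sections(markdown: str, max_chars: int = 4000) -> list[str]:
    """Split on H1/H2 headings so each chunk is a whole logical section,
    falling back to paragraph splits only when a section is oversized."""
    sections = re.split(r"(?m)^(?=#{1,2} )", markdown)
    chunks = []
    for section in filter(str.strip, sections):
        if len(section) <= max_chars:
            chunks.append(section.strip())
            continue
        # Oversized section: fall back to paragraph boundaries.
        buf = ""
        for para in section.split("\n\n"):
            if buf and len(buf) + len(para) > max_chars:
                chunks.append(buf.strip())
                buf = ""
            buf += para + "\n\n"
        if buf.strip():
            chunks.append(buf.strip())
    return chunks
```

On this reading, much of the reported win could be reproduced inside a conventional vector pipeline simply by chunking at these boundaries.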
Use Cases and Explanations
- A side discussion explains RAG in plain terms for a fanfiction search use case: embed docs and queries, retrieve nearest neighbors, then feed both question and retrieved texts to the LLM (sketched below).
- Several emphasize that optimal retrieval architecture is domain-dependent; no single pattern (RAG, FS, graph, DB) will dominate all information-access problems.
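For readers landing on that subthread, the classic pipeline fits in a few lines. A sketch using sentence-transformers for embeddings; the model name is just a common default, the chapter snippets are invented, and the final LLM call is stubbed since any chat API works:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small, widely used default

docs = [
    "Chapter 3: the crew repairs the engine after the asteroid strike.",
    "Chapter 7: a flashback reveals the captain's first command.",
    "Chapter 12: the rivals call a truce during the storm.",
]

# 1. Embed the corpus once; normalized vectors make dot product = cosine.
doc_vecs = model.encode(docs, normalize_embeddings=True)

# 2. Embed the query the same way and take the nearest neighbors.
query = "When do the rivals stop fighting?"
query_vec = model.encode([query], normalize_embeddings=True)[0]
top = np.argsort(doc_vecs @ query_vec)[::-1][:2]

# 3. Feed question + retrieved passages to the LLM (stubbed here).
context = "\n".join(docs[i] for i in top)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # pass `prompt` to whatever chat-completion API you use
```

Which is also why the last point stands: everything above the LLM call (vector search here, FTS or a virtual filesystem elsewhere) is an interchangeable retrieval layer, and the right choice depends on the domain.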