Embeddings are underrated (2024)
Applications and Use Cases
- Commenters share many concrete uses: semantic “related posts” for blogs, RSS aggregators with arbitrary categories, patent similarity search, literature and arXiv search, legal text retrieval, code search over local repos, and personal knowledge tools (e.g., Recallify).
- Embeddings + classical ML (scikit-learn classifiers, clustering) are reported as practical and often “good enough” compared to fine-tuning large language models, at a fraction of the training cost (see the classifier/clustering sketch after this list).
- For clustering, embeddings make simple algorithms like k-means work much better than they do on older bag-of-words vectors.
- Some are exploring novel UX ideas such as “semantic scrolling” and HNSW-based client-side indexes for semantic browsing (see the HNSW sketch after this list).
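A minimal sketch of the “embeddings + classical ML” pattern described above, assuming the sentence-transformers and scikit-learn packages and using all-MiniLM-L6-v2 as a stand-in model; the texts, labels, and cluster count are made up for illustration.

```python
# Sketch: embeddings as features for classical scikit-learn models.
# Assumes: pip install sentence-transformers scikit-learn
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

model = SentenceTransformer("all-MiniLM-L6-v2")  # small open-source embedding model

# Toy corpus with made-up labels (0 = billing, 1 = bug report).
texts = [
    "I was charged twice for my subscription",
    "Please refund last month's invoice",
    "The app crashes when I open settings",
    "Clicking export throws an error",
]
labels = [0, 0, 1, 1]

# One embedding per text; shape (n_texts, embedding_dim).
X = model.encode(texts)

# Classification: a plain linear model on top of frozen embeddings.
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(model.encode(["my card was billed two times"])))  # expect [0]

# Clustering: k-means tends to work far better on embeddings than on bag-of-words.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)
```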
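And a sketch of the HNSW-index idea from the last bullet. The thread’s examples run HNSW-style indexes client-side in JavaScript (transformers.js); this shows the same approximate nearest-neighbour idea in Python with hnswlib, with illustrative parameters rather than anything recommended in the discussion.

```python
# Sketch: approximate nearest-neighbour search over embeddings with HNSW.
# Assumes: pip install hnswlib numpy
import hnswlib
import numpy as np

dim = 384                      # e.g. MiniLM embedding size
n_docs = 1000
vectors = np.random.rand(n_docs, dim).astype(np.float32)  # stand-in for real embeddings

index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=n_docs, ef_construction=200, M=16)
index.add_items(vectors, np.arange(n_docs))
index.set_ef(50)               # query-time accuracy/speed trade-off

query = np.random.rand(dim).astype(np.float32)
labels, distances = index.knn_query(query, k=5)
print(labels[0], distances[0])  # ids and cosine distances of the 5 nearest docs

index.save_index("docs.hnsw")   # a static file that can be shipped alongside a site
```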
Search, RAG, and Technical Documentation
- Many see semantic search as the most compelling use: matching on meaning rather than exact words, handling synonyms and fuzzy queries like “that feature that runs a function on every column”.
- Hybrid search (keywords + embeddings) is reported as the best approach in production: exact matches still matter, especially for jargon, while embeddings handle conceptual similarity (see the hybrid-scoring sketch after this list).
- For technical docs, embeddings are framed as a tool for:
  - Better in-site search and “more like this” suggestions.
  - Improving “discoveryness” across large doc sets.
  - Supporting work on three “intractable” technical-writing challenges (coverage, consistency, findability), though details are mostly deferred to future posts and patents.
- In RAG, embeddings primarily serve as pointers back to source passages; more granular concept-level citation is discussed, with GraphRAG suggested as promising (see the retrieval sketch below).
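A minimal hybrid-scoring sketch for the “keywords + embeddings” point above, assuming the rank_bm25 and sentence-transformers packages; the 0.5/0.5 weighting and min-max rescaling are arbitrary illustrative choices, not a recipe from the thread.

```python
# Sketch: hybrid search = keyword (BM25) scores blended with embedding cosine similarity.
# Assumes: pip install rank_bm25 sentence-transformers numpy
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

docs = [
    "pandas.DataFrame.apply runs a function on every column or row",
    "How to configure the CI pipeline",
    "Styling tables in the docs theme",
]
query = "that feature that runs a function on every column"

# Keyword side: classic BM25 over whitespace tokens.
bm25 = BM25Okapi([d.lower().split() for d in docs])
kw_scores = bm25.get_scores(query.lower().split())

# Semantic side: cosine similarity between normalized embeddings.
model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = model.encode(docs, normalize_embeddings=True)
q_vec = model.encode(query, normalize_embeddings=True)
sem_scores = doc_vecs @ q_vec

def rescale(x):
    # Min-max rescale so the two score ranges are comparable before blending.
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min() + 1e-9)

hybrid = 0.5 * rescale(kw_scores) + 0.5 * rescale(sem_scores)
for i in np.argsort(-hybrid):
    print(f"{hybrid[i]:.2f}  {docs[i]}")
```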
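And a retrieval sketch for the RAG point: the embedding index returns ids that point back to source passages, and those ids travel into the prompt so an answer can cite its sources. The passage ids and prompt format here are made up; no LLM call is shown.

```python
# Sketch: in RAG, the embedding index mainly returns *pointers* to source passages;
# the passages (with ids) go into the prompt so the answer can cite them.
# Assumes: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

passages = {
    "guide.md#install": "Install the CLI with pip install mytool.",
    "guide.md#auth": "Authenticate by exporting MYTOOL_TOKEN before running.",
    "faq.md#proxy": "Behind a proxy, set HTTPS_PROXY for the CLI.",
}
ids = list(passages)

model = SentenceTransformer("all-MiniLM-L6-v2")
vecs = model.encode([passages[i] for i in ids], normalize_embeddings=True)

def retrieve(question, k=2):
    q = model.encode(question, normalize_embeddings=True)
    top = np.argsort(-(vecs @ q))[:k]
    return [ids[i] for i in top]          # pointers back to the source passages

hits = retrieve("how do I log in to the tool?")
context = "\n".join(f"[{h}] {passages[h]}" for h in hits)
prompt = f"Answer using only the sources below and cite their ids.\n{context}\n\nQ: how do I log in?"
print(prompt)
```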
Technical Nuances and Models
- There is extended discussion on:
  - Directions vs. dimensions in embedding spaces, and how traits (e.g., gender) are encoded as directions rather than single axes.
  - High-dimensional geometry (near-orthogonality, Johnson–Lindenstrauss, UMAP for visualization), illustrated in the first sketch after this list.
  - Limitations of classic word vectors (GloVe/word2vec) versus contextual transformer embeddings, plus the role of tokenization (BPE, casing, punctuation).
  - Whether embeddings are meaningfully analogous to hashes; several argue they are fundamentally different despite both mapping variable-length input to fixed-length output (see the second sketch after this list).
  - Embedding inversion and “semantic algebra” over texts as emerging research topics.
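First sketch, for the geometry bullets: a quick numpy check that random unit vectors become nearly orthogonal as dimension grows (which is why a 768-d space can host far more than 768 roughly independent concept directions), plus scikit-learn’s Johnson–Lindenstrauss bound. Pure math, no embedding model required.

```python
# Sketch: why "directions, not dimensions" scales. Random unit vectors in high
# dimensions are nearly orthogonal, so many more concept directions than axes fit.
# Assumes: pip install numpy scikit-learn
import numpy as np
from sklearn.random_projection import johnson_lindenstrauss_min_dim

rng = np.random.default_rng(0)

for dim in (3, 50, 768):
    v = rng.normal(size=(1000, dim))
    v /= np.linalg.norm(v, axis=1, keepdims=True)       # unit vectors
    cos = v @ v.T
    off_diag = cos[~np.eye(1000, dtype=bool)]            # drop self-similarities
    print(f"dim={dim:4d}  mean |cos| = {np.abs(off_diag).mean():.3f}")
    # |cos| shrinks toward 0 as dim grows: random directions decorrelate.

# Johnson-Lindenstrauss: how many dimensions preserve pairwise distances
# among 10,000 points to within ~10%?
print(johnson_lindenstrauss_min_dim(n_samples=10_000, eps=0.1))
```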
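Second sketch, for the hash analogy: both map variable-length text to fixed-length output, but a cryptographic hash scatters near-identical inputs while an embedding keeps them close. Assumes sentence-transformers for the embedding half; the example strings are made up.

```python
# Sketch of the hash-vs-embedding contrast from the thread.
# Assumes: pip install sentence-transformers
import hashlib
from sentence_transformers import SentenceTransformer

a = "Embeddings are underrated."
b = "Embeddings are underrated!"   # one character changed

print(hashlib.sha256(a.encode()).hexdigest()[:16])
print(hashlib.sha256(b.encode()).hexdigest()[:16])   # completely different digest

model = SentenceTransformer("all-MiniLM-L6-v2")
va, vb = model.encode([a, b], normalize_embeddings=True)
print(float(va @ vb))   # cosine similarity, very close to 1.0
```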
Evaluation, Limits, and Skepticism
- Some readers find the article too introductory and vague, wanting earlier definitions, a clearer thesis, and concrete “killer apps” for tech writers.
- Others note embeddings are long-established in IR and recommender systems, so “underrated” mainly applies relative to LLM hype or within the technical-writing community.
- Several caution that embeddings are “hunchy”: great for similarity and clustering, but not for precise logical queries or structured data conditions.
- There is debate over whether text generation or embeddings will have the bigger long‑term impact on technical writing; many conclude the real power lies in combining both.
Performance, Deployment, and Ethics
- Commenters emphasize that generating an embedding costs roughly one forward pass over the input (comparable to generating a single token), with some extra cost for bidirectional models (see the timing sketch after this list).
- Lightweight open-source models (e.g., MiniLM, BGE, GTE, Nomic) are cited as small, fast, and sometimes outperforming commercial APIs on MTEB.
- Client-side embeddings using ONNX and transformers.js, with static HNSW-like indexes in Parquet queried via DuckDB, are highlighted as near-free, low-latency options (see the DuckDB sketch after this list).
- Ethical concerns focus on training data for embedding models, though many see embeddings as a strongly “augmentative” rather than replacement technology.
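A timing sketch for the “one forward pass” and MiniLM points above, assuming sentence-transformers and the all-MiniLM-L6-v2 model; absolute numbers depend entirely on hardware and batch size.

```python
# Sketch: an embedding is roughly one encoder forward pass per text (batched here).
# Assumes: pip install sentence-transformers
import time
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")    # 384-dimensional embeddings
sentences = ["Semantic search over our docs"] * 256

model.encode(sentences[:8])                        # warm-up
t0 = time.perf_counter()
emb = model.encode(sentences, batch_size=64)
dt = time.perf_counter() - t0

print(emb.shape)                                   # (256, 384)
print(f"{1000 * dt / len(sentences):.2f} ms per sentence on this machine")
```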
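And a sketch of the “embeddings in Parquet, queried via DuckDB” setup, assuming a reasonably recent DuckDB release that ships list_cosine_similarity; this does a brute-force scan rather than an HNSW index, which is usually fine at documentation scale. File name, column names, and texts are illustrative.

```python
# Sketch: store embeddings in a Parquet file and rank by cosine similarity in DuckDB.
# Assumes: pip install duckdb sentence-transformers
import duckdb
from sentence_transformers import SentenceTransformer

docs = ["How to install", "How to authenticate", "Troubleshooting proxies"]
model = SentenceTransformer("all-MiniLM-L6-v2")
vecs = model.encode(docs, normalize_embeddings=True).tolist()

con = duckdb.connect()
con.execute("CREATE TABLE docs(body VARCHAR, emb DOUBLE[])")
con.executemany("INSERT INTO docs VALUES (?, ?)", list(zip(docs, vecs)))
con.execute("COPY docs TO 'docs.parquet' (FORMAT PARQUET)")   # static, shippable index

q = model.encode("how do I log in?", normalize_embeddings=True).tolist()
rows = con.execute(
    """
    SELECT body, list_cosine_similarity(emb, ?) AS score
    FROM 'docs.parquet'
    ORDER BY score DESC
    LIMIT 3
    """,
    [q],
).fetchall()
print(rows)
```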