The Case Against PGVector

pgvector in Production vs. “Nobody Uses This”

  • Multiple commenters report heavy real-world pgvector usage (e.g., thousands of DBs, millions of vectors per DB), contradicting the “nobody runs this in production” framing.
  • Others confirm it works well up to low millions of vectors and modest write rates, but pain appears as data volume and throughput grow (long index builds, RAM pressure, query-planning surprises).
  • Some at very large scale (billions/trillions of vectors) say Postgres became unsuitable and they migrated to dedicated systems.

Index Builds, Memory Use, and Operational Tension

  • HNSW index builds on millions of vectors can consume 10+ GB of RAM and run for hours; people debate whether that’s “a lot” or trivial for a serious DB server.
  • Techniques mentioned: raising maintenance_work_mem, REINDEX CONCURRENTLY, staging tables, replicas, dual indexes (see the sketch after this list) – all workable, but each adds complexity and disk overhead.
  • Critics argue vector workloads (high-velocity inserts + ANN) stress Postgres’s design and force teams to become indexing and tuning experts.
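
A minimal sketch of those knobs, assuming a hypothetical items table with an embedding column; the names, memory sizes, and index parameters are illustrative, not taken from the thread:

```sql
-- Hypothetical table/column names; tune sizes to your own hardware.
SET maintenance_work_mem = '8GB';          -- keep the HNSW graph in memory during the build
SET max_parallel_maintenance_workers = 7;  -- parallel HNSW builds (pgvector >= 0.6)

CREATE INDEX CONCURRENTLY items_embedding_hnsw
    ON items USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);   -- pgvector defaults; raise for better recall

-- Rebuild without blocking reads/writes when the index bloats or degrades
-- (needs roughly the index's size again in spare disk while it runs):
REINDEX INDEX CONCURRENTLY items_embedding_hnsw;
```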

Filtering, Query Planning, and Hybrid Search

  • Pre- vs. post-filtering is a real problem: the ANN index returns a fixed candidate set before the WHERE clause is applied, so a highly selective filter plus LIMIT can return too few results, even when many relevant matches exist slightly further out in vector space.
  • Iterative scans and their parameters (hnsw.ef_search, hnsw.max_scan_tuples, strict vs. relaxed ordering) help – the first sketch after this list shows them – but require understanding the planner and your data distribution.
  • Extensions (e.g., pgvectorscale, IVF-based plugins, label-based filtering) and external systems (AlloyDB ScaNN, Vespa, Milvus, MongoDB vector, Redis Vector Sets) aim to support better filtered/hybrid search and scale.
  • Hybrid search (BM25 + vectors + rerankers, reciprocal rank fusion) is common – the second sketch below shows a minimal RRF query; many see embeddings as a first-stage filter, not the whole solution.
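
A sketch of a filtered ANN query using pgvector 0.8’s iterative scans, assuming a hypothetical documents table with a selective tenant_id filter (all names and values are illustrative):

```sql
SET hnsw.ef_search = 200;                 -- widen the per-scan candidate list
SET hnsw.iterative_scan = relaxed_order;  -- keep scanning until LIMIT is satisfied
SET hnsw.max_scan_tuples = 20000;         -- bound how far an iterative scan may go

SELECT id
FROM documents
WHERE tenant_id = 42                      -- the selective filter that starves post-filtering
ORDER BY embedding <=> '[0.1,0.2,0.3]'    -- <=> is cosine distance; toy 3-dim query vector
LIMIT 10;
```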
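
And a minimal reciprocal-rank-fusion query in the spirit of the hybrid approaches mentioned above, again with hypothetical table/column names and the conventional k = 60 constant:

```sql
WITH semantic AS (
    SELECT id, RANK() OVER (ORDER BY embedding <=> '[0.1,0.2,0.3]') AS rank
    FROM documents
    ORDER BY embedding <=> '[0.1,0.2,0.3]'
    LIMIT 20
),
lexical AS (
    SELECT id, RANK() OVER (ORDER BY ts_rank_cd(to_tsvector('english', content), query) DESC) AS rank
    FROM documents, plainto_tsquery('english', 'postgres index tuning') query
    WHERE to_tsvector('english', content) @@ query
    ORDER BY ts_rank_cd(to_tsvector('english', content), query) DESC
    LIMIT 20
)
SELECT COALESCE(semantic.id, lexical.id) AS id,
       COALESCE(1.0 / (60 + semantic.rank), 0.0) +
       COALESCE(1.0 / (60 + lexical.rank), 0.0) AS rrf_score
FROM semantic
FULL OUTER JOIN lexical ON semantic.id = lexical.id
ORDER BY rrf_score DESC   -- fuse the two rankings; neither score needs calibration
LIMIT 10;
```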

Quantization and Binary Tricks

  • Several teams report strong results using quantization: half-precision storage and binary (1-bit) vectors for indexes, often with >95% recall.
  • Workflow: use binary vectors to cheaply shortlist candidates (e.g., the top 100 by Hamming distance), then compute precise distances on the full-precision vectors – sketched below.
  • This dramatically shrinks index size (~32x, since each 32-bit float collapses to a single bit) and makes pgvector feasible at larger scales; some note it’s surprising how little quality is lost.
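
A sketch of that shortlist-then-rerank pattern with pgvector’s built-in binary_quantize and halfvec; toy 3-dimension vectors keep it runnable, whereas a real deployment would use the embedding’s true width (e.g., bit(768)):

```sql
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));

-- 1-bit expression index over the sign of each dimension:
CREATE INDEX ON items USING hnsw
    ((binary_quantize(embedding)::bit(3)) bit_hamming_ops);

-- Half-precision variant, if 1 bit is too lossy for your data:
CREATE INDEX ON items USING hnsw
    ((embedding::halfvec(3)) halfvec_cosine_ops);

-- Shortlist candidates cheaply via Hamming distance (<~>), then rerank the
-- survivors with exact distances on the full-precision vectors:
SELECT id FROM (
    SELECT id, embedding
    FROM items
    ORDER BY binary_quantize(embedding)::bit(3) <~> binary_quantize('[1,-2,3]')
    LIMIT 100
) shortlist
ORDER BY embedding <=> '[1,-2,3]'
LIMIT 10;
```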

Postgres vs Dedicated Vector DBs

  • Pro-Postgres side: fewer moving parts, unified SQL, easier joins/filters, sovereignty over data, good enough for 95% of use cases (docs, support content, small RAG).
  • Pro–vector-DB side: better handling of continuous updates, large indexes, complex filters, and operational concerns (index rebuilds, sharding, consistency) without custom glue.
  • Some advise a separate Postgres instance just for vectors to isolate workloads; critics say at that point you might as well use a purpose-built vector store.

YAGNI, Architecture, and Hype

  • Strong thread around YAGNI: start with pgvector if you have ~100k vectors and simple needs; migrate later if you hit limits.
  • Others warn that pgvector looks fine at small scale but breaks subtly at larger scale (especially filtered search), so teams underestimate future pain.
  • General skepticism about shallow “hello world” blog posts for pgvector and AI infra; praise for experience-based writeups that expose real constraints.

Do We Even Need Vectors This Much?

  • Some argue vector search won’t “fade away” as LLM context windows grow: attention cost scales with context length (quadratically in the naive case), so indexing remains far cheaper than scanning millions of tokens per query.
  • Others emphasize that traditional lexical search (BM25/Lucene) plus query rewriting, expansion, and reranking often delivers most of the benefit; embeddings help most with cross-language or clearly semantic queries.