The Case Against PGVector
pgvector in Production vs. “Nobody Uses This”
- Multiple commenters report heavy real-world pgvector usage (e.g., thousands of DBs, millions of vectors per DB), contradicting the “nobody runs this in production” framing.
- Others confirm: it works well up to low-millions of vectors and modest write rates, but pain appears as data and throughput grow (index build times, RAM, query planning).
- Some at very large scale (billions/trillions of vectors) say Postgres became unsuitable and they migrated to dedicated systems.
Index Builds, Memory Use, and Operational Tension
- HNSW index builds on millions of vectors can consume 10+ GB of RAM and run for hours; people debate whether that’s “a lot” or trivial for a serious DB server.
- Techniques mentioned: raising `maintenance_work_mem`, `REINDEX CONCURRENTLY`, staging tables, replicas, dual indexes – all workable, but they add complexity and disk overhead (see the sketch after this list).
- Critics argue vector workloads (high-velocity inserts + ANN) stress Postgres’s design and force teams to become indexing and tuning experts.
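A minimal sketch of the build-and-rebuild workflow described above, assuming a hypothetical `items` table with an `embedding vector(1536)` column (all names and sizes illustrative):

```sql
-- Give the HNSW build enough memory to hold the graph; builds that
-- spill past maintenance_work_mem fall back to a much slower path.
SET maintenance_work_mem = '10GB';
-- Parallel build workers (supported since pgvector 0.6).
SET max_parallel_maintenance_workers = 7;

-- CONCURRENTLY avoids blocking writes during an hours-long build,
-- at the cost of a slower build and extra disk usage.
CREATE INDEX CONCURRENTLY items_embedding_idx
    ON items USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- After heavy churn, rebuild without blocking traffic:
REINDEX INDEX CONCURRENTLY items_embedding_idx;
```

The dual-index variant builds the replacement under a new name and drops the old one after a swap, which is exactly where the extra disk overhead comes from.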
Filtering, Query Planning, and Hybrid Search
- Pre- vs. post-filtering is a real problem: highly selective filters plus `LIMIT` can return too few results, even when many relevant matches exist slightly further away in vector space.
- Iterative scans and their parameters (`hnsw.ef_search`, `hnsw.max_scan_tuples`, strict vs. relaxed ordering) help, but require understanding the planner and the data distribution (see the sketches after this list).
- Extensions (e.g., pgvectorscale, IVF-based plugins, label-based filtering) and external systems (AlloyDB ScaNN, Vespa, Milvus, MongoDB vector search, Redis Vector Sets) aim to support better filtered/hybrid search and larger scale.
- Hybrid search (BM25 + vectors + rerankers, reciprocal rank fusion) is common; many see embeddings as a first-stage filter, not the whole solution.
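For the filtered-search problem above, a sketch of the iterative-scan knobs (pgvector 0.8+), assuming a hypothetical `items` table with a selective `tenant_id` filter; the literals are placeholders:

```sql
-- Widen the candidate pool per graph traversal.
SET hnsw.ef_search = 200;
-- Keep scanning the index until enough rows survive the WHERE clause,
-- trading strict distance order for recall under selective filters.
SET hnsw.iterative_scan = relaxed_order;
-- Bound the extra work so a hopeless filter can't scan forever.
SET hnsw.max_scan_tuples = 20000;

SELECT id
FROM items
WHERE tenant_id = 42               -- highly selective filter
ORDER BY embedding <=> $1          -- $1: query embedding
LIMIT 10;
```

Whether `relaxed_order` is acceptable depends on the application: results can come back slightly out of distance order, which is usually fine for RAG-style retrieval.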
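And a sketch of reciprocal rank fusion in plain SQL, assuming a hypothetical `docs` table with a `tsvector` column `fts` and an `embedding` column; `$1` is the query embedding, `$2` the query text, and 60 is the conventional RRF constant:

```sql
WITH semantic AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY dist) AS rank
    FROM (SELECT id, embedding <=> $1 AS dist
          FROM docs ORDER BY dist LIMIT 20) t
),
lexical AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY score DESC) AS rank
    FROM (SELECT id, ts_rank_cd(fts, plainto_tsquery('english', $2)) AS score
          FROM docs
          WHERE fts @@ plainto_tsquery('english', $2)
          ORDER BY score DESC LIMIT 20) t
)
SELECT id,
       COALESCE(1.0 / (60 + s.rank), 0)
     + COALESCE(1.0 / (60 + l.rank), 0) AS rrf_score
FROM semantic s
FULL OUTER JOIN lexical l USING (id)
ORDER BY rrf_score DESC
LIMIT 10;
```

In the “embeddings as first-stage filter” view, the fused top-k would then go to a cross-encoder reranker rather than straight to the user.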
Quantization and Binary Tricks
- Several teams report strong results using quantization: half-precision storage and binary (1-bit) vectors for indexes, often with >95% recall.
- Workflows: use binary vectors to cheaply shortlist candidates (e.g., the top 100), then compute precise distances on the full-precision vectors (sketched after this list).
- This dramatically shrinks index size (e.g., ~32x) and makes pgvector feasible at larger scales; some note it’s surprising how little quality is lost.
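A sketch of that shortlist-then-rerank workflow in pgvector (0.7+), again with a hypothetical `items` table and `embedding vector(1024)` column; `$1` is the full-precision query embedding:

```sql
-- Index only the 1-bit quantized form: ~32x smaller than float32.
CREATE INDEX ON items USING hnsw
    ((binary_quantize(embedding)::bit(1024)) bit_hamming_ops);

-- Half-precision alternative: store float32, index float16.
-- CREATE INDEX ON items USING hnsw
--     ((embedding::halfvec(1024)) halfvec_cosine_ops);

-- Cheap Hamming-distance shortlist, then an exact full-precision rerank.
SELECT id
FROM (
    SELECT id, embedding
    FROM items
    ORDER BY binary_quantize(embedding)::bit(1024) <~> binary_quantize($1)
    LIMIT 100                      -- binary shortlist
) shortlist
ORDER BY embedding <=> $1          -- precise cosine rerank
LIMIT 10;
```

The rerank touches only the 100 shortlisted rows, which is why recall stays high even though the index itself never sees a full-precision vector.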
Postgres vs Dedicated Vector DBs
- Pro-Postgres side: fewer moving parts, unified SQL, easier joins/filters (see the example after this list), data sovereignty, good enough for 95% of use cases (docs, support content, small RAG).
- Pro–vector-DB side: better handling of continuous updates, large indexes, complex filters, and operational concerns (index rebuilds, sharding, consistency) without custom glue.
- Some advise a separate Postgres instance just for vectors to isolate workloads; critics say at that point you might as well use a purpose-built vector store.
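A small illustration of the “unified SQL” point, under an assumed `docs`/`authors` schema (all names hypothetical): relational filters, a join, and vector similarity compose in one statement, with no glue code between systems:

```sql
SELECT d.title, a.name
FROM docs d
JOIN authors a ON a.id = d.author_id
WHERE a.org = 'acme'
  AND d.created_at > now() - interval '30 days'
ORDER BY d.embedding <=> $1        -- $1: query embedding
LIMIT 5;
```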
YAGNI, Architecture, and Hype
- Strong thread around YAGNI: start with pgvector if you have ~100k vectors and simple needs; migrate later if you hit limits.
- Others warn that pgvector looks fine at small scale but breaks subtly at larger scale (especially filtered search), so teams underestimate future pain.
- General skepticism about shallow “hello world” blog posts for pgvector and AI infra; praise for experience-based writeups that expose real constraints.
Do We Even Need Vectors This Much?
- Some argue vector search won’t “fade away” with larger LLM context windows: attention is costly, and indexing remains cheaper than scanning millions of tokens (see the back-of-envelope comparison below).
- Others emphasize that traditional lexical search (BM25/Lucene) plus query rewriting, expansion, and reranking often gets most of the benefit; embeddings help most in cross-language or clearly semantic queries.
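A hedged back-of-envelope version of that cost argument (figures illustrative, not from the thread): generating one token with dense attention over a context of $M$ tokens reads all $M$ cached keys/values in each of $L$ layers, while an HNSW lookup over $N$ indexed chunks touches roughly $\mathrm{ef\_search} \cdot \log_2 N$ vectors:

$$ L \cdot M \;\;\text{(attention reads per generated token)} \quad \text{vs.} \quad \mathrm{ef\_search} \cdot \log_2 N \;\;\text{(vector reads per ANN query)} $$

With $M = 10^6$ and $L = 60$, that is on the order of $6 \times 10^7$ reads for every generated token, versus roughly $100 \cdot 20 = 2000$ for $\mathrm{ef\_search} = 100$ and $N = 10^6$ – the sense in which indexing stays cheaper than rescanning a huge context.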