The Case Against PGVector

pgvector in Production vs. “Nobody Uses This”

  • Multiple commenters report heavy real-world pgvector usage (e.g., thousands of DBs, millions of vectors per DB), contradicting the “nobody runs this in production” framing.
  • Others confirm it works well up to low millions of vectors and modest write rates, but pain appears as data volume and throughput grow (long index builds, RAM pressure, query-planning surprises).
  • Some at very large scale (billions/trillions of vectors) say Postgres became unsuitable and they migrated to dedicated systems.

Index Builds, Memory Use, and Operational Tension

  • HNSW index builds on millions of vectors can consume 10+ GB of RAM and run for hours; people debate whether that’s “a lot” or trivial for a serious DB server.
  • Techniques mentioned: raising maintenance_work_mem, REINDEX CONCURRENTLY, staging tables, replicas, dual indexes (see the sketch after this list) – all workable, but each adds complexity and disk overhead.
  • Critics argue vector workloads (high-velocity inserts + ANN) stress Postgres’s design and force teams to become indexing and tuning experts.
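
A minimal sketch of those knobs, assuming a hypothetical items table with an embedding column; the names, memory sizes, and index parameters are illustrative, not taken from the thread:

```sql
-- Hypothetical table/column names; tune sizes to your own hardware.
SET maintenance_work_mem = '8GB';          -- keep the HNSW graph in memory during the build
SET max_parallel_maintenance_workers = 7;  -- parallel HNSW builds (pgvector >= 0.6)

CREATE INDEX CONCURRENTLY items_embedding_hnsw
    ON items USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);   -- pgvector defaults; raise for better recall

-- Rebuild without blocking reads/writes when the index bloats or degrades
-- (needs roughly the index's size again in spare disk while it runs):
REINDEX INDEX CONCURRENTLY items_embedding_hnsw;
```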

Filtering, Query Planning, and Hybrid Search

  • Pre- vs. post-filtering is a real problem: the ANN index returns a fixed candidate set before the WHERE clause is applied, so a highly selective filter plus LIMIT can return too few results, even when many relevant matches exist slightly further out in vector space.
  • Iterative scans and their parameters (hnsw.ef_search, hnsw.max_scan_tuples, strict vs. relaxed ordering) help – the first sketch after this list shows them – but require understanding the planner and your data distribution.
  • Extensions (e.g., pgvectorscale, IVF-based plugins, label-based filtering) and external systems (AlloyDB ScaNN, Vespa, Milvus, MongoDB vector, Redis Vector Sets) aim to support better filtered/hybrid search and scale.
  • Hybrid search (BM25 + vectors + rerankers, reciprocal rank fusion) is common – the second sketch below shows a minimal RRF query; many see embeddings as a first-stage filter, not the whole solution.
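
A sketch of a filtered ANN query using pgvector 0.8’s iterative scans, assuming a hypothetical documents table with a selective tenant_id filter (all names and values are illustrative):

```sql
SET hnsw.ef_search = 200;                 -- widen the per-scan candidate list
SET hnsw.iterative_scan = relaxed_order;  -- keep scanning until LIMIT is satisfied
SET hnsw.max_scan_tuples = 20000;         -- bound how far an iterative scan may go

SELECT id
FROM documents
WHERE tenant_id = 42                      -- the selective filter that starves post-filtering
ORDER BY embedding <=> '[0.1,0.2,0.3]'    -- <=> is cosine distance; toy 3-dim query vector
LIMIT 10;
```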
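
And a minimal reciprocal-rank-fusion query in the spirit of the hybrid approaches mentioned above, again with hypothetical table/column names and the conventional k = 60 constant:

```sql
WITH semantic AS (
    SELECT id, RANK() OVER (ORDER BY embedding <=> '[0.1,0.2,0.3]') AS rank
    FROM documents
    ORDER BY embedding <=> '[0.1,0.2,0.3]'
    LIMIT 20
),
lexical AS (
    SELECT id, RANK() OVER (ORDER BY ts_rank_cd(to_tsvector('english', content), query) DESC) AS rank
    FROM documents, plainto_tsquery('english', 'postgres index tuning') query
    WHERE to_tsvector('english', content) @@ query
    ORDER BY ts_rank_cd(to_tsvector('english', content), query) DESC
    LIMIT 20
)
SELECT COALESCE(semantic.id, lexical.id) AS id,
       COALESCE(1.0 / (60 + semantic.rank), 0.0) +
       COALESCE(1.0 / (60 + lexical.rank), 0.0) AS rrf_score
FROM semantic
FULL OUTER JOIN lexical ON semantic.id = lexical.id
ORDER BY rrf_score DESC   -- fuse the two rankings; neither score needs calibration
LIMIT 10;
```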

Quantization and Binary Tricks

  • Several teams report strong results using quantization: half-precision storage and binary (1-bit) vectors for indexes, often with >95% recall.
  • Workflow: use binary vectors to cheaply shortlist candidates (e.g., the top 100 by Hamming distance), then compute precise distances on the full-precision vectors – sketched below.
  • This dramatically shrinks index size (~32x, since each 32-bit float collapses to a single bit) and makes pgvector feasible at larger scales; some note it’s surprising how little quality is lost.
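
A sketch of that shortlist-then-rerank pattern with pgvector’s built-in binary_quantize and halfvec; toy 3-dimension vectors keep it runnable, whereas a real deployment would use the embedding’s true width (e.g., bit(768)):

```sql
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));

-- 1-bit expression index over the sign of each dimension:
CREATE INDEX ON items USING hnsw
    ((binary_quantize(embedding)::bit(3)) bit_hamming_ops);

-- Half-precision variant, if 1 bit is too lossy for your data:
CREATE INDEX ON items USING hnsw
    ((embedding::halfvec(3)) halfvec_cosine_ops);

-- Shortlist candidates cheaply via Hamming distance (<~>), then rerank the
-- survivors with exact distances on the full-precision vectors:
SELECT id FROM (
    SELECT id, embedding
    FROM items
    ORDER BY binary_quantize(embedding)::bit(3) <~> binary_quantize('[1,-2,3]')
    LIMIT 100
) shortlist
ORDER BY embedding <=> '[1,-2,3]'
LIMIT 10;
```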

Postgres vs Dedicated Vector DBs

  • Pro-Postgres side: fewer moving parts, unified SQL, easier joins/filters, sovereignty over data, good enough for 95% of use cases (docs, support content, small RAG).
  • Pro–vector-DB side: better handling of continuous updates, large indexes, complex filters, and operational concerns (index rebuilds, sharding, consistency) without custom glue.
  • Some advise a separate Postgres instance just for vectors to isolate workloads; critics say at that point you might as well use a purpose-built vector store.

YAGNI, Architecture, and Hype

  • Strong thread around YAGNI: start with pgvector if you have ~100k vectors and simple needs; migrate later if you hit limits.
  • Others warn that pgvector looks fine at small scale but breaks subtly at larger scale (especially filtered search), so teams underestimate future pain.
  • General skepticism about shallow “hello world” blog posts for pgvector and AI infra; praise for experience-based writeups that expose real constraints.

Do We Even Need Vectors This Much?

  • Some argue vector search won’t “fade away” as LLM context windows grow: attention cost scales with context length (quadratically in the naive case), so indexing remains far cheaper than scanning millions of tokens per query.
  • Others emphasize that traditional lexical search (BM25/Lucene) plus query rewriting, expansion, and reranking often delivers most of the benefit; embeddings help most with cross-language or clearly semantic queries.