What is a Vector Database?
The storage layer that made RAG and AI agents practical.
A vector database is a specialized data store for high-dimensional embedding vectors that supports fast approximate nearest neighbor (ANN) search. It lets you store millions or billions of vectors (typically 384-3072 dimensions each) and retrieve the closest ones to a query vector in milliseconds using indexes like HNSW or IVF. Vector databases are the storage layer underneath RAG, semantic search, and AI agent memory.
Updated Apr 2026
In depth
Traditional databases index scalar values (strings, numbers, dates) for exact or range queries. Vector databases index dense float vectors and answer 'what are the K most similar vectors to this one' queries, ranking by a similarity metric such as cosine similarity, dot product, or Euclidean distance. The math is simple, but doing it fast on billions of 1536-dimensional vectors is hard, which is why this became its own category of database.
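To make the 'K most similar vectors' query concrete, here is a minimal sketch of the exact (brute-force) version using cosine similarity. The function names and the toy 3-dimensional vectors are illustrative assumptions, not any vendor's API; a real system would use optimized numeric libraries rather than pure Python.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def top_k_exact(query, vectors, k):
    """Score every stored vector against the query and keep the k best.

    `vectors` maps an id to its embedding. The work grows linearly with the
    number of stored vectors and with their dimension.
    """
    scored = ((cosine_similarity(query, vec), vec_id) for vec_id, vec in vectors.items())
    return [(vec_id, score) for score, vec_id in heapq.nlargest(k, scored)]


# Toy 3-dimensional "embeddings"; real ones have hundreds to thousands of dimensions.
docs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
    "doc_d": [0.0, 0.0, 1.0],
}
results = top_k_exact([1.0, 0.05, 0.0], docs, k=2)
print([vec_id for vec_id, _ in results])  # ['doc_a', 'doc_b']
```

Every query touches every stored vector, which is exactly the cost that a vector database's index is built to avoid.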
The core technique is approximate nearest neighbor (ANN) indexing. Exact KNN compares the query against every stored vector, so it costs O(n·d) per query for n vectors of d dimensions, which is typically too slow for interactive use past a few hundred thousand vectors. ANN gives up a small amount of recall (a well-tuned index typically still returns 95-99% of the true nearest neighbors) in exchange for an orders-of-magnitude speedup. The dominant algorithms are HNSW (hierarchical navigable small world graphs: fast and accurate, high memory), IVF (an inverted file index over clustered vectors, often combined with product quantization to cut memory at some cost in accuracy), and DiskANN (disk-backed, cheap at massive scale). Most production vector DBs implement HNSW as the default with IVF or DiskANN as options for scale.
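The recall-for-speed trade can be illustrated with a toy IVF-style index: cluster the vectors, then at query time scan only the few clusters nearest the query. This is a simplified sketch under stated assumptions (plain k-means, squared Euclidean distance, no product quantization, random data); the names `build_ivf`, `search_ivf`, `n_lists`, and `nprobe` are chosen for illustration and do not reproduce any particular library's implementation.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, candidates):
    """Index of the candidate closest to point."""
    return min(range(len(candidates)), key=lambda i: sq_dist(point, candidates[i]))


def build_ivf(vectors, n_lists, iters=5, seed=0):
    """Cluster vectors with a few k-means steps, then bucket ids by centroid."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_lists)]
    for _ in range(iters):
        buckets = [[] for _ in range(n_lists)]
        for v in vectors:
            buckets[nearest(v, centroids)].append(v)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    lists = [[] for _ in range(n_lists)]
    for idx, v in enumerate(vectors):
        lists[nearest(v, centroids)].append(idx)
    return centroids, lists


def search_ivf(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe clusters whose centroids are closest to the query.

    Returns (ids of the k best candidates, number of vectors scanned).
    """
    order = sorted(range(len(centroids)), key=lambda i: sq_dist(query, centroids[i]))
    candidates = [idx for i in order[:nprobe] for idx in lists[i]]
    candidates.sort(key=lambda idx: sq_dist(query, vectors[idx]))
    return candidates[:k], len(candidates)


def search_exact(query, vectors, k):
    """Brute-force baseline: rank every vector by distance to the query."""
    return sorted(range(len(vectors)), key=lambda idx: sq_dist(query, vectors[idx]))[:k]


rng = random.Random(42)
data = [[rng.random() for _ in range(8)] for _ in range(2000)]
centroids, lists = build_ivf(data, n_lists=20)
query = [rng.random() for _ in range(8)]

exact = search_exact(query, data, k=10)
approx, scanned = search_ivf(query, data, centroids, lists, k=10, nprobe=4)
recall = len(set(exact) & set(approx)) / len(exact)
print(f"scanned {scanned} of {len(data)} vectors, recall@10 = {recall:.2f}")
```

Raising `nprobe` scans more clusters and pushes recall toward 100%, at the cost of more distance computations; setting it equal to `n_lists` degenerates to the exact search.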
The vendor landscape in 2026 breaks into four categories. (1) Managed pure-play: Pinecone (fast, expensive, easy), Weaviate Cloud, Qdrant Cloud, and Vertex AI Vector Search (formerly Matching Engine). (2) Open-source self-hostable: Weaviate, Qdrant, Milvus, Vespa, and the veteran FAISS library from Meta. (3) Extensions to existing databases: pgvector for PostgreSQL, MongoDB Atlas Vector Search, Redis VSS, Elasticsearch dense_vector. (4) Embedded: Chroma and LanceDB, which run in your process with zero ops. The right choice depends on scale, existing infrastructure, and whether you want managed ops. A small app under 1M vectors can use pgvector or Chroma happily. A production RAG service with 100M+ vectors usually picks Pinecone, Qdrant, or Weaviate for the dedicated tooling.
The quality of a vector database is measured along three axes. Recall: what fraction of true nearest neighbors the ANN index actually returns (you want 95%+). Query latency: typically 5-50ms for millions of vectors on SSD. Throughput: queries per second at your recall target. Index build time and memory footprint also matter at scale. Benchmarks like ANN-Benchmarks publish standardized comparisons, but they don't capture real-world concerns like metadata filtering, hybrid search, or incremental index updates, which is where production systems actually differentiate.
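The recall and latency axes above can be measured with very little code. The sketch below assumes you already have the exact top-K ids from a brute-force pass to use as ground truth, and that `search_fn` is whatever callable runs a query against your index; both helper names are hypothetical.

```python
import statistics
import time


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true nearest neighbors that the ANN result contains."""
    if not exact_ids:
        return 0.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


def measure_latency_ms(search_fn, queries):
    """Run search_fn once per query and return (median, p95) latency in ms."""
    timings = []
    for q in queries:
        start = time.perf_counter()
        search_fn(q)
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    p95 = timings[min(len(timings) - 1, int(0.95 * len(timings)))]
    return statistics.median(timings), p95


# Ground truth says ids 1-5 are the true top 5; the index returned 4 of them.
print(recall_at_k([1, 2, 3, 4, 9], [1, 2, 3, 4, 5]))  # 0.8
```

Averaging `recall_at_k` over a held-out query set and reading latency at that same index configuration gives one comparable point; single-threaded throughput then follows roughly as the number of queries divided by total elapsed time.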
For AI agents and RAG, the vector DB is usually not the interesting part of the stack — the embedding model and the retrieval/reranking logic matter more. But a wrong vector DB choice kills an agent's perceived intelligence by making retrieval slow (users see 3-second delays) or inaccurate (agent answers from the wrong document). Tycoon uses pgvector because agent workloads are under 10M vectors per project and PostgreSQL was already in the stack — adding a dedicated vector service would add ops complexity with no quality gain at that scale.