Vector Database
A vector database is a specialized database designed to store embedding vectors and retrieve the most semantically similar ones at speed. It's now core infrastructure for RAG pipelines, semantic search, recommendation systems, and long-term memory in AI agents.
Why It Matters
Traditional SQL databases are optimized for exact matches and keyword lookups, but LLM-era search runs on semantic similarity. Finding the top-k nearest neighbors among millions of vectors (each 1,000+ dimensions) in a few milliseconds is impractical with a brute-force scan in a general-purpose DB. Vector databases solve this with approximate nearest neighbor (ANN) algorithms, enabling the real-time retrieval that grounds LLM responses.
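To make the scale concrete, here is a back-of-envelope count of the work a brute-force scan does per query. The corpus size and dimensionality are illustrative assumptions, not figures from any particular system:

```python
# Illustrative cost of brute-force nearest-neighbor search:
# one dot product per stored vector, each costing `dims` multiply-adds.
num_vectors = 1_000_000  # assumed corpus size
dims = 1_536             # assumed embedding dimensionality

ops_per_query = num_vectors * dims
print(f"{ops_per_query:,} multiply-adds per query")  # 1,536,000,000
```

At roughly 1.5 billion multiply-adds per query, a linear scan cannot hit millisecond latencies, which is why ANN indexes trade a little recall for orders-of-magnitude less work.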
How It Works
- Embedding generation: An embedding model (e.g., from OpenAI or Cohere) turns text or images into vectors.
- Indexing: ANN algorithms like HNSW, IVF, or PQ structure vectors for fast retrieval.
- Query embedding: The user query is vectorized by the same model.
- Similarity search: Top-k vectors are returned by cosine similarity, inner product, or Euclidean distance.
- Metadata join: Original text and metadata attached to each vector are returned and injected into the LLM's context.
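The steps above can be sketched end to end in a few lines. This is a toy in-memory store: `embed()` is a hash-based stand-in for a real embedding model, and the search is a brute-force scan standing in for an ANN index, so only the flow (embed, index, query, top-k, metadata join) is meant to be representative:

```python
import math

def embed(text: str, dims: int = 8) -> list[float]:
    # Stand-in for a real embedding model: deterministic within one run,
    # unit-normalized so dot product equals cosine similarity.
    vec = [float((hash((text, i)) % 1000) - 500) for i in range(dims)]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

store: list[tuple[list[float], dict]] = []  # (vector, metadata) pairs

def upsert(text: str, meta: dict) -> None:
    # "Indexing" step: store the vector with its original text and metadata.
    store.append((embed(text), {**meta, "text": text}))

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))  # vectors are unit-length

def query(text: str, k: int = 2) -> list[dict]:
    qv = embed(text)  # query embedded by the same model
    scored = sorted(store, key=lambda e: cosine(qv, e[0]), reverse=True)
    return [meta for _, meta in scored[:k]]  # metadata join: return payloads

upsert("HNSW is a graph-based ANN index", {"doc_id": 1})
upsert("PostgreSQL stores relational data", {"doc_id": 2})
upsert("IVF partitions vectors into clusters", {"doc_id": 3})
results = query("graph ANN index", k=2)
```

A real system swaps `embed()` for a model API call and the sorted scan for an ANN index such as HNSW, but the contract stays the same: vectors in, top-k payloads out.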
Leading Solutions
Dedicated vector DBs:
- Pinecone: Managed service, simple API, fast to start
- Weaviate: Open source, strong hybrid (vector + keyword) search
- Qdrant: Rust-based, high performance, low resource use
- Milvus: Distributed vector search at scale
Vector extensions to existing DBs:
- pgvector: PostgreSQL extension — keeps vectors alongside SQL data
- Elasticsearch: Pairs keyword search with vector search
- Redis: In-memory vector search
Common pattern: start with pgvector, move to a dedicated DB once traffic and scale demand it.
Selection Criteria
Data size: Under 1M vectors, pgvector is fine. Beyond 100M vectors, a dedicated DB is essential.
Hybrid search: If you need vector + keyword + filter together, Weaviate or Elasticsearch shine.
Latency: User-facing apps target p99 ≤ 100ms. HNSW indexes are the most reliable in this range.
Operational burden: Managed services like Pinecone cost more but remove nearly all infra work; self-hosted open source (Qdrant, Milvus) trades that cost for the operational load.
Metadata filtering: If pre-filtering by category, date, or user ID is frequent, pick a solution that combines it efficiently with vector search.
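The pre-filtering idea can be sketched in a few lines: restrict candidates by metadata first, then rank only the survivors by similarity. The records, field names (`category`, `user_id`), and 2-D vectors here are made up for illustration:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

records = [
    {"vec": [0.9, 0.1], "category": "faq",  "user_id": 1},
    {"vec": [0.8, 0.2], "category": "blog", "user_id": 2},
    {"vec": [0.1, 0.9], "category": "faq",  "user_id": 1},
]

def filtered_search(query_vec: list[float], k: int, **filters) -> list[dict]:
    # Pre-filter: drop records that fail any metadata condition,
    # then rank only the remaining candidates by cosine similarity.
    candidates = [r for r in records
                  if all(r.get(key) == val for key, val in filters.items())]
    candidates.sort(key=lambda r: cosine(query_vec, r["vec"]), reverse=True)
    return candidates[:k]

top = filtered_search([1.0, 0.0], k=1, category="faq")
```

Dedicated vector DBs do this inside the index rather than by scanning, which is exactly why heavy filtering workloads favor a solution that integrates filters with vector search instead of filtering after the fact.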
GEO Implications
Blog operators rarely build their own vector DBs, but writing content so it lands well inside LLM and AI search vector databases matters. Clear headings, self-contained paragraphs, and concrete numbers and sources raise embedding quality — which means higher similarity scores when vector DBs choose what to cite.