Vector Database

Simple Definition

A vector database stores data as embeddings — lists of numbers that represent meaning — and allows you to quickly search for the items most similar to a query.

Unlike a traditional database that searches by exact keywords, a vector database searches by semantic meaning. Ask for “documents about vehicle safety” and it can return content about cars, trucks, and road regulations, even if those documents never use the words in your query.
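To make "most similar" concrete, here is a minimal sketch of one common similarity measure, cosine similarity, applied to a few hand-made vectors. The three-dimensional vectors and their values are invented for illustration; real embeddings come from an embedding model and usually have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made 3-dimensional "embeddings" (illustrative values, not model output).
embeddings = {
    "car crash test ratings": [0.9, 0.8, 0.1],
    "truck braking regulations": [0.8, 0.9, 0.2],
    "chocolate cake recipe": [0.1, 0.0, 0.9],
}

# Stands in for the embedding of the query "documents about vehicle safety".
query = [0.85, 0.85, 0.1]

# Rank stored texts from most to least similar to the query.
ranked = sorted(
    embeddings,
    key=lambda text: cosine_similarity(query, embeddings[text]),
    reverse=True,
)
for text in ranked:
    print(f"{cosine_similarity(query, embeddings[text]):.3f}  {text}")
```

The two vehicle-related entries rank above the recipe because their vectors point in nearly the same direction as the query vector, even though none of them shares a word with it.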

Why You Need a Specialized Database

Traditional databases (relational SQL systems, document stores such as MongoDB) are built around exact-match and range lookups on structured fields. Their standard indexes are not designed for high-dimensional number arrays or for similarity comparisons across millions of entries.

Vector databases are optimized specifically for:

  1. Storing millions of high-dimensional vectors efficiently
  2. Finding the nearest neighbors (most similar vectors) in milliseconds, usually with approximate nearest neighbor (ANN) indexes that trade a little accuracy for speed
  3. Handling the scale of modern AI applications
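For comparison, the sketch below shows the naive baseline: an exact nearest-neighbor search that compares the query against every stored vector. The dataset is random and the function names are made up for this example. The cost grows with the number of vectors times their dimension, which is the work that a vector database's index (for example, a graph-based index such as HNSW) is designed to avoid.

```python
import heapq
import math
import random

def euclidean(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbors(query, vectors, k=3):
    """Exact k-nearest-neighbor search by scanning every stored vector.

    Returns the indices of the k closest vectors, closest first.
    """
    return heapq.nsmallest(
        k, range(len(vectors)), key=lambda i: euclidean(query, vectors[i])
    )

random.seed(0)
DIM = 8  # real embeddings are typically far larger
vectors = [[random.random() for _ in range(DIM)] for _ in range(1000)]

# Query with a vector that is already stored: it should be its own nearest neighbor.
query = vectors[42]
top = nearest_neighbors(query, vectors, k=3)
print(top)
```

A full scan like this is fine for a few thousand vectors, but it becomes the bottleneck at the scale described above.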

Common Vector Databases

  • Pinecone — managed cloud vector database
  • Weaviate — open-source, built-in embedding support
  • Chroma — lightweight, often used for local development
  • pgvector — vector extension for PostgreSQL
  • Qdrant — open-source, high performance

Where Vector Databases Are Used

  • RAG systems — store document embeddings, retrieve relevant context for AI responses
  • Semantic search — search by meaning, not just keywords
  • Recommendation engines — find content similar to what a user liked
  • Image search — find visually similar images

Related Terms

  • Embedding — the numerical representations stored in vector databases
  • RAG — retrieval-augmented generation uses vector databases for context retrieval
  • LLM — often combined with vector databases to build AI applications
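The RAG pattern mentioned above can be sketched in a few lines. Everything here is a toy stand-in: `toy_embed` is a hypothetical word-count function rather than a real embedding model (so it matches shared words, not meaning), and `TinyVectorStore` is an in-memory illustration, not the API of any product listed in this entry.

```python
import math
from collections import Counter

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: word counts, not real semantics."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

class TinyVectorStore:
    """In-memory illustration of the store-and-retrieve role a vector database plays."""

    def __init__(self):
        self.items = []  # list of (embedding, original text) pairs

    def add(self, text):
        self.items.append((toy_embed(text), text))

    def query(self, text, k=2):
        q = toy_embed(text)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]

store = TinyVectorStore()
store.add("Seat belts reduce injury risk in car crashes.")
store.add("Trucks must pass annual brake inspections.")
store.add("Sourdough bread needs a long fermentation.")

# Retrieval step: fetch the most relevant stored text for the question.
question = "How do seat belts help in car crashes?"
context = store.query(question, k=1)

# Augmentation step: place the retrieved text in the prompt sent to an LLM.
prompt = (
    "Answer using only this context:\n"
    + "\n".join(context)
    + "\n\nQuestion: "
    + question
)
print(prompt)
```

In a real application the embedding function would be a trained model and the store would be one of the databases listed earlier, but the flow is the same: embed, retrieve, then pass the retrieved text to the LLM.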
