Knowledge Base

📝 Context Summary

Vector databases store high-dimensional embeddings to enable semantic search. They are a core component of the SIE's Knowledge Pipeline, utilizing indexing strategies like HNSW and IVF, and supporting hybrid search to power Retrieval-Augmented Generation (RAG). Advanced implementations use pgvector for secure, multi-tenant data isolation.

Vector Databases

Vector databases are specialized systems designed to efficiently store, manage, and search high-dimensional vectors, such as those generated by machine learning models (embeddings). They are a foundational component for applications requiring semantic understanding, like Retrieval-Augmented Generation (RAG), recommendation engines, and semantic search.

Why Traditional Databases Fail

Traditional databases use indices like B-trees, which are optimized for exact matches and range queries on low-dimensional data (e.g., userID = 123 or price < 50). They are ineffective for similarity searches on high-dimensional embeddings because:
– They cannot efficiently index the geometric relationships between vectors.
– A brute-force search (calculating the distance between a query vector and every vector in the database) is computationally expensive and does not scale. For a million 1536-dimensional vectors, a single query can require over a billion floating-point operations.
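To make the brute-force baseline concrete, here is a minimal NumPy sketch on synthetic data (scaled down to 10,000 vectors so it runs instantly; the cost per query is O(n·d) multiply-adds, which is where the billion-FLOP figure for a million 1536-dimensional vectors comes from):

```python
import numpy as np

# Toy corpus: 10,000 random vectors of dimension 1536 (synthetic stand-in data).
rng = np.random.default_rng(42)
corpus = rng.standard_normal((10_000, 1536)).astype(np.float32)
query = rng.standard_normal(1536).astype(np.float32)

# Exact (brute-force) search: L2 distance to EVERY vector, then take the top k.
# For 1M vectors this same line would cost on the order of 1.5 billion FLOPs.
dists = np.linalg.norm(corpus - query, axis=1)
k = 5
top_k = np.argsort(dists)[:k]  # indices of the k exact nearest vectors
```

This is the exact search that ANN indexes approximate: trivially correct, but every query touches every vector.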

Vector databases trade perfect accuracy for massive speed improvements by using Approximate Nearest Neighbor (ANN) search algorithms. The key insight is that finding neighbors that are almost the nearest is nearly as useful as finding the absolute nearest, and can be thousands of times faster.

Key ANN Indexing Algorithms

  1. HNSW (Hierarchical Navigable Small World):

    • Builds a multi-layered graph of vectors. Search starts at a sparse top layer and navigates down to denser layers, greedily moving toward the query vector.
    • Pros: Extremely fast queries with high recall.
    • Cons: Requires significant RAM as the entire graph is held in memory.
  2. IVF (Inverted File Index):

    • Partitions the vector space into clusters using an algorithm like K-means.
    • During a search, it first identifies the most relevant clusters (nprobe parameter) and then only searches within them.
    • Pros: Lower memory usage than HNSW, suitable for datasets larger than RAM.
    • Cons: Can have lower recall at the same speed compared to HNSW.
  3. PQ (Product Quantization):

    • A compression technique that splits vectors into sub-vectors and quantizes them. This dramatically reduces the memory footprint (e.g., 700x+ compression) and speeds up distance calculations.
    • Pros: Massive memory savings and faster scans.
    • Cons: Loss of accuracy due to compression. Often used in combination with IVF (IVF-PQ) to create a powerful hybrid index.
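The IVF idea above can be sketched in pure NumPy on synthetic data. This is an illustrative toy, not a production index: it runs a few rounds of Lloyd's k-means to partition the space, builds inverted lists, and then probes only the nprobe closest clusters at query time.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, n_clusters = 64, 5_000, 32
corpus = rng.standard_normal((n, d)).astype(np.float32)

# --- Build: partition the space with a few rounds of Lloyd's k-means ---
centroids = corpus[rng.choice(n, n_clusters, replace=False)].copy()
for _ in range(10):
    assign = np.argmin(((corpus[:, None, :] - centroids[None, :, :]) ** 2).sum(-1), axis=1)
    for c in range(n_clusters):
        members = corpus[assign == c]
        if len(members):
            centroids[c] = members.mean(axis=0)
assign = np.argmin(((corpus[:, None, :] - centroids[None, :, :]) ** 2).sum(-1), axis=1)

# Inverted lists: cluster id -> indices of the vectors assigned to that cluster.
inverted = {c: np.where(assign == c)[0] for c in range(n_clusters)}

# --- Search: score centroids first, then scan only the nprobe closest clusters ---
def ivf_search(query, k=5, nprobe=4):
    order = np.argsort(((centroids - query) ** 2).sum(-1))[:nprobe]
    candidates = np.concatenate([inverted[c] for c in order if len(inverted[c])])
    dists = np.linalg.norm(corpus[candidates] - query, axis=1)
    best = np.argsort(dists)[:k]
    return candidates[best], dists[best]

query = rng.standard_normal(d).astype(np.float32)
ids, dists = ivf_search(query, k=5, nprobe=4)
```

Raising nprobe widens the scan: at nprobe equal to the number of clusters the search degenerates back to exact brute force, which is exactly the recall/latency dial discussed below.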
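Product Quantization can also be sketched briefly. The toy encoder below splits each 128-dimensional vector into 8 sub-vectors and replaces each with a one-byte codebook index (for brevity it samples codebook centroids from the data rather than training full k-means). With these sizes that is a 64x reduction; at 1536 dimensions the same scheme yields the 700x+ figure mentioned above.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, m, ksub = 128, 2_000, 8, 256   # 8 sub-vectors of 16 dims, 256 codes per sub-space
corpus = rng.standard_normal((n, d)).astype(np.float32)
sub_d = d // m

# One small codebook per sub-space (centroids sampled from the data for brevity;
# a real PQ trainer would run k-means in each sub-space).
codebooks = np.stack([
    corpus[rng.choice(n, ksub, replace=False), i * sub_d:(i + 1) * sub_d]
    for i in range(m)
])  # shape (m, ksub, sub_d)

# Encode: each vector becomes m one-byte codes instead of d float32 values.
codes = np.empty((n, m), dtype=np.uint8)
for i in range(m):
    sub = corpus[:, i * sub_d:(i + 1) * sub_d]
    sq = ((sub[:, None, :] - codebooks[i][None, :, :]) ** 2).sum(-1)
    codes[:, i] = np.argmin(sq, axis=1)

print(corpus.nbytes / codes.nbytes)  # → 64.0 (512 bytes per vector down to 8)
```

Distance computations against the codes then reduce to table lookups, which is why PQ also speeds up scans, at the cost of the quantization error noted above.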

The Recall vs. Latency Trade-off

A critical concept in production is tuning the balance between recall and latency:
  • Recall: The percentage of true nearest neighbors returned by the search.
  • Latency: The time it takes to complete a query.

Increasing the search scope (e.g., ef_search in HNSW, nprobe in IVF) improves recall but increases latency. For most applications, a recall of 90-95% is the sweet spot, as pushing for 99% can triple query time with negligible improvement in user-facing quality.
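Recall is easy to measure offline by comparing an approximate result set against an exact brute-force one. The sketch below uses synthetic data and fakes an approximate search by scanning a random half of the corpus; real benchmarks would instead vary ef_search or nprobe, but the recall@k metric is computed the same way.

```python
import numpy as np

rng = np.random.default_rng(7)
n, d, k = 20_000, 64, 10
corpus = rng.standard_normal((n, d)).astype(np.float32)
query = rng.standard_normal(d).astype(np.float32)

dists = np.linalg.norm(corpus - query, axis=1)
true_top = set(np.argsort(dists)[:k])            # ground truth: exact top-k

# Crude stand-in for an ANN index: scan only a random 50% sample of the corpus.
sample = rng.choice(n, n // 2, replace=False)
approx_top = set(sample[np.argsort(dists[sample])[:k]])

recall_at_k = len(true_top & approx_top) / k     # fraction of true neighbors recovered
print(f"recall@{k} = {recall_at_k:.2f}")
```

Plotting recall@k against query latency as you sweep the search scope is the standard way to pick the 90-95% operating point mentioned above.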

When Do You Need a Vector Database?

  • YES: When you have millions of vectors and require low-latency search for production systems (RAG, semantic search).
  • NO: For fewer than 100k vectors, a brute-force search with libraries like NumPy or FAISS is often fast enough and avoids operational overhead.

If you do need one, the options fall into three groups:

  • Managed Services: Pinecone
  • Open-Source (Self-Hosted): Weaviate, Chroma, Qdrant, Milvus
  • Extensions to Existing DBs:
    • Postgres: pgvector extension.
    • Elasticsearch/OpenSearch: Native support via HNSW indices.

Actionable Next Steps

  1. Create the New Note: Save the content above as a new note, kb/ai/Vector Databases.md.
  2. Update Agent-Related Notes: In your notes under kb/ai/2_agents, add a link to this new, canonical note in any document that discusses agent memory, long-term memory, or RAG.

For example, in a note about RAG architecture, you could write:

“The retrieval component of a RAG system relies on a vector database to perform efficient similarity searches over the embedded document chunks.”

Key Concepts: Vector Embeddings, Semantic Search, Hybrid Search, HNSW Indexing, pgvector

About the Author: Adam Bernard

Adam Bernard is a digital marketing strategist and SEO specialist building AI-powered business intelligence systems. He's the creator of the Strategic Intelligence Engine (SIE), a multi-agent framework that transforms business knowledge into autonomous, AI-driven competitive advantages.

Let’s Connect

Ready to Build Your Own Intelligence Engine?

If you’re ready to move from theory to implementation and build a Knowledge Core for your own business, I can help you design the engine to power it. Let’s discuss how these principles can be applied to your unique challenges and goals.