A look at vector search
Why is vector search needed? #
With the rise of LLMs, retrieval-augmented generation (RAG) has become a common application. RAG takes a question, gathers relevant context, and uses an LLM to answer the question from that context.
For example:
Question: What is the longest river in the world?
Context:
- The Amazon river is 3976 miles long.
- The Yangtze river is 3917 miles long.
- The Nile river is 4130 miles long.
- …(possibly more information)
Answer: The Nile river is the longest in the world.
That second step, gathering context, requires a few things to happen:
- Even before the question is asked, knowledge (such as a set of documents) is converted to embeddings¹. These embeddings are represented as vectors and stored in a vector database.
- At query time, the question is converted to a vector, and the pieces of knowledge most relevant to that question are fetched.
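Here's a rough sketch of that two-step flow. The `embed()` function and the plain Python list are illustrative stand-ins (my own assumptions, not any particular library's API) for a real embedding model and a real vector database:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder for a real embedding model; returns a fixed-length
    # pseudo-random vector per text so the pipeline runs end to end.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

# Ahead of time: convert knowledge to vectors and store them.
documents = [
    "The Amazon river is 3976 miles long.",
    "The Yangtze river is 3917 miles long.",
    "The Nile river is 4130 miles long.",
]
index = [(doc, embed(doc)) for doc in documents]  # stand-in for a vector database

# At query time: embed the question and fetch the closest knowledge.
query = embed("What is the longest river in the world?")
ranked = sorted(index, key=lambda item: np.linalg.norm(item[1] - query))
print(ranked[0][0])  # the document whose vector is nearest to the query
```

With placeholder random vectors the retrieved document is arbitrary, of course; a real embedding model is what makes nearby vectors semantically relevant.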
What does vector search answer? #
The fundamental question is:
Given a query vector, which vectors in the database are most similar?
But what does similarity mean? And how do we efficiently locate the most similar vectors?
A quick note on vector similarity #
Two vectors should be similar if the information they represent is similar. There are a few ways to quantify this:
- Cosine similarity
- Euclidean distance
- Dot product similarity
Each of these accounts for magnitude and direction differently, but all of them yield a single number (a similarity score) that's useful for vector search.
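For concreteness, here's what the three measures look like in NumPy (a minimal sketch with made-up vectors; real systems typically use optimized library implementations):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 5.0])

# Dot product: grows with both alignment and magnitude.
dot = np.dot(a, b)  # 25.0

# Cosine similarity: direction only; magnitudes are normalized away.
cosine = dot / (np.linalg.norm(a) * np.linalg.norm(b))  # ~0.996

# Euclidean distance: a distance, so smaller means more similar.
euclidean = np.linalg.norm(a - b)  # 3.0

print(dot, cosine, euclidean)
```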
Here's a great overview of how to calculate these.
Vector search approaches #
Although there are several vector search algorithms, in practice only a few come up:
- k-nearest neighbors (k-NN), sketched after this list
- locality-sensitive hashing (LSH)
- hierarchical navigable small worlds (HNSW)
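Exact k-NN is simple enough to sketch as a baseline: a brute-force scan that scores the query against every stored vector, here using cosine similarity. This is my own minimal illustration, not a particular library's implementation:

```python
import numpy as np

def knn(query: np.ndarray, vectors: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k vectors most similar to `query` (cosine)."""
    # Normalize rows so that a dot product equals cosine similarity.
    vectors_norm = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    query_norm = query / np.linalg.norm(query)
    scores = vectors_norm @ query_norm  # one similarity score per stored vector
    return np.argsort(scores)[::-1][:k]  # highest-scoring indices first

rng = np.random.default_rng(0)
db = rng.standard_normal((1000, 64))  # 1000 stored vectors, 64 dimensions
q = rng.standard_normal(64)
print(knn(q, db, k=3))
```

The catch is cost: this scan is exact but linear in the number of stored vectors per query, which is exactly what approximate methods like LSH and HNSW are designed to avoid.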