
A look at vector search


Why is vector search needed? #

With the rise of LLMs, retrieval-augmented generation (RAG) has become a common application. RAG takes a question, gathers relevant context, and uses the LLM to answer the question.

For example,

Question: What is the longest river in the world?

Context:

  • The Amazon river is 3976 miles long.
  • The Yangtze river is 3917 miles long.
  • The Nile river is 4130 miles long.
  • …(possibly more information)

Answer: The Nile river is the longest in the world.
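
To make that flow concrete, here is a minimal sketch of the final step, assembling the retrieved context into a prompt. `llm_complete` is a hypothetical stand-in for whatever LLM API is in use.

```python
def answer(question: str, context: list[str]) -> str:
    # Build a prompt that grounds the model in the retrieved context.
    bullets = "\n".join(f"- {c}" for c in context)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{bullets}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return llm_complete(prompt)  # hypothetical LLM call
```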

That second step, gathering context, requires a few things to happen.

Even before the question is asked, knowledge (such as a set of documents) is converted to embeddings. These embeddings are represented as vectors and stored in a vector database.
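
Here’s a minimal sketch of that indexing step, assuming the sentence-transformers library and a particular model name (both assumptions; any embedding model works the same way):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "The Amazon river is 3976 miles long.",
    "The Yangtze river is 3917 miles long.",
    "The Nile river is 4130 miles long.",
]

# Each document becomes one embedding vector. A real system would
# store these in a vector database; a plain NumPy array stands in
# for one here.
index = model.encode(documents)  # shape: (len(documents), dim)
```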

When a question comes in, it is converted to a vector, and the pieces of knowledge most relevant to that question are fetched.
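
Continuing the sketch above, the query step might look like this (dot product is used as the similarity score here; the measures below cover the options):

```python
query = model.encode(["What is the longest river in the world?"])[0]

scores = index @ query                   # one score per stored vector
top_k = np.argsort(scores)[::-1][:2]     # indices of the two best matches
context = [documents[i] for i in top_k]  # passed to the LLM as context
```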

What does vector search answer? #

The fundamental question is:

Given a query vector, which vectors in the database are most similar?

But what does similarity mean? And how are the most similar vectors found?

A quick note on vector similarity #

Two vectors should be similar if the information they represent is similar. There are a few ways to quantify this:

  • Cosine similarity
  • Euclidean distance
  • Dot product similarity

Each of those weighs magnitude and direction differently, but all of them yield a number (a similarity score) that’s useful for vector search.

Here’s a great overview of how to calculate these.
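
As a quick illustration, here’s each measure computed with NumPy for two small vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 3.0, 4.0])

dot = a @ b                                             # dot product: magnitude and direction
cosine = dot / (np.linalg.norm(a) * np.linalg.norm(b))  # direction only
euclidean = np.linalg.norm(a - b)                       # distance: smaller = more similar
```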

Vector search approaches #

Although there are several vector search algorithms, in practice only a few come up (a brute-force k-NN sketch follows the list):

  • k-nearest neighbors (k-NN)
  • locality-sensitive hashing (LSH)
  • hierarchical navigable small worlds (HNSW)
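
LSH and HNSW are approximate methods and beyond a short sketch, but brute-force k-NN is simple enough to write out. It is the exact baseline the approximate methods trade accuracy against:

```python
import numpy as np

def knn(index: np.ndarray, query: np.ndarray, k: int = 5) -> np.ndarray:
    """Indices of the k vectors in `index` closest to `query`,
    by Euclidean distance (smaller distance = more similar)."""
    distances = np.linalg.norm(index - query, axis=1)
    return np.argsort(distances)[:k]
```

This scans every stored vector, so each query costs O(n·d) for n vectors of dimension d; LSH and HNSW exist precisely to avoid that full scan.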