Vector Index

A data structure that organises stored vectors geometrically to enable fast approximate or exact similarity search without scanning every entry.

A vector index is the data structure that organises stored vectors so that similar ones can be found quickly, without comparing the query against every vector in the database. It is what turns vector search from an exhaustive scan, whose cost grows linearly with the size of the collection, into a fast operation that scales to billions of items.
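To make the baseline concrete, here is a minimal sketch of the exhaustive scan that an index is meant to avoid. The function names `squared_l2` and `flat_search` are hypothetical, and squared Euclidean distance is an assumed metric; real systems may use cosine or inner-product similarity instead.

```python
import heapq

def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def flat_search(vectors, query, k):
    """Exact k-nearest-neighbour search by scanning every stored vector.

    Returns the ids (list positions) of the k closest vectors, nearest first.
    Cost is one distance computation per stored vector.
    """
    scored = ((squared_l2(v, query), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nsmallest(k, scored)]

vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
print(flat_search(vectors, [1.0, 1.0], k=2))  # [1, 3]
```

Every query here computes a distance to every stored vector, so the work per query grows in direct proportion to the collection size.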

Different index types organise vectors in different ways, each with its own trade-offs. Graph-based indexes like HNSW connect each vector to its near neighbours and answer a query by navigating the resulting graph; cluster-based indexes like IVF group vectors around centroids and search only the clusters nearest the query; flat indexes impose no structure at all and scan every vector, giving exact results at the cost of speed. The choice affects query speed, memory use, build time, and recall.
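The cluster-based idea can be sketched in a few dozen lines. The following is an illustrative toy, not how any particular library implements IVF: the names `build_ivf`, `ivf_search`, `n_lists`, and `nprobe` are assumptions chosen for the example, the clustering is a plain k-means with a fixed number of rounds, and the distance metric is assumed to be squared Euclidean.

```python
import heapq
import random

def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, v, n=1):
    """Ids of the n centroids closest to v, nearest first."""
    return heapq.nsmallest(n, range(len(centroids)),
                           key=lambda c: squared_l2(centroids[c], v))

def assign(vectors, centroids):
    """Bucket every vector id under its nearest centroid."""
    buckets = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        buckets[nearest(centroids, v)[0]].append(i)
    return buckets

def build_ivf(vectors, n_lists, n_iters=10, seed=0):
    """Run a few rounds of k-means, then return (centroids, buckets)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_lists)]
    dim = len(vectors[0])
    for _ in range(n_iters):
        for c, ids in enumerate(assign(vectors, centroids)):
            if ids:  # keep the previous centroid if a cluster went empty
                centroids[c] = [sum(vectors[i][d] for i in ids) / len(ids)
                                for d in range(dim)]
    return centroids, assign(vectors, centroids)

def ivf_search(index, vectors, query, k, nprobe=1):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    centroids, buckets = index
    candidates = [i for c in nearest(centroids, query, nprobe)
                  for i in buckets[c]]
    return heapq.nsmallest(k, candidates,
                           key=lambda i: squared_l2(vectors[i], query))

rng = random.Random(42)
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
index = build_ivf(vectors, n_lists=10)
query = [rng.random() for _ in range(8)]
print(ivf_search(index, vectors, query, k=5, nprobe=2))
```

With `nprobe=2` the search examines roughly a fifth of the stored vectors in this setup. A true neighbour that falls in an unprobed cluster is missed, which is where the approximation comes from; setting `nprobe` equal to `n_lists` degenerates to a full scan and returns exact results.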

Choosing and tuning the vector index is one of the most consequential decisions in deploying vector search. The right index depends on the dataset size, latency requirements, memory budget, and how often data changes. Most production systems use approximate indexes that trade a small amount of recall for enormous gains in speed, with tuning parameters that let operators dial in the balance between accuracy and performance.
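Tuning of this kind is usually guided by measuring recall against exact results. A minimal sketch of that measurement follows, assuming each search result is a list of ids ordered nearest first; the helper names `recall_at_k` and `mean_recall` and the sample ids are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbours that the approximate search found."""
    truth = set(exact_ids[:k])
    return len(truth & set(approx_ids[:k])) / len(truth)

def mean_recall(approx_results, exact_results, k):
    """Average recall@k over a batch of queries."""
    pairs = list(zip(approx_results, exact_results))
    return sum(recall_at_k(a, e, k) for a, e in pairs) / len(pairs)

# Made-up results for two queries: the first misses one true neighbour (id 7).
exact = [[3, 7, 1], [4, 2, 9]]
approx = [[3, 1, 8], [4, 2, 9]]
print(mean_recall(approx, exact, k=3))  # roughly 0.83
```

In practice an operator would run a sample of queries through both an exact scan and the approximate index at several parameter settings, then pick the cheapest setting whose mean recall meets the target.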