k-Nearest Neighbours, or k-NN, is the core query operation of a vector database: given a query vector, return the k stored vectors closest to it according to a chosen distance metric, ordered from most to least similar. The result is the k best matches, where k is whatever number the application asks for.
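As a rough illustration of the operation described above, here is a minimal brute-force sketch in plain Python. The function names `euclidean` and `exact_knn` and the sample data are hypothetical, chosen for this example only, and Euclidean distance is assumed as the metric; a real vector database would use its own storage layer and whichever metric it is configured with.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(query, vectors, k, distance=euclidean):
    """Return (distance, index) pairs for the k stored vectors closest
    to the query, ordered from nearest to farthest."""
    # Score every stored vector, then keep only the k smallest distances.
    scored = ((distance(query, v), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(k, scored)


# Hypothetical toy collection of 2-dimensional vectors.
stored = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
neighbours = exact_knn([1.0, 1.0], stored, k=2)
neighbour_ids = [index for _, index in neighbours]
```

Because every stored vector is scored, the work grows with the size of the collection, which is why this form is usually reserved for small datasets or for checking the quality of an index.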
In its exact form, k-NN is guaranteed to return the true k closest vectors, but the straightforward way to do so is to compare the query against every stored vector, so the cost of each query grows linearly with the size of the collection and becomes too slow at scale. Production systems therefore use approximate k-NN, powered by approximate nearest neighbour (ANN) indexes, which return k vectors that usually include most or all of the true nearest neighbours, at a small fraction of the cost.
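One common way to quantify how closely an approximate result matches the exact one is recall at k: the fraction of the true k nearest neighbours that the approximate search also returned. The sketch below assumes both searches return lists of vector ids; the function name `recall_at_k` and the sample ids are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k ids that also appear in the
    approximate top-k result (order is ignored)."""
    exact = set(exact_ids)
    if not exact:
        return 1.0  # nothing to find, so nothing was missed
    return len(exact & set(approx_ids)) / len(exact)


# Hypothetical ids: the approximate search missed id 7 and returned id 3 instead.
exact_ids = [4, 7, 1, 9]
approx_ids = [4, 1, 9, 3]
recall = recall_at_k(approx_ids, exact_ids)
```

Here three of the four true neighbours were found, so `recall` is 0.75. In practice this kind of measurement is typically averaged over many sample queries, with the exact answers computed by brute force as the reference.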
The choice of k depends on the use case. Retrieval-augmented generation typically uses a small k — just enough relevant chunks to fit the model’s context — while recommendation systems might request a larger k to present a ranked list. A common pattern is to retrieve a generous k and then narrow it down with re-ranking or filtering before showing or using the final results.
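The retrieve-then-narrow pattern can be sketched as follows. This is a minimal illustration, not any particular system's API: the function `narrow`, the candidate dictionaries, and the `lang` and `rerank` fields are all hypothetical, and the re-ranking scores are assumed to have been computed already by some separate, more expensive model.

```python
def narrow(candidates, final_k, keep, rerank_score):
    """Filter a generous candidate list, re-rank the survivors by a
    second score (higher is better), and return the top final_k."""
    filtered = [c for c in candidates if keep(c)]
    ranked = sorted(filtered, key=rerank_score, reverse=True)
    return ranked[:final_k]


# Hypothetical output of a k=4 vector search, nearest first.
candidates = [
    {"id": "a", "distance": 0.10, "lang": "en", "rerank": 0.40},
    {"id": "b", "distance": 0.12, "lang": "de", "rerank": 0.95},
    {"id": "c", "distance": 0.20, "lang": "en", "rerank": 0.90},
    {"id": "d", "distance": 0.25, "lang": "en", "rerank": 0.70},
]

final = narrow(
    candidates,
    final_k=2,
    keep=lambda c: c["lang"] == "en",       # metadata filter
    rerank_score=lambda c: c["rerank"],     # precomputed re-ranker score
)
final_ids = [c["id"] for c in final]
```

In this toy data the nearest candidate by vector distance ("a") does not survive re-ranking, and "b" is removed by the filter, which is why retrieving more candidates than are finally needed leaves room for the later stages to work.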