Skip to content

Recall Cliff

A sharp drop in result quality or spike in latency that occurs when a restrictive metadata filter leaves too few candidates for an ANN index to navigate effectively.

The recall cliff is a sudden collapse in search quality — or a sharp spike in latency — that happens when a metadata filter becomes restrictive enough that a vector index can no longer navigate effectively. Up to a point, filtered search behaves well, then beyond some threshold of selectivity it falls off abruptly, like stepping off a cliff.

The cause lies in how graph-based indexes like HNSW work. They rely on a well-connected graph to walk toward nearest neighbours, but when a filter excludes most vectors, the surviving ones may be sparsely connected or unreachable through the graph, so the search wanders among disqualified nodes and misses true matches. Recall plummets, or the system falls back to a slow exhaustive scan and latency surges.

Avoiding the recall cliff is a major design goal for vector databases that must serve filtered queries. Techniques such as in-graph filtering, adaptive traversal, filter-aware indexing, and efficient pre-filtering are all aimed at keeping recall and speed stable even under highly selective or high-cardinality filters, so that real-world queries combining similarity with tight constraints do not silently degrade.