Skip to content

Hamming Distance

A distance metric counting the number of positions at which two binary vectors differ, used in hash-based similarity search.

Hamming distance counts the number of positions at which two equal-length sequences differ. For two binary vectors, it is simply how many bits you would need to flip to turn one into the other — a count of mismatched bits.

In vector search, Hamming distance is the natural metric for binary vectors, which are produced by techniques that compress embeddings down to strings of bits for extreme efficiency. Comparing binary codes by Hamming distance is exceptionally fast because it reduces to a bitwise XOR followed by counting set bits, operations that modern processors execute in a handful of cycles.

This makes Hamming distance valuable for very large-scale or memory-constrained scenarios, and it underpins binary hashing methods such as some forms of locality-sensitive hashing. The trade-off is precision: collapsing a rich floating-point embedding into bits discards information, so binary codes and Hamming distance are often used as a fast first-pass filter, with full-precision vectors and a finer metric used to re-rank the top candidates.