BM25, short for Best Match 25, is the standard ranking function for scoring documents by keyword relevance. It scores how well a document matches a query using three signals: how often the query terms appear in the document (term frequency, with diminishing returns for repeated occurrences), how rare those terms are across the whole corpus (inverse document frequency), and a normalization for document length so that long documents do not unfairly dominate.
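To make the three ingredients concrete, here is a minimal sketch of one common BM25 formulation in Python. It assumes documents are already tokenized into lists of strings. The function name `bm25_scores`, the sample documents, and the parameter defaults `k1=1.5` and `b=0.75` are illustrative choices, not values taken from any particular search engine, and production systems compute these statistics from an inverted index rather than by scanning every document.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against a tokenized `query`.

    k1 controls how quickly term frequency saturates; b controls how
    strongly scores are normalized by document length. Both defaults are
    assumptions chosen for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: the number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Longer-than-average documents get a larger penalty factor.
        length_norm = 1 - b + b * len(d) / avg_len
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms (low df) receive a higher IDF weight.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates: extra occurrences add less and less.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


# Hypothetical toy corpus, tokenized by whitespace.
docs = [
    "error code E1234 in payment module".split(),
    "payment failed with an unknown error".split(),
    "how to bake sourdough bread".split(),
]
scores = bm25_scores("E1234 error".split(), docs)
```

In this toy corpus the first document matches both query terms, the second matches only the more common term `error`, and the third matches neither, so the scores decrease in that order and the last one is zero.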
Despite being decades old, BM25 remains a remarkably strong baseline and is the workhorse behind most traditional search engines. It excels precisely where semantic vector search struggles: matching exact terms like product codes, proper nouns, error messages, and rare jargon that embedding models may not represent reliably.
In modern systems, BM25 provides the sparse, lexical half of hybrid search. Its keyword-precise scores are fused with the meaning-aware scores from vector search, giving results that capture both exact terminology and semantic intent. Because BM25 needs only term statistics from an index, with no model training or GPU inference, it is also cheap and fast to run.
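The text does not say how the two result sets are fused. One widely used option is reciprocal rank fusion (RRF), which combines rank positions rather than raw scores and so sidesteps the fact that BM25 scores and vector similarities live on different scales. The sketch below assumes each retriever returns an ordered list of document IDs; the function name, the document IDs, and the constant `k=60` are illustrative assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. The constant k is an assumed default that
    dampens the influence of top-ranked items.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical results from a BM25 retriever and a vector retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
```

Documents that both retrievers rank highly, here `doc_b` and `doc_a`, rise to the top, while documents found by only one retriever still appear further down. A weighted sum of normalized scores is another reasonable design when the relative importance of the two signals needs tuning.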