MTEB (Massive Text Embedding Benchmark)

The standard benchmark for evaluating text embedding models across retrieval, classification, clustering, and semantic similarity tasks.

MTEB, the Massive Text Embedding Benchmark, is the standard benchmark for evaluating and comparing text embedding models. It tests models on many datasets across a wide range of tasks, including retrieval, classification, clustering, re-ranking, and semantic similarity, giving a broad picture of how well each model represents text.

MTEB exists because choosing an embedding model is one of the most consequential decisions in building a vector search system, and the claims made for individual models are hard to compare. By scoring many models on the same broad set of tasks, MTEB provides a common leaderboard that helps teams shortlist candidates based on measured performance rather than marketing.

MTEB results should still be read with judgement. A model that tops the overall leaderboard may not be the best for your specific domain or language, and the benchmark cannot capture every real-world nuance. For vector search, the retrieval-focused scores are the most relevant, and they should be weighed alongside practical factors like embedding dimension, speed, cost, and how well the model fits your particular data.
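
One way to put this into practice is to filter candidates by practical constraints first, then rank what remains by retrieval score rather than overall score. The Python sketch below is illustrative only: the `Candidate` fields, the `shortlist` helper, the model names, and every number are hypothetical and are not taken from the MTEB leaderboard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical summary of one embedding model (all values made up)."""
    name: str
    retrieval_score: float          # average over retrieval tasks, higher is better
    overall_score: float            # average over all task types
    dimension: int                  # embedding vector size
    cost_per_million_tokens: float  # assumed pricing unit, 0.0 for self-hosted


def shortlist(candidates, max_dimension, max_cost, top_k=3):
    """Drop models that break the constraints, then rank by retrieval score."""
    eligible = [
        c for c in candidates
        if c.dimension <= max_dimension and c.cost_per_million_tokens <= max_cost
    ]
    return sorted(eligible, key=lambda c: c.retrieval_score, reverse=True)[:top_k]


# Placeholder data for illustration, not real benchmark results.
candidates = [
    Candidate("model-a", 52.0, 66.0, 4096, 0.13),
    Candidate("model-b", 55.5, 64.0, 1024, 0.02),
    Candidate("model-c", 54.0, 65.0, 768, 0.00),
    Candidate("model-d", 49.0, 61.0, 384, 0.00),
]

picks = shortlist(candidates, max_dimension=1024, max_cost=0.05, top_k=2)
print([c.name for c in picks])
```

In this made-up data, `model-a` has the highest overall score but is excluded by the dimension and cost limits, and it also trails `model-b` and `model-c` on retrieval. A shortlist like this is only a starting point; the remaining candidates still need to be evaluated on your own data.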