Skip to content

BERT

A transformer-based language model by Google that produces contextual word embeddings, foundational to modern semantic search and NLP.

BERT — Bidirectional Encoder Representations from Transformers — is a language model introduced by Google in 2018 that reshaped natural language processing. Its key innovation was reading an entire sentence in both directions at once, so that the representation of each word reflects its full surrounding context, rather than only the words that came before it.

This bidirectional understanding made BERT produce far richer contextual embeddings than earlier models. The architecture became the foundation for a large family of text embedding models still in wide use. A particularly important descendant, Sentence-BERT, added a pooling step so that whole sentences and paragraphs could be turned into single fixed-length vectors — exactly what vector databases need for semantic search.

In a vector search pipeline, a BERT-based model is typically the component that converts text into the embeddings you store and query. The specific variant you choose affects embedding dimensionality, inference speed, and accuracy: smaller distilled versions are fast and cheap, while larger variants offer higher quality at greater cost.