Vectorizer

A module or integration that automatically converts raw data into embeddings during ingestion, eliminating the need to manage embedding generation separately.

A vectorizer is a module or integration that automatically converts raw data into embeddings during ingestion, so the database produces and stores the vectors for you rather than requiring you to generate them yourself. You supply text or images and choose a model, and the vectorizer converts each item into a vector as it is inserted.

This capability is the engine behind integrated vectorization, and it removes a large amount of plumbing. Without a vectorizer, you must run your own embedding pipeline: calling a model, managing batching and rate limits, keeping vectors in sync as data changes, and re-embedding everything when you switch models. A built-in vectorizer folds that work into the database, and configuring it is often as simple as selecting which embedding model to use.
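To make the idea concrete, here is a minimal sketch of a collection that embeds data inside its own write path. Everything in it is hypothetical: `Collection`, `upsert`, and `set_vectorizer` are illustrative names rather than any database's API, and `toy_embed` is a deterministic stand-in for a real embedding model.

```python
import hashlib
from typing import Callable, Dict, List

Vector = List[float]


def toy_embed(text: str, dims: int = 4) -> Vector:
    """Stand-in for a real embedding model: deterministic, not semantic."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dims)]


class Collection:
    """Hypothetical in-memory collection with a built-in vectorizer."""

    def __init__(self, vectorizer: Callable[[str], Vector]) -> None:
        self.vectorizer = vectorizer
        self.texts: Dict[str, str] = {}
        self.vectors: Dict[str, Vector] = {}

    def upsert(self, doc_id: str, text: str) -> None:
        # Embedding happens as part of the write, so the stored vector
        # always matches the stored text.
        self.texts[doc_id] = text
        self.vectors[doc_id] = self.vectorizer(text)

    def set_vectorizer(self, vectorizer: Callable[[str], Vector]) -> None:
        # Switching models requires re-embedding every stored item.
        self.vectorizer = vectorizer
        for doc_id, text in self.texts.items():
            self.vectors[doc_id] = vectorizer(text)


collection = Collection(vectorizer=toy_embed)
collection.upsert("a", "vector databases store embeddings")
collection.upsert("a", "vectorizers embed data at ingestion")  # update re-embeds
print(len(collection.vectors["a"]))
```

The caller never touches a vector directly. In a real system the same write path would also have to handle batching and rate limits, which this sketch leaves out.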

The benefit is faster development and less infrastructure to maintain, which is particularly valuable for smaller teams. The trade-off is reliance on the database’s supported models and its handling of the embedding process. Many systems make the vectorizer optional, letting you use it for convenience or supply your own pre-computed vectors when you need full control over how embeddings are generated.
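An optional vectorizer can be sketched as an insert method that accepts a pre-computed vector and falls back to the configured model only when none is given. The names here (`Collection`, `insert`, `length_embed`) are assumptions made for illustration and do not correspond to a specific product's API.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]


def length_embed(text: str) -> Vector:
    """Toy stand-in for an embedding model (an assumption, not a real model)."""
    return [float(len(text)), float(text.count(" ") + 1)]


class Collection:
    """Hypothetical collection where the vectorizer is optional."""

    def __init__(self, vectorizer: Optional[Callable[[str], Vector]] = None) -> None:
        self.vectorizer = vectorizer
        self.vectors: Dict[str, Vector] = {}

    def insert(self, doc_id: str, text: str, vector: Optional[Vector] = None) -> None:
        if vector is None:
            # No vector supplied: fall back to the built-in vectorizer.
            if self.vectorizer is None:
                raise ValueError("no vectorizer configured; supply a vector")
            vector = self.vectorizer(text)
        # A supplied vector is stored as-is, bypassing the vectorizer.
        self.vectors[doc_id] = list(vector)


store = Collection(vectorizer=length_embed)
store.insert("auto", "hello world")                       # embedded by the collection
store.insert("manual", "hello world", vector=[0.1, 0.2])  # caller keeps full control
```

Storing a caller-supplied vector unchanged is what gives you full control, but it also means the caller is responsible for using a consistent model and dimensionality across inserts.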