Integrated Vectorization

A database capability where embedding generation is handled automatically on data ingestion, removing the need to run a separate embedding model pipeline before storing vectors.

Integrated vectorization is a database capability where embeddings are generated automatically as data is ingested, so you can send raw text or images and let the database produce and store the vectors for you. The embedding step happens inside the database rather than in a separate pipeline you build and operate.

This removes much of the work in building a retrieval system. Without it, you must build and run your own embedding pipeline: calling an embedding model, handling rate limits and retries, keeping vectors in sync as source data changes, and re-embedding everything if you switch models. Integrated vectorization moves these tasks into the database, and setup is often as simple as choosing a model.
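To make the idea concrete, here is a minimal in-memory sketch of a store that embeds text on write. It is not any real database's API: `Collection`, `upsert`, `set_vectorizer`, and `search` are hypothetical names chosen for illustration, and `toy_embed` is a hash-based stand-in for a real embedding model (it carries no semantic meaning).

```python
import hashlib
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Vector = List[float]


def toy_embed(text: str, dim: int = 8) -> Vector:
    """Stand-in for a real embedding model: hashes text to a unit vector."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    raw = [b / 255.0 - 0.5 for b in digest[:dim]]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]


@dataclass
class Collection:
    """Hypothetical collection that embeds documents as they are written."""

    vectorizer: Callable[[str], Vector]
    texts: Dict[str, str] = field(default_factory=dict)
    vectors: Dict[str, Vector] = field(default_factory=dict)

    def upsert(self, doc_id: str, text: str) -> None:
        # The embedding is computed inside the store on every write,
        # so the stored text and its vector cannot drift apart.
        self.texts[doc_id] = text
        self.vectors[doc_id] = self.vectorizer(text)

    def set_vectorizer(self, vectorizer: Callable[[str], Vector]) -> None:
        # Switching models invalidates old vectors, so re-embed everything.
        self.vectorizer = vectorizer
        for doc_id, text in self.texts.items():
            self.vectors[doc_id] = vectorizer(text)

    def search(self, query: str, k: int = 1) -> List[str]:
        # The query is embedded with the same model as the stored documents.
        q = self.vectorizer(query)

        def score(vec: Vector) -> float:
            return sum(a * b for a, b in zip(q, vec))

        ranked = sorted(self.vectors, key=lambda d: score(self.vectors[d]), reverse=True)
        return ranked[:k]


docs = Collection(vectorizer=toy_embed)
docs.upsert("a", "hello world")          # caller sends raw text only
docs.upsert("a", "hello again, world")   # an update re-embeds automatically
```

The caller never touches a vector: writes, updates, and a model switch all go through the store, which is the part a separate pipeline would otherwise have to handle. A real system would also deal with batching, rate limits, and retries behind the same interface.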

The benefit is less plumbing and faster development, especially for smaller teams. The trade-off is some loss of control: you rely on the database's supported models and on how it handles chunking and updates. Many databases offer integrated vectorization as an option rather than a requirement, so you can use it for convenience or supply your own vectors when you need full control.
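The optional nature of the feature can be sketched as a single insert path that accepts either raw text or a precomputed vector. This is an illustrative toy, not a real client library: `HybridCollection`, its `insert` signature, and `builtin_vectorizer` are all assumed names, and the "model" is a crude three-feature function used only so the example runs.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]


def builtin_vectorizer(text: str) -> Vector:
    """Toy stand-in for a database-managed model: three crude text features."""
    return [
        float(len(text)),
        float(text.count(" ")),
        float(sum(map(ord, text)) % 97),
    ]


class HybridCollection:
    """Hypothetical store where integrated vectorization is optional."""

    def __init__(self, vectorizer: Callable[[str], Vector], dim: int) -> None:
        self._vectorizer = vectorizer
        self._dim = dim
        self.vectors: Dict[str, Vector] = {}

    def insert(self, doc_id: str, text: str, vector: Optional[Vector] = None) -> None:
        # A caller-supplied vector takes precedence; otherwise the store
        # embeds the text with its configured model.
        vec = list(vector) if vector is not None else self._vectorizer(text)
        # All vectors in one collection must share a dimension.
        if len(vec) != self._dim:
            raise ValueError(f"expected {self._dim} dimensions, got {len(vec)}")
        self.vectors[doc_id] = vec


store = HybridCollection(builtin_vectorizer, dim=3)
store.insert("auto", "hello world")                              # convenience path
store.insert("manual", "hello world", vector=[0.1, 0.2, 0.3])    # full-control path
```

Supplying your own vector shifts responsibility back to you: the store can check dimensions, but it cannot tell whether the vector came from the same model as the rest of the collection.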