Real-time indexing is the ability to insert or update vectors and have them become searchable immediately, without waiting for a batch rebuild of the index. As soon as new data arrives and is embedded, it can be returned in query results, keeping the searchable corpus continuously fresh.
This matters for applications where data changes constantly and freshness is important, such as news, social feeds, chat memory, and live product catalogues: any system where a newly added item must be findable right away. Without real-time indexing, new data appears only after the next scheduled rebuild, a delay that some applications cannot tolerate.
Real-time indexing contrasts with batch indexing, which loads many vectors at once for maximum throughput, at the cost of a delay before those vectors become searchable. The trade-off is between freshness and ingestion efficiency: incremental real-time inserts minimise the delay before data is queryable, while batch loads maximise raw loading speed. Many systems support both, using batch loads for the initial corpus and real-time inserts for ongoing updates.
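The combined pattern can be illustrated with a minimal sketch. It assumes a toy in-memory index that scores every stored vector by cosine similarity; production systems typically use approximate nearest-neighbour structures instead. The class `TinyVectorIndex` and its methods `bulk_load`, `upsert`, and `search` are hypothetical names chosen for illustration and do not correspond to any particular vector database's API.

```python
import math


class TinyVectorIndex:
    """Toy brute-force index: a vector is searchable as soon as it is stored."""

    def __init__(self):
        self._vectors = {}  # item id -> unit-normalised vector

    @staticmethod
    def _normalise(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0:
            raise ValueError("cannot index a zero vector")
        return [x / norm for x in vec]

    def bulk_load(self, items):
        """Batch path: prepare all vectors first, then publish them in one step."""
        staged = {item_id: self._normalise(vec) for item_id, vec in items}
        self._vectors.update(staged)

    def upsert(self, item_id, vec):
        """Real-time path: insert or overwrite one vector, visible to the next search."""
        self._vectors[item_id] = self._normalise(vec)

    def search(self, query, k=3):
        """Return the ids of the k stored vectors most similar to the query."""
        q = self._normalise(query)
        scored = [
            (sum(a * b for a, b in zip(q, v)), item_id)
            for item_id, v in self._vectors.items()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [item_id for _, item_id in scored[:k]]


# Initial corpus arrives as a batch load.
index = TinyVectorIndex()
index.bulk_load([("doc-a", [1.0, 0.0]), ("doc-b", [0.0, 1.0])])
before = index.search([0.6, 0.8], k=1)

# A new item arrives later and is queryable on the very next search.
index.upsert("doc-c", [0.6, 0.8])
after = index.search([0.6, 0.8], k=1)
```

In this sketch `before` contains only an item from the initial batch, while `after` returns the newly upserted `doc-c` without any rebuild step. The brute-force scan makes immediate visibility trivial; with a real approximate index, each incremental insert also has to update the index structure, which is where the ingestion-efficiency cost of real-time inserts comes from.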