Data Tiering

Data tiering is the practice of storing data across multiple layers with different cost and performance characteristics — typically hot, warm, and cold tiers — based on how frequently each piece of data is accessed. Frequently queried vectors stay in fast, expensive memory, while rarely touched vectors move to cheaper, slower storage.

In a vector database, tiering keeps costs under control at scale. Holding billions of vectors entirely in RAM is prohibitively expensive, yet much of that data may be queried rarely. By keeping only the hot working set in memory and offloading the rest to disk or object storage, a system can serve a very large corpus at a fraction of the cost, in exchange for higher latency on queries that touch cold data.
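
As a rough illustration of keeping only the hot working set in memory, the sketch below models a two-tier store in Python. It assumes a least-recently-used (LRU) eviction policy, which is only one possible choice, and it uses a plain dictionary as a stand-in for disk or object storage. The class name `TieredVectorStore` and its methods are hypothetical and do not correspond to any particular database's API.

```python
from collections import OrderedDict


class TieredVectorStore:
    """Toy two-tier store: a bounded hot tier in front of a cold tier.

    Assumption: LRU decides what counts as "hot". Real systems may use
    access counters, time windows, or explicit policies instead.
    """

    def __init__(self, hot_capacity):
        if hot_capacity < 1:
            raise ValueError("hot_capacity must be at least 1")
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # most recently used entries sit at the end
        self.cold = {}            # stand-in for disk or object storage

    def put(self, key, vector):
        """Write a vector into the hot tier, demoting others if needed."""
        self.cold.pop(key, None)  # a key lives in exactly one tier
        self.hot[key] = vector
        self.hot.move_to_end(key)
        self._demote_overflow()

    def get(self, key):
        """Read a vector; a cold hit promotes it back into the hot tier."""
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        vector = self.cold.pop(key)  # raises KeyError for unknown keys
        self.hot[key] = vector
        self._demote_overflow()
        return vector

    def tier_of(self, key):
        """Return "hot", "cold", or None if the key is not stored."""
        if key in self.hot:
            return "hot"
        if key in self.cold:
            return "cold"
        return None

    def _demote_overflow(self):
        # Move least recently used entries to the cold tier until the
        # hot tier fits within its capacity again.
        while len(self.hot) > self.hot_capacity:
            old_key, old_vector = self.hot.popitem(last=False)
            self.cold[old_key] = old_vector
```

With `hot_capacity=2`, writing three vectors leaves the oldest one in the cold tier, and reading it back promotes it while demoting whichever hot entry was used least recently. In a real system the cold read is where the extra latency would be paid.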

Tiering is closely tied to tenant lifecycle management in multi-tenant systems. An inactive tenant’s vectors can be demoted to cold storage, or offloaded entirely, and promoted back to a hot tier on demand when the tenant becomes active again, so you pay for fast storage only for data that actually needs it.
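
The tenant lifecycle described above can be sketched as a small state tracker. This is a minimal illustration, not any specific product's mechanism: the `TenantTierManager` class, the `idle_seconds` threshold, and the two-state model ("hot" and "cold") are all assumptions made for the example. Timestamps are passed in explicitly so the behavior is easy to follow.

```python
class TenantTierManager:
    """Toy tracker that demotes idle tenants and promotes them on access.

    Assumption: a tenant counts as inactive once it has gone
    `idle_seconds` without a request. Only tier labels are tracked here;
    a real system would also move the tenant's data.
    """

    def __init__(self, idle_seconds):
        self.idle_seconds = idle_seconds
        self.tier = {}         # tenant_id -> "hot" or "cold"
        self.last_active = {}  # tenant_id -> timestamp in seconds

    def touch(self, tenant_id, now):
        """Record a request; return True if the tenant had to be promoted."""
        promoted = self.tier.get(tenant_id) == "cold"
        self.tier[tenant_id] = "hot"
        self.last_active[tenant_id] = now
        return promoted

    def demote_inactive(self, now):
        """Demote every hot tenant that has been idle too long."""
        demoted = []
        for tenant_id, seen in self.last_active.items():
            idle = now - seen
            if self.tier[tenant_id] == "hot" and idle >= self.idle_seconds:
                self.tier[tenant_id] = "cold"
                demoted.append(tenant_id)
        return demoted
```

A periodic job could call `demote_inactive` with the current time, while the request path calls `touch` and, when it returns `True`, loads the tenant's vectors back into the hot tier before serving the query.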