A partition key is a field whose value determines which partition or shard a vector is stored in. When a key such as tenant ID, region, or category is chosen, the database groups vectors with the same key value together, which can be used to isolate tenants or to co-locate related data for more efficient querying.
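As a minimal sketch of how a key value might be mapped to a partition, the Python below hashes a hypothetical `tenant_id` field into one of a fixed number of partitions. The field names, the partition count, and the hash-modulo routing are assumptions for illustration; a real vector database may route differently, for example with one named partition per key value.

```python
import hashlib

NUM_PARTITIONS = 8  # hypothetical partition count, chosen for illustration


def partition_for(key_value: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partition key value to a partition index using a stable hash."""
    digest = hashlib.sha256(key_value.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer so the result is stable across runs.
    return int.from_bytes(digest[:8], "big") % num_partitions


# Hypothetical records: each vector carries the field used as the partition key.
records = [
    {"id": "v1", "tenant_id": "acme", "vector": [0.1, 0.2]},
    {"id": "v2", "tenant_id": "acme", "vector": [0.3, 0.1]},
    {"id": "v3", "tenant_id": "globex", "vector": [0.9, 0.4]},
]

# Group vector ids by the partition their key value routes to.
partitions: dict[int, list[str]] = {}
for record in records:
    index = partition_for(record["tenant_id"])
    partitions.setdefault(index, []).append(record["id"])
```

Because the hash depends only on the key value, every vector with the same `tenant_id` lands in the same partition, which is the grouping property described above.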
Partition keys serve two main purposes. For multi-tenancy, partitioning by tenant ID keeps each tenant’s vectors physically grouped, which strengthens isolation and lets queries target just one tenant’s partition rather than scanning shared data. For performance, partitioning by a frequently filtered attribute means that filtered queries can skip entire partitions that cannot match, narrowing the search before it begins.
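The pruning effect can be illustrated with a toy in-memory index. This is a sketch under stated assumptions, not any particular database's API: `PartitionedIndex`, its methods, and the brute-force dot-product scoring are all hypothetical. When a query names a key value, only that partition is scanned; otherwise every partition is.

```python
from collections import defaultdict


def dot(a, b):
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))


class PartitionedIndex:
    """Toy index that keeps one list of vectors per partition key value."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def add(self, key, vector_id, vector):
        self.partitions[key].append((vector_id, vector))

    def search(self, query, k, key=None):
        """Return (top-k ids, number of vectors scanned)."""
        if key is not None:
            # Filtered query: visit only the matching partition.
            candidates = self.partitions.get(key, [])
        else:
            # Unfiltered query: fan out across every partition.
            candidates = [item for part in self.partitions.values() for item in part]
        ranked = sorted(candidates, key=lambda item: dot(query, item[1]), reverse=True)
        return [vector_id for vector_id, _ in ranked[:k]], len(candidates)


index = PartitionedIndex()
index.add("tenant_a", "a1", [1.0, 0.0])
index.add("tenant_a", "a2", [0.0, 1.0])
index.add("tenant_b", "b1", [0.9, 0.1])

scoped_ids, scoped_scanned = index.search([1.0, 0.0], k=1, key="tenant_a")
global_ids, global_scanned = index.search([1.0, 0.0], k=2)
```

The scoped search never reads `tenant_b`'s vectors, so it both scans less data and cannot return another tenant's results.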
The choice of partition key has lasting consequences. A well-chosen key aligns with how the data is queried, so most queries touch only a few partitions and stay fast; a poorly chosen key can leave partitions badly unbalanced (for example, when one key value accounts for most of the data) or force queries to fan out across many of them. Because repartitioning later is costly, the partition key is an important early design decision in a scalable vector system.
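One rough way to compare candidate keys before committing is to measure balance and fan-out on a sample of the data. The sketch below is illustrative only: the sample records, the `skew_ratio` metric (largest partition divided by mean partition size), and the simplified `fan_out` rule are assumptions, not standard definitions.

```python
from collections import Counter

# Hypothetical sample: one tenant dominates, while regions are fairly even.
records = (
    [{"tenant_id": "acme", "region": r} for r in ["eu", "eu", "eu", "us", "us", "ap"]]
    + [{"tenant_id": "globex", "region": "us"}, {"tenant_id": "initech", "region": "ap"}]
)


def partition_sizes(rows, key_field):
    """Count how many records each value of the candidate key would hold."""
    return Counter(row[key_field] for row in rows)


def skew_ratio(sizes):
    """Largest partition divided by the mean size; 1.0 means perfectly even."""
    mean = sum(sizes.values()) / len(sizes)
    return max(sizes.values()) / mean


def fan_out(sizes, query_filter, key_field):
    """Partitions a query visits: one if it filters on the key, else all."""
    return 1 if key_field in query_filter else len(sizes)


by_tenant = partition_sizes(records, "tenant_id")
by_region = partition_sizes(records, "region")

tenant_skew = skew_ratio(by_tenant)
region_skew = skew_ratio(by_region)

# A query that filters on region must visit every tenant partition.
fan_out_keyed = fan_out(by_tenant, {"tenant_id": "acme"}, "tenant_id")
fan_out_unkeyed = fan_out(by_tenant, {"region": "eu"}, "tenant_id")
```

In this sample, partitioning by `tenant_id` is more skewed than partitioning by `region`, yet it still serves tenant-scoped queries from a single partition; which trade-off is acceptable depends on the actual query mix.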