Consistent hashing
Consistent hashing distributes data across a cluster in a way that minimizes reorganization when nodes are added or removed. It partitions data based on the partition key. For an explanation of partition keys and primary keys, see the Data modeling example in CQL for DataStax Enterprise 5.1.
For example, suppose you have the following data:
name | age | car | gender |
---|---|---|---|
jim | 36 | camaro | M |
carol | 37 | bmw | F |
johnny | 12 | | M |
suzy | 10 | | F |
The database assigns a hash value to each partition key (in this example, the name column):
Partition key | Murmur3 hash value |
---|---|
jim | -2245462676723223822 |
carol | 7723358927203680754 |
johnny | -6723372854036780875 |
suzy | 1168604627387940318 |
Each node in the cluster is responsible for a range of data based on the hash value.
Node | Start range | End range | Partition key | Hash value |
---|---|---|---|---|
1 | -9223372036854775808 | -4611686018427387904 | johnny | -6723372854036780875 |
2 | -4611686018427387903 | -1 | jim | -2245462676723223822 |
3 | 0 | 4611686018427387903 | suzy | 1168604627387940318 |
4 | 4611686018427387904 | 9223372036854775807 | carol | 7723358927203680754 |