Consistent hashing
How consistent hashing distributes data across the nodes of a Cassandra cluster.
Consistent hashing partitions data based on the partition key, which is the first part of the primary key. For example, if you have the following data:
Key | age | car | gender |
---|---|---|---|
jim | 36 | camaro | M |
carol | 37 | bmw | F |
johnny | 12 | | M |
suzy | 10 | | F |
Cassandra assigns a hash value to each partition key:
Partition key | Murmur3 hash value |
---|---|
jim | -2245462676723223822 |
carol | 7723358927203680754 |
johnny | -6723372854036780875 |
suzy | 1168604627387940318 |
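
As a rough illustration of this step, the sketch below maps each key to a signed 64-bit token. It assumes a stand-in hash (BLAKE2b from Python's standard library, truncated to 8 bytes) rather than Cassandra's Murmur3 partitioner, so the printed tokens will not match the values in the table. The name `token_for` is hypothetical.

```python
import hashlib

MIN_TOKEN = -2**63      # lowest signed 64-bit value
MAX_TOKEN = 2**63 - 1   # highest signed 64-bit value


def token_for(key: str) -> int:
    """Map a key to a signed 64-bit token (stand-in hash, NOT Murmur3)."""
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    # Interpret the 8 digest bytes as a signed integer so the result
    # falls in the same numeric range as the tokens in the table.
    return int.from_bytes(digest, byteorder="big", signed=True)


if __name__ == "__main__":
    for key in ["jim", "carol", "johnny", "suzy"]:
        print(key, token_for(key))
```

The property that matters is that the same key always produces the same token, so any node can work out where a row belongs without consulting a central directory.
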
Each node in the cluster is responsible for a range of data based on the hash value:
Node | Murmur3 start range | Murmur3 end range |
---|---|---|
A | -9223372036854775808 | -4611686018427387904 |
B | -4611686018427387903 | -1 |
C | 0 | 4611686018427387903 |
D | 4611686018427387904 | 9223372036854775807 |
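
For every hash value to belong to exactly one node, the ranges have to tile the whole signed 64-bit token space without gaps or overlaps. The sketch below checks that property for a four-node layout, written here so that each range starts one token after the previous one ends. The name `covers_token_space` is hypothetical, and both bounds are assumed to be inclusive.

```python
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1

# (node, start, end), both bounds inclusive.
RANGES = [
    ("A", -9223372036854775808, -4611686018427387904),
    ("B", -4611686018427387903, -1),
    ("C", 0, 4611686018427387903),
    ("D", 4611686018427387904, 9223372036854775807),
]


def covers_token_space(ranges) -> bool:
    """Return True if the ranges tile [MIN_TOKEN, MAX_TOKEN] exactly once."""
    ordered = sorted(ranges, key=lambda r: r[1])
    # Every range must be non-empty.
    if any(start > end for _, start, end in ordered):
        return False
    # The first range must begin at the minimum token, the last end at the maximum.
    if ordered[0][1] != MIN_TOKEN or ordered[-1][2] != MAX_TOKEN:
        return False
    # Each range must begin exactly one token after the previous one ends.
    for (_, _, prev_end), (_, next_start, _) in zip(ordered, ordered[1:]):
        if next_start != prev_end + 1:
            return False
    return True


print(covers_token_space(RANGES))  # True
```
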
Cassandra places the data on each node according to the hash value of the partition key and the range that the node is responsible for. For example, in a four-node cluster, the data in this example is distributed as follows:
Node | Start range | End range | Partition key | Hash value |
---|---|---|---|---|
A | -9223372036854775808 | -4611686018427387904 | johnny | -6723372854036780875 |
B | -4611686018427387903 | -1 | jim | -2245462676723223822 |
C | 0 | 4611686018427387903 | suzy | 1168604627387940318 |
D | 4611686018427387904 | 9223372036854775807 | carol | 7723358927203680754 |
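
The placement above can be reproduced with a small lookup. This is only a sketch of the idea, not Cassandra's implementation: it takes the hash values from the table as given, assumes inclusive range bounds, and uses a binary search over the upper bound of each node's range. The name `node_for_token` is hypothetical.

```python
from bisect import bisect_left

# Inclusive upper bound of each node's range, in ascending order.
RANGE_ENDS = [-4611686018427387904, -1, 4611686018427387903, 9223372036854775807]
NODES = ["A", "B", "C", "D"]

# Hash values copied from the example table (not computed here).
TOKENS = {
    "jim": -2245462676723223822,
    "carol": 7723358927203680754,
    "johnny": -6723372854036780875,
    "suzy": 1168604627387940318,
}


def node_for_token(token: int) -> str:
    """Return the node whose range contains the token."""
    if not -2**63 <= token <= 2**63 - 1:
        raise ValueError("token outside the signed 64-bit range")
    # bisect_left finds the first range whose inclusive end is >= token.
    return NODES[bisect_left(RANGE_ENDS, token)]


placement = {key: node_for_token(token) for key, token in TOKENS.items()}
print(placement)
# {'jim': 'B', 'carol': 'D', 'johnny': 'A', 'suzy': 'C'}
```
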