Consistent hashing

Consistent hashing distributes data across a cluster in a way that minimizes reorganization when nodes are added or removed. It partitions data based on the partition key. For an explanation of partition keys and primary keys, see the Data modeling example in CQL for DataStax Enterprise 5.1.
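To illustrate why assigning hash ranges to nodes keeps reorganization small, the following Python sketch counts how many keys change owner when a fifth node joins a four-node cluster. It is only an illustration under stated assumptions: the `token` function uses SHA-256 as a stand-in for the Murmur3 hash, the node names and range boundaries are hypothetical, and the `hash % node_count` scheme is shown purely as a contrast.

```python
import hashlib
from bisect import bisect_left

def token(key):
    """Hash a key to a signed 64-bit token (SHA-256 stand-in, NOT Murmur3)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

def ring_owner(tok, ring):
    """Return the node whose range end is the first one >= tok."""
    ends = sorted(ring)
    return ring[ends[bisect_left(ends, tok) % len(ends)]]

# Hypothetical four-node cluster: end of hash range -> node name.
before = {-2**62: "node1", -1: "node2", 2**62 - 1: "node3", 2**63 - 1: "node4"}

# A fifth node takes over the lower part of node3's range.
after = dict(before)
after[2**61] = "node5"

keys = [f"key{i}" for i in range(1000)]

# Keys whose owner changes when ownership is defined by hash ranges.
ring_moved = [k for k in keys
              if ring_owner(token(k), before) != ring_owner(token(k), after)]

# Keys whose owner changes under a naive hash % node_count scheme.
modulo_moved = [k for k in keys if token(k) % 4 != token(k) % 5]

print("moved with hash ranges:", len(ring_moved))
print("moved with hash % node_count:", len(modulo_moved))
```

With hash ranges, only keys that fall into the range taken over by the new node change owner. With the modulo scheme, most keys are reassigned because the divisor changes for every key.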

For example, if you have the following data:

Example data
name    age  car     gender
jim     36   camaro  M
carol   37   bmw     F
johnny  12           M
suzy    10           F

The database assigns a hash value to each partition key:

Assigning hash values
Partition key  Murmur3 hash value
jim            -2245462676723223822
carol          7723358927203680754
johnny         -6723372854036780875
suzy           1168604627387940318

Each node in the cluster is responsible for a range of hash values and stores the data whose partition key hashes into that range.

[Figure: hash value ranges]
Hash value ranges
Node  Start range           End range             Partition key  Hash value
1     -9223372036854775808  -4611686018427387904  johnny         -6723372854036780875
2     -4611686018427387903  -1                    jim            -2245462676723223822
3     0                     4611686018427387903   suzy           1168604627387940318
4     4611686018427387904   9223372036854775807   carol          7723358927203680754
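The lookup shown in the table can be sketched in a few lines of Python. This is a minimal illustration, not the database's implementation: the range ends and hash values are copied from the tables above, while the names `RANGE_ENDS`, `NODES`, `HASHES`, and `node_for` are hypothetical.

```python
from bisect import bisect_left

# End of each node's hash value range, in ascending order (from the table).
RANGE_ENDS = [-4611686018427387904, -1, 4611686018427387903, 9223372036854775807]
NODES = [1, 2, 3, 4]

# Murmur3 hash values of the example partition keys (from the table).
HASHES = {
    "jim": -2245462676723223822,
    "carol": 7723358927203680754,
    "johnny": -6723372854036780875,
    "suzy": 1168604627387940318,
}

def node_for(hash_value):
    """Return the node whose range has the first end >= hash_value."""
    return NODES[bisect_left(RANGE_ENDS, hash_value)]

for key, hash_value in HASHES.items():
    print(key, "-> node", node_for(hash_value))
```

Because the range ends are sorted, a binary search finds the owning node without scanning every range.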

