Partitioners

A partitioner determines how data is distributed across the nodes in the cluster, including replicas. The partitioner function derives a token from the partition key, typically through hashing. The cluster distributes each row of data based on the token value. You can use any IPartitioner implementation, including your own, as long as it is on the classpath.

You configure the partitioner with the partitioner property in the cassandra.yaml file.

Changing the partitioner on a cluster that already contains data makes the existing records inaccessible and leads to data loss. If you change the partitioner, you must reload all the data from another source, such as snapshots.

The default Murmur3Partitioner uses tokens to help assign equal portions of data to each node and evenly distribute data from all the tables throughout the ring or other grouping, such as a keyspace. This distribution works even if the tables use different partition keys, such as user names or timestamps. Because each part of the hash range receives an equal number of partitions on average, the read and write requests to the cluster are evenly distributed and load balancing is simplified. For more information, see Consistent hashing.
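The following Python sketch illustrates why hashing evens out the load even when partition keys are sequential. It is an illustration only: it uses BLAKE2b from the standard library as a stand-in for a 64-bit hash, not the MurmurHash variant the database uses, and the names `token_for`, `owner_of`, and `distribution` are hypothetical. It also assumes four nodes that each own one equal slice of the ring.

```python
import hashlib
from collections import Counter

NODES = 4            # assumed cluster size for the illustration
RING_SIZE = 2**64    # number of distinct 64-bit tokens


def token_for(key: str) -> int:
    """Hash a key to a signed 64-bit token (BLAKE2b stand-in, not Murmur3)."""
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big", signed=True)


def owner_of(token: int, nodes: int = NODES) -> int:
    """Return the index of the equal-sized ring slice that contains the token."""
    return (token + 2**63) * nodes // RING_SIZE


def distribution(keys, nodes: int = NODES) -> Counter:
    """Count how many keys land in each slice of the ring."""
    return Counter(owner_of(token_for(k), nodes) for k in keys)


# Sequential, highly similar keys still spread across all four slices.
counts = distribution(f"user{i:05d}" for i in range(10_000))
for node, count in sorted(counts.items()):
    print(f"node {node}: {count} partitions")
```

Each slice should receive roughly a quarter of the keys. The exact counts depend on the hash function, so they differ from what a real cluster would produce.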

Your choice of token assignment depends on the type of architecture:

Virtual nodes: Use either the allocation algorithm or the random selection algorithm to specify the number of tokens distributed to nodes within the datacenter. All systems in the datacenter must use the same algorithm.
Single-token architecture: To ensure even data distribution across the cluster nodes, you must enter values in the initial_token parameter in cassandra.yaml on each node.
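For a single-token architecture with the Murmur3Partitioner, evenly spaced initial_token values can be computed by dividing the 64-bit token range by the number of nodes. The sketch below assumes a single datacenter and one token per node; the function name `initial_tokens` is hypothetical.

```python
def initial_tokens(node_count: int) -> list[int]:
    """Return evenly spaced Murmur3 tokens, one per node, starting at -2**63."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Split the 2**64 token range into node_count equal steps.
    step = 2**64 // node_count
    return [step * i - 2**63 for i in range(node_count)]


# One value per node, to be set as initial_token in that node's cassandra.yaml.
for node, token in enumerate(initial_tokens(4), start=1):
    print(f"node {node}: initial_token: {token}")
```

For four nodes this yields -9223372036854775808, -4611686018427387904, 0, and 4611686018427387904.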

`Murmur3Partitioner` (default)

The Murmur3Partitioner uniformly distributes data across the cluster based on MurmurHash values. This hashing function creates a 64-bit hash value of the partition key, with a possible range from -2⁶³ to +2⁶³-1. DataStax recommends this partitioner for new clusters in almost all cases because it performs three to five times better than the RandomPartitioner.

When using this partitioner, you can page through all rows using the TOKEN() function in a CQL query.
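One way to page through a whole table is to split the token range into contiguous slices and query each slice with TOKEN(). The Python sketch below only builds the CQL strings and does not run them. The table `cycling.cyclist_name`, the partition key `id`, and the helper `token_range_queries` are hypothetical names chosen for illustration.

```python
MIN_TOKEN = -2**63       # lowest Murmur3Partitioner token
MAX_TOKEN = 2**63 - 1    # highest Murmur3Partitioner token


def token_range_queries(table: str, partition_key: str, splits: int) -> list[str]:
    """Build one CQL query per contiguous, non-overlapping token slice."""
    if splits < 1:
        raise ValueError("splits must be at least 1")
    step = 2**64 // splits
    queries = []
    lower = MIN_TOKEN
    for i in range(splits):
        # The last slice always ends at MAX_TOKEN so no token is left out.
        upper = MAX_TOKEN if i == splits - 1 else lower + step - 1
        queries.append(
            f"SELECT * FROM {table} "
            f"WHERE token({partition_key}) >= {lower} "
            f"AND token({partition_key}) <= {upper}"
        )
        lower = upper + 1
    return queries


for query in token_range_queries("cycling.cyclist_name", "id", 4):
    print(query)
```

Running each generated query in turn visits every partition exactly once, because the slices cover the full range without overlapping.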

DataStax includes the legacy partitioners only for backwards compatibility.

`RandomPartitioner`

The RandomPartitioner distributes data evenly across the nodes using an MD5 hash value of the row key. The possible range of hash values is from 0 to 2¹²⁷-1. Because it uses a cryptographic hash, which the database does not require, it takes longer to generate the hash value than the Murmur3Partitioner.
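As a rough illustration of how an MD5 digest can become a non-negative token, the Python sketch below reads the 128-bit digest as a signed big-endian integer and takes its absolute value. This mirrors what the partitioner is understood to do, but treat it as an assumption rather than a reference implementation. The function name `random_partitioner_token` is hypothetical, and the key is assumed to be the raw bytes of a single text column.

```python
import hashlib


def random_partitioner_token(key: bytes) -> int:
    """Map key bytes to a non-negative token derived from the MD5 digest."""
    digest = hashlib.md5(key).digest()  # 16 bytes = 128 bits
    # Interpret the digest as a signed integer, then drop the sign.
    return abs(int.from_bytes(digest, "big", signed=True))


for name in ("alice", "bob", "carol"):
    print(name, random_partitioner_token(name.encode("utf-8")))
```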

When using this partitioner, you can page through all rows using the TOKEN() function in a CQL query.

RandomPartitioner is a legacy partitioner.

`ByteOrderedPartitioner`

The ByteOrderedPartitioner orders rows lexically by key bytes. DataStax does not recommend this partitioner because it requires significant administrative overhead to load balance the cluster, sequential writes can cause hot spots, and balancing for one table can result in uneven distribution for another table in the same cluster.

ByteOrderedPartitioner is a legacy partitioner.
