Calculating tokens for single-token architecture nodes

When not using vnodes, use these steps to calculate tokens to evenly distribute data across a cluster.

This topic contains information on manually calculating tokens. DataStax recommends using Life Cycle Manager in DataStax Enterprise OpsCenter instead.

About single-token architecture 

Use single-token architecture when not using virtual nodes (vnodes). See Guidelines for using virtual nodes. You do not need to calculate tokens when using vnodes.

When you start a DataStax Enterprise cluster without vnodes, you must ensure that the data is evenly divided across the nodes in the cluster using token assignments and that no two nodes share the same token even if they are in different datacenters. Tokens are hash values that partitioners use to determine where to store rows on each node. This value determines the node's position in the ring and what data the node is responsible for. Each node is responsible for the region of the cluster between itself (inclusive) and its predecessor (exclusive).

As a simple example, if the range of possible tokens is 0 to 100 and there are four nodes, the tokens for the nodes are: 0, 25, 50, 75. This division ensures that each node is responsible for an equal range of data. For more information, see Data distribution and replication.

Before starting each node in the cluster for the first time, comment out the num_token property and assign an initial-token value in the cassandra.yaml configuration file.

Using the Token generator 

Usage:
  • Installer-Services and Package installations:
    token-generator num_of_nodes_in_dc ... [options]
  • Installer-No Services and Tarball installations:
    installation_location/token-generator num_of_nodes_in_dc ... [options]
Options Description
Partitioner
  • --murmur3
  • --random
Specify the partitioner:
  • Murmur3Partitioner uses a maximum possible range of hash values from -263 to +263-1. Default partitioner if not specified.
  • Random partitioner uses a range from 0 to 2127-1. Default partitioner before DataStax Enterprise 3.1/Cassandra 1.2.
Offset token values
  • --ringoffset offset
Use when adding or replacing dead nodes or datacenters.
Ring range
  • --ringrange range_start range_end
Specify token values within a specified range.

Calculating tokens for a single datacenter with one rack 

Example:
token-generator 6

DC #1:
  Node #1:  -9223372036854775808
  Node #2:  -6148914691236517206
  Node #3:  -3074457345618258604
  Node #4:                    -2
  Node #5:   3074457345618258600
  Node #6:   6148914691236517202

Calculating tokens for a single datacenter with multiple racks 

DataStax recommends that each rack have the same number of nodes so you can alternate the rack assignments.

  1. Calculate the tokens:
    token-generator 8
    
    DC #1:
      Node #1:  -9223372036854775808
      Node #2:  -6917529027641081856
      Node #3:  -4611686018427387904
      Node #4:  -2305843009213693952
      Node #5:                     0
      Node #6:   2305843009213693952
      Node #7:   4611686018427387904
      Node #8:   6917529027641081856
    
  2. Assign the tokens to nodes on alternating racks in the cassandra-rackdc.properties or the cassandra-topologies.properties file.
    Alternating rack assignments

Calculating tokens for a multiple datacenter cluster 

Note: Do not use SimpleStrategy for this type of cluster. You must use the NetworkTopologyStrategy. This strategy determines replica placement independently within each datacenter.
Example:
  1. Calculate the tokens:
    token-generator 4 4
    
    DC #1:
      Node #1:  -9223372036854775808
      Node #2:  -4611686018427387904
      Node #3:                     0
      Node #4:   4611686018427387904
    DC #2:
      Node #1:  -4690182801719768975
      Node #2:    -78496783292381071
      Node #3:   4533189235135006833
      Node #4:   9144875253562394737
  2. After calculating the tokens, assign the tokens so that the nodes in each datacenter are evenly dispersed around the ring.
    Token position and datacenter assignments
  3. Alternate the rack assignments as described above.

Calculating tokens when adding or replacing nodes/datacenters 

To avoid token collisions, use the --ringoffset option.
  1. Calculate the tokens with the offset:
    token-generator 3 2 --ringoffset 100
    The results show the generated token values for the Murmur3Partitioner for one datacenter with 3 nodes and one datacenter with 2 nodes with an offset:
    DC #1:
      Node #1:   6148914691236517105
      Node #2:  12297829382473034310
      Node #3:  18446744073709551516
    DC #2:
      Node #1:   9144875253562394637
      Node #2:  18368247290417170445
    The value of the offset is for the first node and all other nodes are calculated for even distribution from the offset.
    The tokens without the offset are:
    token-generator 3 2
    
    DC #1:
      Node #1:  -9223372036854775808
      Node #2:  -3074457345618258603
      Node #3:   3074457345618258602
    DC #2:
      Node #1:    -78496783292381071
      Node #2:   9144875253562394737
  2. After calculating the tokens, assign the tokens so that the nodes in each datacenter are evenly dispersed around the ring and alternate the rack assignments.
The location of the cassandra.yaml file depends on the type of installation:
Installer-Services /etc/dse/cassandra/cassandra.yaml
Package installations /etc/dse/cassandra/cassandra.yaml
Installer-No Services install_location/resources/cassandra/conf/cassandra.yaml
Tarball installations install_location/resources/cassandra/conf/cassandra.yaml
The location of the token-generator tool depends on the type of installation:
Installer-Services /usr/bin/token-generator
Package installations /usr/bin/token-generator
Installer-No Services install_location/resources/cassandra/tools/bin/token-generator
Tarball installations install_location/resources/cassandra/tools/bin/token-generator