When you start a DataStax Enterprise cluster without vnodes, you must ensure that the data
is evenly divided across the nodes in the cluster using token assignments and that no two
nodes share the same token, even if they are in different datacenters. Tokens are hash values
that partitioners use to determine where to store rows. A node's token determines its
position in the ring and what data the node is responsible for: each node owns the region of
the ring between its predecessor's token (exclusive) and its own token (inclusive).
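The ownership rule above can be sketched in Python (an illustrative sketch; the function name is hypothetical and not part of DataStax Enterprise):

```python
import bisect

def owning_token(token, ring_tokens):
    """Return the token of the node that owns `token`.

    Each node owns the range from its predecessor's token (exclusive)
    up to its own token (inclusive); the ring wraps around.
    `ring_tokens` must be sorted.
    """
    i = bisect.bisect_left(ring_tokens, token)
    return ring_tokens[i % len(ring_tokens)]

ring = [0, 25, 50, 75]
owning_token(60, ring)  # the node at token 75 owns (50, 75]
owning_token(90, ring)  # wraps around to the node at token 0
```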
As a simple example, if the range of possible tokens is 0 to 100 and there are four nodes,
the tokens for the nodes are: 0, 25, 50, 75. This division ensures that each node is
responsible for an equal range of data. For more information, see Data distribution and replication.
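The division in the simple example above can be computed as follows (a hypothetical helper for illustration, not part of any DataStax tool):

```python
def evenly_spaced_tokens(num_nodes, ring_size):
    # Divide the token range into equal slices,
    # yielding one starting token per node.
    return [i * ring_size // num_nodes for i in range(num_nodes)]

evenly_spaced_tokens(4, 100)  # [0, 25, 50, 75]
```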
Before starting each node in the cluster for the first time, comment out the num_tokens
property and assign a value to the initial_token property in the
cassandra.yaml configuration file.
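For example, a node's cassandra.yaml might contain the following (the token value is illustrative only; use a value calculated for your cluster):

```yaml
# num_tokens: 256          # commented out to disable vnodes
initial_token: 6148914691236517205
```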
Assign the tokens to nodes on alternating racks in the cassandra-rackdc.properties or
the cassandra-topology.properties file. (Figure: Alternating rack assignments)
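For example, one node's cassandra-rackdc.properties might contain the following (the datacenter and rack names are illustrative; the next node in the same datacenter would use rack=RAC2, alternating from there):

```
dc=DC1
rack=RAC1
```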
Calculating tokens for a multiple datacenter cluster
Note: Do not use SimpleStrategy for this type of cluster. You must use the
NetworkTopologyStrategy. This strategy determines replica placement independently within
each datacenter.
After calculating the tokens, assign them so that the nodes in each datacenter are
evenly dispersed around the ring. (Figure: Token position and datacenter assignments)
Alternate the rack assignments as described above.
Calculating tokens when adding or replacing nodes/datacenters
To avoid token collisions, use the --ringoffset option. Calculate the tokens with the offset:

token-generator 3 2 --ringoffset 100
The results show the generated token values for the Murmur3Partitioner for one datacenter
with three nodes and one datacenter with two nodes, using the offset:
DC #1:
Node #1: 6148914691236517105
Node #2: 12297829382473034310
Node #3: 18446744073709551516
DC #2:
Node #1: 9144875253562394637
Node #2: 18368247290417170445
The offset applies to the first node; the tokens for all other nodes are calculated from
the offset to maintain even distribution.
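The idea can be sketched in Python (a simplified illustration using an unsigned 2**64 token span; the actual token-generator tool derives the second datacenter's offset internally, so this sketch will not reproduce the values above exactly):

```python
RING = 2**64  # size of the Murmur3 token span (illustrative)

def dc_tokens(num_nodes, offset):
    # Space tokens evenly around the ring, shifted by a per-datacenter
    # offset so that nodes in different datacenters never share a token.
    return [(i * RING // num_nodes + offset) % RING for i in range(num_nodes)]

dc1 = dc_tokens(3, 0)
dc2 = dc_tokens(2, 100)
# The two token sets are disjoint, so no collisions occur.
```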
After calculating the tokens, assign the tokens so that the nodes in each datacenter
are evenly dispersed around the ring and alternate the rack assignments.
The location of the cassandra.yaml file depends on the type of installation: