Calculating tokens for single-token architecture nodes

Where is the cassandra-rackdc.properties file?

The location of the cassandra-rackdc.properties depends on the type of installation:

Installation Type Location

Package installations + Installer-Services installations

/etc/dse/cassandra/cassandra-rackdc.properties

Tarball installations + Installer-No Services

<installation_location>/resources/cassandra/conf/cassandra-rackdc.properties

Where is the cassandra.yaml file?

The location of the cassandra.yaml file depends on the type of installation:

Installation Type Location

Package installations + Installer-Services installations

/etc/dse/cassandra/cassandra.yaml

Tarball installations + Installer-No Services installations

<installation_location>/resources/cassandra/conf/cassandra.yaml

Where is the cassandra-topology.properties file?

The location of the cassandra-topology.properties file depends on the type of installation:

Installation Type Location

Package installations + Installer-Services installations

/etc/dse/cassandra/cassandra-topology.properties

Tarball installations + Installer-No Services installations

<installation_location>/resources/cassandra/conf/cassandra-topology.properties

This topic contains information on manually calculating tokens.

DataStax recommends using Life Cycle Manager in DSE OpsCenter instead.

About single-token architecture

Use single-token architecture when not using virtual nodes (vnodes). See Guidelines for using virtual nodes. You do not need to calculate tokens when using vnodes.

When you start a DataStax Enterprise (DSE) cluster without vnodes, you must ensure that the data is evenly divided across the nodes in the cluster using token assignments and that no two nodes share the same token even if they are in different datacenters. Tokens are hash values that partitioners use to determine where to store rows on each node. This value determines the node’s position in the ring and what data the node is responsible for. Each node is responsible for the region of the cluster between itself (inclusive) and its predecessor (exclusive).

As a simple example, if the range of possible tokens is 0 to 100 and there are four nodes, the tokens for the nodes are: 0, 25, 50, 75. This division ensures that each node is responsible for an equal range of data. For more information, see Data distribution overview.

Before starting each node in the cluster for the first time, comment out the num_token property and assign an initial_token value in the cassandra.yaml configuration file.

Using the Token generator

Use token-generator tool for:

Usage:

  • Package and Installer-Services installations:

    token-generator num_of_nodes_in_dc ... [options]
  • Tarball and Installer-No Services installations:

    <installation_location>/resources/cassandra/tools/bin/token-generator num_of_nodes_in_dc ... [options]

    If no options are entered, Token Generator Interactive Mode is invoked.

Options
Options Description

Help

  • -h

  • --help

Show help.

Partitioner

  • --murmur3

  • --random

Specify the partitioner:

  • Murmur3Partitioner uses a maximum possible range of hash values from -263 to +263-1. Default partitioner if not specified.

  • Random partitioner uses a range from 0 to 2127-1. Default partitioner before DataStax Enterprise 3.1/Apache Cassandra™ 1.2.

Offset token values

  • --ringoffset offset

Use when adding or replacing dead nodes or datacenters.

Ring range

  • --ringrange range_start range_end

Specify token values within a specified range.

Test

  • --test

Displays various ring arrangements and generates an HTML file showing these arrangements.

Calculating tokens for a single datacenter with one rack

Example:

token-generator 6

DC #1:
  Node #1:  -9223372036854775808
  Node #2:  -6148914691236517206
  Node #3:  -3074457345618258604
  Node #4:                    -2
  Node #5:   3074457345618258600
  Node #6:   6148914691236517202

Calculating tokens for a single datacenter with multiple racks

DataStax recommends that each rack have the same number of nodes so you can alternate the rack assignments.

  1. Calculate the tokens:

    token-generator 8
    
    DC #1:
      Node #1:  -9223372036854775808
      Node #2:  -6917529027641081856
      Node #3:  -4611686018427387904
      Node #4:  -2305843009213693952
      Node #5:                     0
      Node #6:   2305843009213693952
      Node #7:   4611686018427387904
      Node #8:   6917529027641081856
  2. Assign the tokens to nodes on alternating racks in the cassandra-rackdc.properties or the cassandra-topology.properties file.

    Multiple Rack Tokens
    Alternating rack assignments

Calculating tokens for a multiple datacenter cluster

Do not use SimpleStrategy for this type of cluster. You must use the NetworkTopologyStrategy. This strategy determines replica placement independently within each datacenter.

Example:

  1. Calculate the tokens:

    token-generator 4 4
    
    DC #1:
      Node #1:  -9223372036854775808
      Node #2:  -4611686018427387904
      Node #3:                     0
      Node #4:   4611686018427387904
    DC #2:
      Node #1:  -4690182801719768975
      Node #2:    -78496783292381071
      Node #3:   4533189235135006833
      Node #4:   9144875253562394737
  2. After calculating the tokens, assign the tokens so that the nodes in each datacenter are evenly dispersed around the ring.

    Alternate tokens for multiple datacenters
    Token position and datacenter assignments
    Token Datacenter assignment
    legendmultipleDatacentersAlternateToken1

    Datacenter 1

    legendmultipleDatacentersAlternateToken2

    Datacenter 2

    Token position

    Token position

  3. Alternate the rack assignments as described above.

Calculating tokens when adding or replacing nodes/datacenters

To avoid token collisions, use the --ringoffset option.

  1. Calculate the tokens with the offset:

    token-generator 3 2 --ringoffset 100

    The results show the generated token values for the Murmur3Partitioner for one datacenter with 3 nodes and one datacenter with 2 nodes with an offset:

    DC #1:
      Node #1:   6148914691236517105
      Node #2:  12297829382473034310
      Node #3:  18446744073709551516
    DC #2:
      Node #1:   9144875253562394637
      Node #2:  18368247290417170445

    The value of the offset is for the first node and all other nodes are calculated for even distribution from the offset.

    The tokens without the offset are:

    token-generator 3 2
    
    DC #1:
      Node #1:  -9223372036854775808
      Node #2:  -3074457345618258603
      Node #3:   3074457345618258602
    DC #2:
      Node #1:    -78496783292381071
      Node #2:   9144875253562394737
  2. After calculating the tokens, assign the tokens so that the nodes in each datacenter are evenly dispersed around the ring and alternate the rack assignments.

Was this helpful?

Give Feedback

How can we improve the documentation?

© 2024 DataStax | Privacy policy | Terms of use

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries. Kubernetes is the registered trademark of the Linux Foundation.

General Inquiries: +1 (650) 389-6000, info@datastax.com