Generating Tokens¶
Tokens assign a range of data to a particular node within a data center.
When you start a Cassandra cluster, data is distributed across the nodes in the cluster based on the row key using a partitioner. You must assign each node in a cluster a token, and that token determines the node's position in the ring and its range of data. The tokens assigned to your nodes need to be distributed throughout the entire possible range of tokens (0 to 2^127 - 1). Each node is responsible for the region of the ring between itself (inclusive) and its predecessor (exclusive). To illustrate using a simple example, if the range of possible tokens were 0 to 100 and you had four nodes, the tokens for your nodes should be 0, 25, 50, and 75. This approach ensures that each node is responsible for an equal range of data. When using more than one data center, each data center should be partitioned as if it were its own distinct ring.
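For example, evenly spaced tokens for a four-node cluster can be calculated as in the following sketch (plain Python, for illustration only; the node count of 4 is just an example):

RING_RANGE = 2 ** 127   # RandomPartitioner token space: 0 to 2**127 - 1

def evenly_spaced_tokens(node_count):
    # One token per node, spaced equally around the ring.
    return [node * RING_RANGE // node_count for node in range(node_count)]

for i, token in enumerate(evenly_spaced_tokens(4), start=1):
    print("Node #%d: %d" % (i, token))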
Note
Each node in the cluster must be assigned a token before it is started for the first time. The token is set with the initial_token property in the cassandra.yaml configuration file.
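For example, the first node in the four-node ring shown later in this document would have the following line in its cassandra.yaml before its first start:

initial_token: 0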
Token Generating Tool¶
Cassandra includes a tool for generating tokens using the maximum possible range (0 to 2^127 - 1) for use with the RandomPartitioner.
Usage¶
- Packaged installs:
token-generator <nodes_in_DC1> <nodes_in_DC2> ...
- Binary installs:
<install_location>/tools/bin/token-generator <nodes_in_DC1> <nodes_in_DC2> ...
- Interactive Mode: Use
token-generator
without options, and the prompts will guide you through the process.
The available options are:
| Long Option | Short Option | Description |
|---|---|---|
| --help | -h | Show help. |
| --ringrange <RINGRANGE> | | Specify a numeric maximum token value for your ring, if different from the default value of 2^127 - 1. |
| --graph | | Displays a rendering of the generated tokens as line segments in a circle, colored according to data center. |
| --nts | -n | Optimizes multiple cluster distribution for NetworkTopologyStrategy (default). |
| --onts | -o | Optimizes multiple cluster distribution for the OldNetworkTopologyStrategy. |
| --test | | Run in test mode. Opens Firefox and displays an HTML file that shows various ring arrangements. |
Examples¶
Generate tokens for nodes in a single data center:
./tools/bin/token-generator 4

Node #1: 0
Node #2: 42535295865117307932921825928971026432
Node #3: 85070591730234615865843651857942052864
Node #4: 127605887595351923798765477786913079296
Generate tokens for multiple data centers using NetworkTopologyStrategy (default):
./tools/bin/token-generator 4 4

DC #1:
  Node #1: 0
  Node #2: 42535295865117307932921825928971026432
  Node #3: 85070591730234615865843651857942052864
  Node #4: 127605887595351923798765477786913079296
DC #2:
  Node #1: 169417178424467235000914166253263322299
  Node #2: 41811290829115311202148688466350243003
  Node #3: 84346586694232619135070514395321269435
  Node #4: 126881882559349927067992340324292295867
Replica placement is independent within each data center.
Generate tokens for multiple racks in a single data center:
./tools/bin/token-generator 8

DC #1:
  Node #1: 0
  Node #2: 21267647932558653966460912964485513216
  Node #3: 42535295865117307932921825928971026432
  Node #4: 63802943797675961899382738893456539648
  Node #5: 85070591730234615865843651857942052864
  Node #6: 106338239662793269832304564822427566080
  Node #7: 127605887595351923798765477786913079296
  Node #8: 148873535527910577765226390751398592512
As a best practice, each rack should have the same number of nodes. This allows you to alternate the rack assignments: rack1, rack2, rack3, rack1, rack2, rack3, and so on.
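As a rough illustration only (the rack names and the six-node count below are hypothetical, not output from token-generator), alternating rack assignments in token order might look like this:

RING_RANGE = 2 ** 127
node_count = 6                                   # e.g. two nodes per rack
racks = ["rack1", "rack2", "rack3"]              # hypothetical rack names
tokens = [i * RING_RANGE // node_count for i in range(node_count)]
for i, token in enumerate(tokens):
    # Cycle through the racks so consecutive token ranges land on different racks.
    print("%s  token %d" % (racks[i % len(racks)], token))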
Token Assignments when Adding Nodes¶
When adding nodes to a cluster, you must avoid token collisions. You can do this by offsetting the token values, which allows room for the new nodes.
The following graphic shows an example using an offset of +100:
[Graphic: example token assignments using an offset of +100]
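As a rough sketch (plain Python, illustrative only; the +100 offset and the existing tokens are taken from the examples above, and this is not how token-generator itself works), one way to apply such an offset is to bump any desired token that is already in use:

# Illustrative only: nudge any desired token that is already assigned by a
# small offset (+100 here) so that no two nodes share a token.
RING_RANGE = 2 ** 127
OFFSET = 100

in_use = {0,
          42535295865117307932921825928971026432,
          85070591730234615865843651857942052864,
          127605887595351923798765477786913079296}   # existing nodes' tokens

desired = sorted(in_use)        # e.g. new nodes reusing the same layout
new_tokens = []
for token in desired:
    while token in in_use:
        token = (token + OFFSET) % RING_RANGE
    in_use.add(token)
    new_tokens.append(token)

for i, token in enumerate(new_tokens, start=1):
    print("New node #%d: %d" % (i, token))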
Note
It is more important for the nodes within each data center to manage an equal amount of data than it is for the nodes to be evenly distributed around the entire cluster. See balancing the load.