Add single-token nodes to a cluster
Steps for adding nodes to clusters that use a single-token architecture rather than virtual nodes (vnodes).
To add capacity to a cluster, introduce new nodes in stages or by adding an entire datacenter. Use one of the following methods:
-
Add capacity by doubling the cluster size: Doubling (or tripling or quadrupling) the number of nodes makes token assignment simpler. With this method, existing nodes keep their existing token assignments, and the new nodes are assigned tokens that bisect (or trisect) the existing token ranges.
-
Add capacity for a non-uniform number of nodes: When increasing capacity with this method, you must recalculate tokens for the entire cluster, and assign the new tokens to the existing nodes.
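The difference between the two methods can be illustrated with a short sketch. It assumes the cluster uses Murmur3Partitioner, whose token range runs from -2^63 to 2^63-1; the function name even_tokens and the node counts are hypothetical, and the Token Generating Tool remains the source for the values you actually assign.

```python
def even_tokens(node_count: int) -> list:
    """Return evenly spaced tokens for node_count nodes.

    Assumption: Murmur3Partitioner (token range -2**63 .. 2**63 - 1).
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [(2**64 * i) // node_count - 2**63 for i in range(node_count)]


current = even_tokens(3)  # hypothetical existing 3-node cluster
doubled = even_tokens(6)  # target layout after doubling
uneven = even_tokens(5)   # target layout after adding only 2 nodes

# Doubling: every existing token is still part of the target layout,
# so the new nodes simply take the tokens in between.
kept_when_doubling = [t for t in current if t in doubled]
new_node_tokens = [t for t in doubled if t not in current]

# Non-uniform growth: existing tokens no longer line up with the target
# layout, so existing nodes need new tokens as well.
kept_when_uneven = [t for t in current if t in uneven]

print("kept when doubling:", len(kept_when_doubling), "of", len(current))
print("tokens for new nodes:", new_node_tokens)
print("kept when growing 3 -> 5:", len(kept_when_uneven), "of", len(current))
```

In the doubled layout all three existing tokens survive, while in the 3-to-5 layout only the first one does, which is why that method requires recalculating tokens for the entire cluster.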
Only add new nodes to the cluster: A new node is a system that DSE has never started. The node must have absolutely no previous data. Data loss or corruption can occur if you add a node that was previously used for testing or moved from another cluster, because the older data is merged into the existing cluster data.
-
Calculate the tokens for the nodes based on your expansion strategy using the Token Generating Tool.
-
Install and configure DataStax Enterprise (DSE) on each new node.
-
If DSE starts automatically, stop the node and clear the data.
-
Configure cassandra.yaml on each new node:
-
auto_bootstrap: If false, set it to true. This option is not explicitly set in the default cassandra.yaml configuration file, and it defaults to true.
-
listen_address/broadcast_address: Leave blank or use the IP address or host name that other nodes use to connect to the new node.
-
initial_token: Set according to your token calculations. If this property has no value, the database assigns the node a random token range, which results in a badly unbalanced ring.
-
seed_provider: Make sure that the new node lists at least one seed node in the existing cluster. Seed nodes cannot bootstrap. Make sure the new nodes are not listed in the - seeds list. Do not make all nodes seed nodes. See Internode communications (gossip).
-
Change any other non-default settings in the new nodes to match the existing nodes. Use the diff command to find and merge any differences between the nodes.
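The cassandra.yaml settings above lend themselves to a quick pre-flight check before the node is started. The following is a minimal sketch, not a DSE tool: the function name check_new_node_config, the addresses, and the token are hypothetical, and it assumes the file has already been parsed into a Python dict with the usual seed_provider layout (a list of providers, each with a parameters list holding a comma-separated seeds string).

```python
def check_new_node_config(config: dict, node_address: str) -> list:
    """Return a list of problems found in a new node's cassandra.yaml settings.

    config is the already-parsed cassandra.yaml (parsing is out of scope here).
    """
    problems = []

    # auto_bootstrap defaults to true when absent; only an explicit false is wrong.
    if config.get("auto_bootstrap", True) is False:
        problems.append("auto_bootstrap is false; set it to true")

    # Without initial_token the node is assigned a random token range.
    if config.get("initial_token") in (None, ""):
        problems.append("initial_token is not set")

    # Collect every address from the comma-separated seeds strings.
    seeds = []
    for provider in config.get("seed_provider") or []:
        for params in provider.get("parameters") or []:
            raw = str(params.get("seeds", ""))
            seeds += [s.strip() for s in raw.split(",") if s.strip()]
    if not seeds:
        problems.append("seed_provider lists no seed nodes")
    elif node_address in seeds:
        problems.append("new node is listed as a seed; seed nodes cannot bootstrap")

    return problems


# Hypothetical settings for a new node at 10.0.0.7.
example = {
    "initial_token": -6148914691236517206,
    "listen_address": "10.0.0.7",
    "seed_provider": [{
        "class_name": "org.apache.cassandra.locator.SimpleSeedProvider",
        "parameters": [{"seeds": "10.0.0.1,10.0.0.2"}],
    }],
}
print(check_new_node_config(example, "10.0.0.7"))
```

An empty list means none of the three conditions was violated; it does not replace comparing the remaining non-default settings against an existing node.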
-
Depending on the snitch, assign the datacenter and rack names in the cassandra-topology.properties or cassandra-rackdc.properties file for each node.
-
Start DSE on each new node at two-minute intervals with consistent.rangemovement turned off:
-
Package installations: On each bootstrapped node, add the following option to the jvm-server.options file, and then start DSE:
-Dcassandra.consistent.rangemovement=false
-
Tarball installations: Start DSE with the following command:
bin/cassandra -Dcassandra.consistent.rangemovement=false
-
Run the following operations during a low-usage time because they are resource intensive:
-
After the new nodes are fully bootstrapped, use nodetool move to assign the new initial_token value to each node that requires one, one node at a time.
-
After all nodes have their new tokens assigned, run nodetool cleanup on each node in the cluster, one node at a time, waiting for cleanup to complete on each node before starting the next. This step removes the keys that no longer belong to the previously existing nodes.
Failure to run nodetool cleanup after adding a node can result in data inconsistencies, including resurrection of previously deleted data.
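The ordering of the final two operations (all moves first, then cleanup on every node, each command finishing before the next begins) can be sketched as a small script. This is an illustration under assumptions: the function names build_plan and run_plan, the host addresses, and the token are hypothetical, and it assumes nodetool is on the PATH and can reach each node through its -h option.

```python
import subprocess


def build_plan(moves: dict, all_hosts: list) -> list:
    """Build nodetool commands in the order the steps describe.

    moves maps host -> new token for each node that needs one.
    all_hosts lists every node in the cluster (cleanup runs on all of them).
    """
    plan = []
    # First: nodetool move, one node at a time.
    for host, token in moves.items():
        # "--" keeps a negative token from being parsed as an option.
        plan.append(["nodetool", "-h", host, "move", "--", str(token)])
    # Then: nodetool cleanup on every node, only after all moves are planned.
    for host in all_hosts:
        plan.append(["nodetool", "-h", host, "cleanup"])
    return plan


def run_plan(plan: list) -> None:
    """Run each command to completion before starting the next one."""
    for command in plan:
        # subprocess.run blocks until nodetool exits; check=True stops
        # the sequence if a command fails.
        subprocess.run(command, check=True)


# Hypothetical addresses and token, for illustration only.
plan = build_plan(
    moves={"10.0.0.2": -3074457345618258603},
    all_hosts=["10.0.0.1", "10.0.0.2", "10.0.0.3"],
)
for command in plan:
    print(" ".join(command))
```

The sketch only prints the commands; calling run_plan(plan) would execute them sequentially, which is best done during a low-usage period as noted above.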