Add nodes to vnode-enabled cluster
Steps to add nodes to a datacenter in an existing vnode-enabled cluster.
About this task
Virtual nodes (vnodes) greatly simplify adding nodes to an existing cluster:
- Calculating tokens and assigning them to each node is no longer required.
- Rebalancing the nodes within a datacenter is no longer necessary, because a node joining the datacenter assumes responsibility for an even portion of the data.
For a detailed explanation about how vnodes work, see Virtual nodes.
If you do not use vnodes, see Adding single-token nodes to a cluster. This method is preferred because it avoids the chance of having different configurations across nodes and updates the seed lists across the entire cluster.
When adding multiple nodes to the cluster using the allocation algorithm, add the nodes one at a time. If nodes are added concurrently, the algorithm can assign the same tokens to different nodes.
Where is the cassandra.yaml file?

The location of the cassandra.yaml file depends on the type of installation (package or tarball).
Where is the cassandra-rackdc.properties file?

The location of the cassandra-rackdc.properties file depends on the type of installation (package or tarball).
Where is the cassandra-topology.properties file?

The location of the cassandra-topology.properties file depends on the type of installation (package or tarball).
Procedure
Be sure to use the same version of Hyper-Converged Database (HCD) on all nodes in the cluster, as described in the installation instructions.
- Install HCD on the new nodes, but do not start HCD. If your HCD installation started automatically, you must stop HCD on the node and clear the data.
- Copy the snitch properties file from another node in the same datacenter to the node you are adding.
  - The cassandra-topology.properties file is used by the PropertyFileSnitch. Add an entry for the new node: `<IP_address>=<dc_name>:<rack_name>`
  - The cassandra-rackdc.properties file is used by the GossipingPropertyFileSnitch, Ec2Snitch, Ec2MultiRegionSnitch, and GoogleCloudSnitch. Adjust the rack number as required.
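For example, with hypothetical addresses, datacenter, and rack names, the corresponding entries look like this:

```properties
# cassandra-topology.properties (PropertyFileSnitch), hypothetical entry
# following the <IP_address>=<dc_name>:<rack_name> pattern:
10.0.0.5=DC1:RAC1

# cassandra-rackdc.properties (GossipingPropertyFileSnitch and cloud snitches),
# hypothetical values:
dc=DC1
rack=RAC1
```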
- Set the following properties in the cassandra.yaml file:
  - Dynamically allocate tokens based on the keyspace replication factors in the datacenter:

    ```yaml
    auto_bootstrap: true
    cluster_name: '<cluster_name>'
    listen_address:
    endpoint_snitch: <snitch_name>
    num_tokens: 8
    allocate_tokens_for_local_replication_factor: <RF_number>
    seed_provider:
      - class_name: <seedprovider_name>
        parameters:
          - seeds: "<IP_address_list>"
    ```
    For `<RF_number>`, if the application keyspaces in the datacenter have different replication factors (RF), use the RF of the most data-intensive application keyspace. When multiple application keyspaces have equal data intensity, use the highest RF. When adding multiple nodes, alternate between the different RFs.
  - Randomly assign tokens:

    ```yaml
    auto_bootstrap: true
    cluster_name: '<cluster_name>'
    listen_address:
    endpoint_snitch: <snitch_name>
    num_tokens: 128
    seed_provider:
      - class_name: <seedprovider_name>
        parameters:
          - seeds: "<IP_address_list>"
    ```
  Manually add the auto_bootstrap setting if it does not exist in cassandra.yaml. The other settings should exist in the default cassandra.yaml file; ensure that you uncomment and set them.

  Seed nodes cannot bootstrap. Make sure the new node is not listed in the `- seeds` list. Do not make all nodes seed nodes. See Internode communications (gossip).
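Because a node listed in `- seeds` will silently skip bootstrapping, a small pre-start sanity check can catch the mistake. The following sketch scans a cassandra.yaml seeds list for the new node's address; the YAML fragment and addresses are illustrative, not from a real cluster:

```python
# Sketch: check that the new node's address does not appear in the
# `- seeds` list of cassandra.yaml (seed nodes cannot bootstrap).
# The config fragment and addresses below are hypothetical.
import re

def node_in_seeds(yaml_text: str, node_address: str) -> bool:
    """Return True if node_address appears in the seeds list."""
    match = re.search(r'-\s*seeds:\s*"([^"]*)"', yaml_text)
    if not match:
        return False
    seeds = [s.strip() for s in match.group(1).split(",")]
    return node_address in seeds

config = '''
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.0.0.1,10.0.0.2"
'''
print(node_in_seeds(config, "10.0.0.5"))  # False: safe to bootstrap
```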
- Change any other non-default settings you have made to your existing cluster in the cassandra.yaml file and the cassandra-topology.properties or cassandra-rackdc.properties files. Use the Unix diff command to find and merge any differences between existing and new nodes.
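The following sketch shows the kind of difference diff surfaces; the two files here are illustrative stand-ins for an existing node's cassandra.yaml and the new node's copy:

```shell
# Create two illustrative cassandra.yaml fragments (stand-ins for the
# files on an existing node and on the new node), then compare them.
cat > existing-node.yaml <<'EOF'
num_tokens: 8
endpoint_snitch: GossipingPropertyFileSnitch
EOF
cat > new-node.yaml <<'EOF'
num_tokens: 16
endpoint_snitch: GossipingPropertyFileSnitch
EOF
# diff exits non-zero when the files differ; `|| true` keeps scripts running
diff -u existing-node.yaml new-node.yaml || true
```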
- Start the bootstrapping process for the new node.
- Verify that the node is fully bootstrapped using nodetool status. All nodes should be up and in the normal state (UN), and not in any other state.
After all new nodes are running, run
nodetool cleanup
on each of the previously existing nodes to remove the keys that no longer belong to those nodes. Wait for cleanup to complete on one node before runningnodetool cleanup
on the next node.Failure to run
nodetool cleanup
after adding a node may result in data inconsistencies including resurrection of previously deleted data.
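The sequential cleanup described above can be sketched as a simple loop. The host names are hypothetical, and the echo stands in for a real invocation such as `ssh "$host" nodetool cleanup`, which blocks until cleanup on that node finishes:

```shell
# Sketch: run cleanup on each previously existing node, one node at a time.
# Host names are hypothetical; in practice, replace the echo with:
#   ssh "$host" nodetool cleanup
for host in node1 node2 node3; do
  echo "cleanup finished on $host"  # stand-in for the blocking cleanup call
done
```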