Adding nodes to an existing cluster
Steps to add nodes when using virtual nodes.
Virtual nodes (vnodes) greatly simplify adding nodes to an existing cluster:
- Calculating tokens and assigning them to each node is no longer required.
- Rebalancing a cluster is no longer necessary because a node joining the cluster assumes responsibility for an even portion of the data.
For a detailed explanation about how vnodes work, see Virtual nodes.
Note: If you do not use vnodes, see Adding single-token nodes to a cluster.
Procedure
Be sure to use the same version of Cassandra on all nodes in the cluster.
- Install Cassandra on the new nodes, but do not start Cassandra.
- Depending on the snitch used in the cluster, set the properties in either the cassandra-topology.properties or the cassandra-rackdc.properties file:
- The PropertyFileSnitch uses the cassandra-topology.properties file.
- The GossipingPropertyFileSnitch, Ec2Snitch, Ec2MultiRegionSnitch, and GoogleCloudSnitch use the cassandra-rackdc.properties file.
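The snitch-to-file mapping above can be written as a small lookup, which may be handy in provisioning scripts. This is only a sketch: the function name properties_file_for is hypothetical, and the table contains just the snitches named in this step.

```python
# Which properties file each snitch reads, as listed in the step above.
SNITCH_PROPERTIES_FILE = {
    "PropertyFileSnitch": "cassandra-topology.properties",
    "GossipingPropertyFileSnitch": "cassandra-rackdc.properties",
    "Ec2Snitch": "cassandra-rackdc.properties",
    "Ec2MultiRegionSnitch": "cassandra-rackdc.properties",
    "GoogleCloudSnitch": "cassandra-rackdc.properties",
}


def properties_file_for(snitch: str) -> str:
    """Return the name of the properties file to edit for the given snitch."""
    try:
        return SNITCH_PROPERTIES_FILE[snitch]
    except KeyError:
        # Snitches not named in this procedure are deliberately not guessed at.
        raise ValueError(f"snitch not covered by this procedure: {snitch}") from None
```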
The location of the cassandra-topology.properties file depends on the type of installation:
Package installations: /etc/cassandra/cassandra-topology.properties
Tarball installations: install_location/conf/cassandra-topology.properties
The location of the cassandra-rackdc.properties file depends on the type of installation:
Package installations: /etc/cassandra/cassandra-rackdc.properties
Tarball installations: install_location/conf/cassandra-rackdc.properties
- Set the following properties in the cassandra.yaml file:
- auto_bootstrap
- If this option has been set to false, you must set it to true. This option is not listed in the default cassandra.yaml configuration file and defaults to true.
- cluster_name
- The name of the cluster the new node is joining.
- listen_address/broadcast_address
- Can usually be left blank. Otherwise, use the IP address or host name that other Cassandra nodes use to connect to the new node.
- endpoint_snitch
- The snitch Cassandra uses for locating nodes and routing requests.
- num_tokens
- The number of vnodes to assign to the node. If the hardware capabilities vary among the nodes in your cluster, you can assign a proportional number of vnodes to the larger machines.
- seed_provider
- Make sure that the new node lists at least one node in the existing cluster. The -seeds list determines which nodes the new node should contact to learn about the cluster and establish the gossip process.
Note: Seed nodes cannot bootstrap. Make sure the new node is not listed in the -seeds list. Do not make all nodes seed nodes. Please read Internode communications (gossip).
- Other non-default settings
- Apply any other non-default settings used in your existing cluster to the new node's cassandra.yaml file and cassandra-topology.properties or cassandra-rackdc.properties files. Use the diff command to find and merge any differences between existing and new nodes.
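Some of the settings above can be sanity-checked before the node is started. The following is a minimal sketch, not an official tool: it assumes the relevant values have already been read from cassandra.yaml into a plain dictionary, with the -seeds value flattened to a comma-separated string under a hypothetical "seeds" key. The function name check_new_node_settings is also hypothetical.

```python
from typing import Any, Dict, List


def check_new_node_settings(
    settings: Dict[str, Any], cluster_name: str, node_address: str
) -> List[str]:
    """Return a list of problems found in a new node's settings (empty if none)."""
    problems = []
    # auto_bootstrap is absent from the default file and defaults to true.
    if settings.get("auto_bootstrap", True) is not True:
        problems.append("auto_bootstrap must be true")
    if settings.get("cluster_name") != cluster_name:
        problems.append("cluster_name does not match the existing cluster")
    # Assumption: seeds is a comma-separated string of addresses.
    seeds = [s.strip() for s in str(settings.get("seeds", "")).split(",") if s.strip()]
    if not seeds:
        problems.append("seeds must list at least one existing node")
    # Seed nodes cannot bootstrap, so the new node must not list itself.
    if node_address in seeds:
        problems.append("the new node must not be in its own seeds list")
    return problems
```

An empty result only means these particular checks passed; it does not replace comparing the full configuration files with diff.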
The location of the cassandra.yaml file depends on the type of installation:
Package installations: /etc/cassandra/cassandra.yaml
Tarball installations: install_location/resources/cassandra/conf/cassandra.yaml
- Start the bootstrap node.
- Use nodetool status to verify that the node is fully bootstrapped and all other nodes are up (UN) and not in any other state.
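The status check can also be scripted. The sketch below assumes the usual nodetool status layout, in which each node row begins with a two-character code (U or D for up or down, then N, L, J, or M for normal, leaving, joining, or moving) followed by the node address; the exact columns can differ between versions. The function name nodes_not_up_normal is hypothetical.

```python
import re

# A node row starts with a status/state code such as UN, UJ, or DN,
# followed by whitespace and the node address. Header lines do not match.
NODE_LINE = re.compile(r"^([UD?][NLJM])\s+(\S+)", re.MULTILINE)


def nodes_not_up_normal(status_output: str):
    """Return (address, code) pairs for every node whose code is not UN."""
    return [
        (address, code)
        for code, address in NODE_LINE.findall(status_output)
        if code != "UN"
    ]
```

The text passed in would be the captured output of nodetool status. An empty list suggests that every listed node is up and normal.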
- After all new nodes are running, run nodetool cleanup on each of the previously existing nodes to remove the keys that no longer belong to those nodes. Wait for cleanup to complete on one node before running nodetool cleanup on the next node.
Cleanup can be safely postponed for low-usage hours.
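The one-node-at-a-time cleanup can be sketched as a simple loop. This is an illustration under assumptions, not a recommended tool: it assumes nodetool is on the PATH of the machine running the script and that each node accepts remote nodetool connections through the -h option; otherwise the run argument can be replaced with a function that invokes cleanup some other way, such as over SSH. The names nodetool_cleanup and cleanup_sequentially are hypothetical.

```python
import subprocess
from typing import Callable, Iterable, List


def nodetool_cleanup(host: str) -> None:
    """Run nodetool cleanup against one host and block until it finishes."""
    # check=True raises CalledProcessError if cleanup fails, stopping the loop.
    subprocess.run(["nodetool", "-h", host, "cleanup"], check=True)


def cleanup_sequentially(
    hosts: Iterable[str], run: Callable[[str], None] = nodetool_cleanup
) -> List[str]:
    """Clean up the previously existing nodes one at a time, in order."""
    done = []
    for host in hosts:
        run(host)  # returns only after cleanup on this host has completed
        done.append(host)
    return done
```

Pass only the previously existing nodes, for example cleanup_sequentially(["10.0.0.1", "10.0.0.2"]), where the addresses are placeholders.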