Adding nodes to an existing cluster
Steps to add nodes when using virtual nodes.
Virtual nodes (vnodes) greatly simplify adding nodes to an existing cluster:
- Calculating tokens and assigning them to each node is no longer required.
- Rebalancing a cluster is no longer necessary because a node joining the cluster assumes responsibility for an even portion of the data.
For a detailed explanation about how vnodes work, see Virtual nodes.
Procedure
- Install Cassandra on the new nodes, but do not start Cassandra.
- Set the following properties in the cassandra.yaml file and, depending on the snitch, in the cassandra-topology.properties or cassandra-rackdc.properties configuration file:
- auto_bootstrap - This property is not listed in the default cassandra.yaml configuration file, but it might have been added and set to false by other operations. If it is not defined in cassandra.yaml, Cassandra uses true as the default value. For this operation, search for this property in the cassandra.yaml file. If it is present, set it to true or delete it.
- cluster_name - The name of the cluster the new node is joining.
- listen_address/broadcast_address - Can usually be left blank. Otherwise, use the IP address or host name that other Cassandra nodes use to connect to the new node.
- endpoint_snitch - The snitch Cassandra uses for locating nodes and routing requests.
- num_tokens - The number of vnodes to assign to the node. If the hardware capabilities vary among the nodes in your cluster, you can assign a proportional number of vnodes to the larger machines.
- seeds - Determines which nodes the new node contacts to learn about the cluster and establish the gossip process. Make sure that the -seeds list includes the address of at least one node in the existing cluster.
Note: This new node will not bootstrap if it is listed as a seed node. Make sure the new node's address is not listed in the -seeds list. For more information about seed nodes, see Internode communications (gossip).
To add the new node as a seed node, complete these steps, then go on to Promoting a new node to a seed node.
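The two settings most likely to block bootstrapping (auto_bootstrap set to false, and the new node listed in its own seeds list) can be checked before the first start. The following is a minimal sketch, not a Cassandra tool: check_new_node_yaml is a hypothetical helper name, and it assumes the seeds entry sits on a single line in the form used by the default cassandra.yaml.

```shell
# Sketch: sanity-check a new node's cassandra.yaml before its first start.
# check_new_node_yaml is a hypothetical helper, not part of Cassandra.
# Usage: check_new_node_yaml /etc/cassandra/cassandra.yaml 10.0.0.5
check_new_node_yaml() {
    yaml_file=$1
    node_addr=$2
    problems=0

    # auto_bootstrap must be true or absent (absent defaults to true).
    if grep -Eq '^[[:space:]]*auto_bootstrap:[[:space:]]*false' "$yaml_file"; then
        echo "auto_bootstrap is false: set it to true or delete it"
        problems=1
    fi

    # Assumption: seeds is written on one line, e.g.  - seeds: "a,b".
    seeds=$(sed -n 's/^[[:space:]]*-[[:space:]]*seeds:[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}.*$/\1/p' "$yaml_file")

    # Compare each seed exactly, so 10.0.0.5 does not match 10.0.0.50.
    if printf '%s\n' "$seeds" | tr ',' '\n' | tr -d ' ' | grep -Fxq "$node_addr"; then
        echo "$node_addr is in the seeds list: a seed node does not bootstrap"
        problems=1
    fi

    return "$problems"
}
```

The function returns 0 when neither problem is found, so it can gate a start script.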
- Check the cassandra.yaml file and the cassandra-topology.properties or cassandra-rackdc.properties file on other nodes in the cluster for any non-default settings, and make sure to replicate these settings on the new node.
Note: Use the diff command to find and merge (by hand) any differences between existing and new nodes.
The location of the cassandra-topology.properties file depends on the type of installation:
- Package installations: /etc/cassandra/cassandra-topology.properties
- Tarball installations: install_location/conf/cassandra-topology.properties
The location of the cassandra-rackdc.properties file depends on the type of installation:
- Package installations: /etc/cassandra/cassandra-rackdc.properties
- Tarball installations: install_location/conf/cassandra-rackdc.properties
The location of the cassandra.yaml file depends on the type of installation:
- Package installations: /etc/cassandra/cassandra.yaml
- Tarball installations: install_location/resources/cassandra/conf/cassandra.yaml
Warning: Simultaneously bootstrapping more than one new node from the same rack violates LOCAL_QUORUM constraints. Data may stream from any replica in order to put data onto the new nodes, including other new nodes. Adding two or more nodes at the same time is possible but not recommended; it may introduce consistency issues. To assess the risks to your environment, see JIRA issues CASSANDRA-2434 and CASSANDRA-7069.
If you are adding two or more nodes, configure each node as in the previous steps. Then go to Starting multiple new nodes for additional steps you must take.
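The diff-based comparison of configuration files mentioned above might look like the sketch below. It assumes you have already copied an existing node's file next to the new node's file; compare_node_config and strip_config_noise are hypothetical helper names. Comment-only and blank lines are dropped first so that only effective settings are compared.

```shell
# Sketch: compare the effective settings of two configuration files.
# Both helper names are hypothetical; both arguments are local file paths.

# Print a config file without comment-only and blank lines.
strip_config_noise() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Exit status follows diff: 0 if the settings match, 1 if they differ.
compare_node_config() {
    existing_copy=$1
    new_copy=$2
    stripped_existing=$(mktemp)
    stripped_new=$(mktemp)
    strip_config_noise "$existing_copy" > "$stripped_existing"
    strip_config_noise "$new_copy" > "$stripped_new"
    diff "$stripped_existing" "$stripped_new"
    diff_rc=$?
    rm -f "$stripped_existing" "$stripped_new"
    return "$diff_rc"
}

# Example (paths are placeholders):
#   compare_node_config existing-node-cassandra.yaml /etc/cassandra/cassandra.yaml
```

Per-node values such as listen_address are expected to differ; the remaining differences are the ones to review and merge by hand.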
- Start the new node:
- Package installations: start Cassandra as a service
- Tarball installations: start Cassandra as a process
- Use nodetool status to verify that the node is fully bootstrapped and all other nodes are up (UN) and not in any other state.
- After all new nodes are running, run nodetool cleanup on each of the previously existing nodes to remove the keys that no longer belong to those nodes. Wait for cleanup to complete on one node before running nodetool cleanup on the next node.
Cleanup can be safely postponed for low-usage hours.
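The verification and cleanup steps above can be sketched as two small shell helpers. Both function names are hypothetical. The first reads nodetool status output on standard input and prints the address of every node whose state code is not UN. The second assumes SSH access to the previously existing nodes, and that nodetool cleanup returns only once cleanup on that node has finished.

```shell
# Sketch: helpers for the verification and cleanup steps (names are hypothetical).

# Read `nodetool status` output on stdin and print the address of each node
# whose two-letter state code is anything other than UN (Up/Normal).
nodes_not_un() {
    awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { print $2 }'
}

# Run cleanup on one previously existing node at a time.
# Assumptions: SSH access to each host, and nodetool on the remote PATH.
run_cleanup_sequentially() {
    for host in "$@"; do
        echo "running cleanup on $host"
        ssh "$host" nodetool cleanup || return 1
    done
}

# Example (host names are placeholders):
#   nodetool status | nodes_not_un        # no output means every node is UN
#   run_cleanup_sequentially old-node1 old-node2 old-node3
```

Stopping at the first failure keeps cleanup from running on several nodes while one of them is in an unknown state.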
What's next
Starting multiple new nodes
If you have added more than one node:
- Make sure you start each node with the consistent.rangemovement property turned off:
- Package installations
- On each of the nodes you are bootstrapping, add the following option to the /usr/share/cassandra/cassandra-env.sh file:
JVM_OPTS="$JVM_OPTS -Dcassandra.consistent.rangemovement=false"
- Tarball installations
- Start Cassandra on each of the nodes you are bootstrapping with this option:
$ bin/cassandra -Dcassandra.consistent.rangemovement=false
- Allow two minutes between node startups.
- After each new node has bootstrapped, turn consistent range movement back on for each one:
- Package installations
- Stop Cassandra and remove the line you added to /usr/share/cassandra/cassandra-env.sh in the previous step:
JVM_OPTS="$JVM_OPTS -Dcassandra.consistent.rangemovement=false"
Then restart Cassandra.
- Tarball installations
- Stop Cassandra, then restart it without the -Dcassandra.consistent.rangemovement=false option:
$ bin/cassandra
- After restarting the nodes, go back to step 4 above to verify the new nodes.
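The two-minute gap between node startups described above can be scripted. The sketch below is illustrative only: start_node is a placeholder whose body you would replace with your own start command (for example, an SSH call that starts Cassandra with consistent range movement turned off), and start_nodes_staggered is a hypothetical helper name.

```shell
# Sketch: start several new nodes with a pause between startups.

# Placeholder: replace the body with the real start command for one host.
start_node() {
    echo "starting Cassandra on $1"
}

# Usage: start_nodes_staggered DELAY_SECONDS host1 host2 ...
start_nodes_staggered() {
    delay_seconds=$1
    shift
    is_first=1
    for host in "$@"; do
        # Pause between startups, but not before the first node.
        if [ "$is_first" -eq 0 ]; then
            sleep "$delay_seconds"
        fi
        is_first=0
        start_node "$host" || return 1
    done
}

# Example (host names are placeholders; 120 seconds is two minutes):
#   start_nodes_staggered 120 new-node1 new-node2
```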
Promoting a new node to a seed node
A seed node does not bootstrap, so a new node cannot be configured as a seed node when it first joins. After the new nodes have bootstrapped into the cluster, follow these steps for each node you want to promote to a seed node.
- Stop Cassandra on the node you want to promote.
- Open the node's cassandra.yaml file and add the node's address to the -seeds list in the seed_provider section.
- Make the same change in the cassandra.yaml file on all other nodes in the cluster.
- Start Cassandra as a service or a stand-alone process.
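Editing the seeds list on every node by hand is error-prone, so the change can be scripted. This is a minimal sketch under stated assumptions: add_seed is a hypothetical helper, and it assumes the seeds entry is a non-empty, double-quoted, comma-separated list on a single line, as in the default cassandra.yaml. Back up the file before running anything like this.

```shell
# Sketch: append an address to the seeds list in a cassandra.yaml file.
# add_seed is a hypothetical helper, not part of Cassandra.
# Assumption: the entry looks like   - seeds: "10.0.0.1,10.0.0.2"
# Usage: add_seed /etc/cassandra/cassandra.yaml 10.0.0.5
add_seed() {
    yaml_file=$1
    new_seed=$2
    tmp_file=$(mktemp)

    # Insert ",<new_seed>" just before the closing quote of the seeds value.
    sed 's/^\([[:space:]]*-[[:space:]]*seeds:[[:space:]]*"[^"]*\)"/\1,'"$new_seed"'"/' \
        "$yaml_file" > "$tmp_file" || return 1

    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp_file" > "$yaml_file"
    rm -f "$tmp_file"
}
```

The same call would be run against the cassandra.yaml file on each node in the cluster, matching the steps above.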