Virtual node (vnode) configuration
Virtual nodes simplify many tasks in DataStax Enterprise (DSE), such as eliminating the need to determine the partition range, calculate and assign tokens, rebalance the cluster when adding or removing nodes, and replace dead nodes. For a complete description of virtual nodes and how they work, see Virtual nodes.
Guidelines for using virtual nodes
-
DSE requires the same token architecture on all nodes in a datacenter.
All the nodes must be either vnode-enabled or use a single-token architecture. Across the entire cluster, datacenter architecture can vary.
For example, a single cluster with:
-
A transaction-only datacenter running OLTP.
-
A single-token architecture search datacenter (no vnodes).
-
An analytics datacenter with vnodes.
-
DataStax recommends using 8 vnodes (tokens).
Restriction: DataStax recommends not using vnodes with DSE Search. However, if you decide to use vnodes with DSE Search, do not use more than 8 vnodes and ensure that you configure the allocate_tokens_for_local_replication_factor option in the cassandra.yaml configuration file to match your environment.
Using 8 vnodes distributes the workload between systems with a ~10% variance and has minimal impact on performance.
-
Ensure the correct vnode configuration with cassandra.yaml settings:
-
When you add a vnode to an existing cluster or set up nodes in a new datacenter, set the target replication factor (RF) of keyspaces in the datacenter with the allocate_tokens_for_local_replication_factor option.
-
The allocation algorithm distributes the token ranges proportionately using the num_tokens settings.
All systems in the datacenter should have the same num_tokens settings unless performance varies between systems.
The allocation algorithm efficiently balances the workload using fewer tokens; when systems are added to a datacenter, the algorithm maintains the balance. Using a higher number of tokens distributes the workload more evenly, but also significantly increases token management overhead.
Set the number of vnode tokens based on the workload distribution requirements of your datacenter:
Replication factor   4 vnodes (tokens)   8 vnodes (tokens)   64 vnodes (tokens)   128 vnodes (tokens)
2                    ~17.5%              ~12.5%              ~3%                  ~1%
3                    ~14%                ~10%                ~2%                  ~1%
5                    ~11%                ~7%                 ~1%                  ~1%
-
Add nodes to the cluster one at a time.
When adding multiple nodes to the cluster using the allocation algorithm, ensure that nodes are added one at a time. If nodes are added concurrently, the algorithm assigns the same tokens to different nodes.
Enable vnodes
In the cassandra.yaml file:
-
Uncomment num_tokens and set the required number of tokens.
-
Recommended: To use the allocation algorithm, uncomment the allocate_tokens_for_local_replication_factor option and set it to the target replication factor for the keyspaces in the datacenter. If the replication varies, alternate between the replication factor (RF) settings.
-
Comment out the initial_token option or leave it unset.
To upgrade existing clusters to vnodes, see Enable virtual nodes on an existing production cluster.
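As a minimal sketch of the steps above, the relevant cassandra.yaml lines on a vnode-enabled node might look like the following. The values are illustrative assumptions, not required settings: 8 follows the recommendation in the guidelines, and 3 stands in for whatever replication factor your keyspaces use in this datacenter.

```yaml
# Number of tokens (vnodes) assigned to this node.
# 8 is the count recommended in the guidelines above.
num_tokens: 8

# Target replication factor for the keyspaces in this datacenter.
# Assumed value for illustration; set it to match your own keyspaces.
allocate_tokens_for_local_replication_factor: 3

# Left commented out (unset) so that tokens come from num_tokens.
# initial_token:
```

Per the guidelines, the same num_tokens value would normally be used on every node in the datacenter.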
Disable vnodes
If you do not use vnodes, you must make sure that each node is responsible for roughly an equal amount of data.
To ensure that each node is responsible for an equal amount of data, assign each node an |
In the cassandra.yaml file:
-
Comment out the num_tokens and allocate_tokens_for_local_replication_factor options.
-
Uncomment the initial_token option and set it to 1 or to the value of a generated token for a multi-node cluster.
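A single-token configuration following these steps might look roughly like the sketch below. The commented-out values are placeholders, and the alternative token shown in the last comment is an assumption for illustration: it is the second of three evenly spaced tokens and applies only if the cluster uses Murmur3Partitioner. Generate the actual tokens for your own cluster.

```yaml
# Both vnode options are commented out to disable vnodes.
# num_tokens: 8
# allocate_tokens_for_local_replication_factor: 3

# Single token owned by this node, set to 1 as described above.
initial_token: 1

# In a multi-node cluster, use a generated token for each node instead.
# Hypothetical example (assumes Murmur3Partitioner and three nodes):
# initial_token: -3074457345618258603
```

Each node in a single-token datacenter needs its own distinct initial_token value, so this fragment differs from node to node.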