Create a Multi-token DSE Cluster
DataStax Mission Control is currently in Public Preview. DataStax Mission Control is not intended for production use, has not been certified for production workloads, and might contain bugs and other functional issues. There is no guarantee that DataStax Mission Control will ever become generally available. DataStax Mission Control is provided on an “AS IS” basis, without warranty or indemnity of any kind.
If you are interested in trying out DataStax Mission Control, please join the Public Preview.
Availability requirements might dictate the need to create a cluster with a low number of virtual nodes (vnodes) per node and a token per node. Use DataStax Mission Control to assist in assigning tokens as you create a cluster and to help assure a more balanced cluster. A large number of tokens spreads the effects of a single node going down across multiple nodes in the cluster. However, this also leads to increased operational overhead with repairs, analytics workloads, and search. Setting the appropriate number of tokens during cluster creation is imperative. Changing this value is difficult after a cluster is created.
This example steps you through creating a cluster where the number of vnodes per DSE host is less than 16: four vnodes and four (4) tokens per node.
The example works with a two-datacenter cluster with four vnodes per DSE node, three racks per datacenter, and 18 nodes total.
Modify the MissionControlCluster manifest to explicitly override the default of 16 in the config.cassandraYaml.num_tokens section. From this definition, DataStax Mission Control automatically generates initial tokens.
Required: Modify the MissionControlCluster manifest, explicitly specifying config.cassandraYaml.num_tokens: 4, as follows:
apiVersion: missioncontrol.datastax.com/v1beta1
kind: MissionControlCluster
metadata:
  name: demo
spec:
  k8ssandra:
    cassandra:
      serverVersion: 6.8.25
      config:
        cassandraYaml:
          num_tokens: 4
      datacenters:
        - metadata:
            name: dc1
          k8sContext: data-plane-1
          size: 9
          racks:
            - name: rack1
            - name: rack2
            - name: rack3
        - metadata:
            name: dc2
          k8sContext: data-plane-2
          size: 9
          racks:
            - name: rack1
            - name: rack2
            - name: rack3
This change overrides the default number of tokens (16) that DSE sets per node. Using this modified definition, DataStax Mission Control assigns 4 tokens to each of the first 3 nodes to bootstrap in each datacenter. The remaining nodes choose good token values using the built-in token allocation algorithm of DSE.
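The idea behind assigning initial tokens to the bootstrap nodes can be sketched as evenly spacing num_tokens × nodes values across the full Murmur3 partitioner range and dealing them out round-robin. The following is an illustration of that general technique, not Mission Control's actual implementation:

```python
# Illustrative sketch: evenly space initial tokens across the Murmur3
# partitioner range for the first nodes in a datacenter, then deal them
# round-robin so each node's vnodes are spread around the ring.
# This is NOT the exact algorithm Mission Control uses.
MIN_TOKEN = -(2**63)   # Murmur3 token range: [-2^63, 2^63 - 1]
RING = 2**64

def initial_tokens(num_nodes: int, num_tokens: int) -> list[list[int]]:
    total = num_nodes * num_tokens
    tokens = [MIN_TOKEN + (i * RING) // total for i in range(total)]
    # Node n takes every num_nodes-th token starting at offset n.
    return [tokens[n::num_nodes] for n in range(num_nodes)]

# The example's case: 3 bootstrap nodes, 4 tokens each.
per_node = initial_tokens(num_nodes=3, num_tokens=4)
for n, toks in enumerate(per_node):
    print(f"node {n}: {toks}")
```

With 3 nodes and 4 tokens each, the 12 tokens split the ring into equal arcs, so each bootstrap node starts with exactly one third of the ring per rack.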
With single-token or few-token clusters, always try to size the datacenter with a node count that is an exact multiple of the number of racks. This facilitates token allocation and results in a better token balance.
When using multiple racks, each rack is expected to replicate 100% of the data. Therefore, the number of racks must be equal to or greater than the replication factor (RF) in the datacenter (RF=3 by default). Failure to comply with this requirement prevents some hosts from deploying.
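Both sizing rules above (the node count is an exact multiple of the rack count, and racks ≥ RF) can be checked with a small sketch. The helper below is illustrative and not part of any DataStax tooling:

```python
def validate_dc_layout(size: int, racks: int, rf: int = 3) -> list[str]:
    """Check the rack/size guidance for a low-vnode datacenter.

    Returns a list of problems; an empty list means the layout
    follows the guidance. (Illustrative helper, not Mission Control.)
    """
    problems = []
    if racks < rf:
        problems.append(f"racks ({racks}) must be >= replication factor ({rf})")
    if size % racks != 0:
        problems.append(f"size ({size}) should be an exact multiple of racks ({racks})")
    return problems

# The example cluster: 9 nodes per datacenter, 3 racks, RF=3 -> no problems.
print(validate_dc_layout(size=9, racks=3, rf=3))
# 8 nodes over 3 racks cannot be balanced evenly.
print(validate_dc_layout(size=8, racks=3, rf=3))
```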
Issue the following command, which runs nodetool ring in a DSE pod, and review the resulting token assignments:
kubectl exec demo-dc1-rack1-sts-0 -c cassandra -- nodetool -u demo-superuser -pw <omitted> ring
Datacenter: dc1
==========
Address       Rack   Status  State   Load        Owns    Token
                                                         8710962479251732601
10.100.7.11   rack1  Up      Normal  195.28 KiB  37.04%  -9223372036854775808
10.100.10.16  rack3  Up      Normal  224.16 KiB  33.33%  -8710962479251732915
10.100.16.17  rack2  Up      Normal  235.68 KiB  29.63%  -8540159293384051146
[...]
10.100.4.11   rack2  Up      Normal  227.99 KiB  33.33%  8198552921648689399
10.100.6.11   rack1  Up      Normal  237.61 KiB  29.63%  8369356107516371168
10.100.0.12   rack3  Up      Normal  232.3 KiB   29.63%  8710962479251732601

Datacenter: dc2
==========
Address       Rack   Status  State   Load        Owns    Token
                                                         9144875253562394737
10.100.15.9   rack2  Up      Normal  208.28 KiB  33.33%  -8789459262544113986
10.100.14.7   rack1  Up      Normal  203.83 KiB  29.63%  -8618656076676432217
10.100.9.6    rack3  Up      Normal  209.17 KiB  29.63%  -8277049704941070781
[...]
10.100.9.6    rack3  Up      Normal  209.17 KiB  29.63%  8290859324223990098
10.100.17.7   rack2  Up      Normal  201.02 KiB  29.63%  8632465695959351530
10.100.11.12  rack3  Up      Normal  206.97 KiB  37.04%  9144875253562394737
When using racks and vnodes, interpret the Owns column as follows: although each line corresponds to a vnode, the column reports the effective data ownership for the entire host (not just the vnode), within its rack (not within the datacenter). For example, in rack1, 100% of the data is replicated; within this rack, host 10.100.7.11 owns 37.04% of that total data.
In this example of three (3) physical hosts per rack, in order to get a perfectly balanced cluster where each rack is also well balanced, the expected ownership for any physical host would be 33% of the data within its rack. Instead, note that the ownership distribution varies a little. In complex cases such as this, the token allocation algorithm may not always achieve the ideal balance, but it assigns an acceptable ownership variance. In this case, the data ownership for a host ranges from 29.63% minimum to 37.04% maximum.
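Because each rack replicates 100% of the data, a host's rack-relative ownership is the fraction of the ring covered by the ranges ending at its tokens, considering only tokens held by hosts in the same rack. A minimal sketch of this computation (the token values below are hypothetical, since the example's full token list is elided):

```python
# Sketch: derive each host's effective ownership within its rack from
# its vnode tokens. A host owns the arc from the previous token in the
# rack (any host) up to each of its own tokens, with wraparound.
RING = 2**64  # size of the Murmur3 token space

def rack_ownership(tokens_by_host: dict[str, list[int]]) -> dict[str, float]:
    ring = sorted((t, host) for host, ts in tokens_by_host.items() for t in ts)
    owned = {host: 0 for host in tokens_by_host}
    for i, (token, host) in enumerate(ring):
        prev = ring[i - 1][0]              # wraps to the last token when i == 0
        owned[host] += (token - prev) % RING
    return {host: owned[host] / RING for host in tokens_by_host}

# Three hosts in one rack, four vnodes each (hypothetical token values):
tokens = {
    "10.100.7.11": [-9_000_000_000_000_000_000, -4_000_000_000_000_000_000,
                    1_000_000_000_000_000_000, 6_000_000_000_000_000_000],
    "10.100.6.11": [-8_000_000_000_000_000_000, -2_000_000_000_000_000_000,
                    3_000_000_000_000_000_000, 7_000_000_000_000_000_000],
    "10.100.0.12": [-6_000_000_000_000_000_000, 0,
                    4_000_000_000_000_000_000, 8_000_000_000_000_000_000],
}
own = rack_ownership(tokens)
print({host: f"{pct:.2%}" for host, pct in own.items()})  # fractions sum to 100%
```

Hosts with tokens that are closer to their predecessors own less of the ring, which is why the Owns column in the example varies between 29.63% and 37.04% rather than sitting at an exact 33%.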