Replace a Node
DataStax Mission Control is currently in Public Preview. DataStax Mission Control is not intended for production use, has not been certified for production workloads, and might contain bugs and other functional issues. There is no guarantee that DataStax Mission Control will ever become generally available. DataStax Mission Control is provided on an “AS IS” basis, without warranty or indemnity of any kind. If you are interested in trying out DataStax Mission Control, please join the Public Preview.
Replacing a node destroys the node and its data, and creates a clean, empty replacement node.
Run this operation when a node is defective and you need to create a new node that is identical to the node being replaced.
Performance Impact
This operation completely replaces a node with a new pod that contains NO DATA but owns the same token range as the node it replaces. The new node then bootstraps, rebuilding its data from the remaining replicas in the cluster. Expect some disk pressure while the replacement node bootstraps.
Prerequisites
- The kubectl CLI tool.
- Kubeconfig file or context pointing to a Control Plane Kubernetes cluster.
Workflow of user and operators
- User defines a replacenode CassandraTask.
- User submits the replacenode CassandraTask to the Data Plane Kubernetes cluster where the datacenter is deployed.
- DC-operator detects the new task custom resource.
- DC-operator iterates one rack at a time.
- DC-operator triggers and monitors replacement operations one pod at a time.
- DC-operator reports task progress and status.
- User requests a status report of the replacenode CassandraTask with the kubectl command, and views the status response.
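From the user's side, the first two steps and the last step of this workflow come down to a pair of kubectl calls. The following is a minimal sketch rather than a definitive script: the function name submit_and_report is hypothetical, and it assumes the current kubectl context already points at the cluster and namespace where the task should live.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Mission Control): submit a CassandraTask
# file, then print a short status report for the named task.
submit_and_report() {
  local task_file="$1" task_name="$2"

  # Submit the task definition using the current kubectl context.
  kubectl apply -f "$task_file" || return 1

  # Print the condition types whose status is "True", separated by spaces.
  # Prints nothing until the operator has written a status for the task.
  kubectl get cassandratask "$task_name" \
    -o jsonpath='{.status.conditions[?(@.status=="True")].type}'
}
```

For example, submit_and_report replace-node-task.cassandratask.yaml replace-dc1 submits the task file and reports on the task that it defines.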
Procedure
- Modify the replace-node-task.cassandratask.yaml file to define a replacenode CassandraTask. Here is a sample:
    apiVersion: control.k8ssandra.io/v1alpha1
    kind: CassandraTask
    metadata:
      name: replace-dc1
    spec:
      datacenter:
        name: dc1
        namespace: demo
      jobs:
        - name: replace-dc1
          command: replacenode
          args:
            keyspace_name: my_keyspace
  Key options:

  - metadata.name: a unique identifier within the Kubernetes namespace where the task is submitted. While the name can be any value, consider including the cluster name to prevent collisions with other tasks.
  - spec.datacenter: a unique namespace and name combination used to determine which datacenter to target with this operation.
  - spec.jobs[0].command: MUST be replacenode for this operation.
  - Optional: spec.jobs[0].args.keyspace_name: restricts this operation to a particular keyspace. If this value is omitted, ALL keyspaces are rebuilt.
- Submit the replacenode CassandraTask custom resource to the Data Plane Kubernetes cluster where the specified datacenter is deployed, with this command:

    kubectl apply -f replace-node-task.cassandratask.yaml

  DC-level operators manage this CassandraTask. They stop the DSE node if it is running and then delete the Persistent Volume(s) (PV). Next, they delete the node (pod) in which DSE is running. A new node is deployed as its replacement and is started normally, picking up the same token range as the previous node.
- Monitor the node replacement progress with this command:

    kubectl get cassandratask replace-dc1 -o yaml | yq .status

  Sample output:

    ...
    - lastTransitionTime: "2022-11-01T03:28:12Z"
      message: ""
      reason: ""
      status: "True"
      type: ReplacingNodes

  The status field is set to "False" when the replacenode operation completes.
- Monitor the progress and view the status of the CassandraTask object by issuing this command in the Control Plane cluster:

    kubectl get cassandratask replace-dc1 -o yaml

  Sample output:

    ...
    status:
      completionTime: "2022-11-01T03:28:33Z"
      conditions:
      - lastTransitionTime: "2022-11-01T03:28:12Z"
        status: "True"
        type: Running
      - lastTransitionTime: "2022-11-01T03:28:34Z"
        status: "True"
        type: Complete
      startTime: "2022-11-01T03:28:12Z"
      succeeded: 1

  The DC-level operators set the startTime field prior to starting the replacenode operation. They update the completionTime field when the replacenode operation is completed.
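Because completionTime is only filled in once the operation has finished, polling that field is one way to wait for the task instead of rerunning the status command by hand. The following is a minimal sketch under stated assumptions: the function name wait_for_task and its retry defaults are hypothetical, the task name replace-dc1 comes from the sample definition, and the current kubectl context is assumed to point at the cluster and namespace that hold the task.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Mission Control): poll a CassandraTask
# until status.completionTime is set.
#   $1 = task name, $2 = maximum number of checks (default 60),
#   $3 = seconds to sleep between checks (default 10)
wait_for_task() {
  local task_name="$1" attempts="${2:-60}" delay="${3:-10}"
  local i=0 completed_at=""

  while [ "$i" -lt "$attempts" ]; do
    # Empty output means the field is not set yet, or the lookup failed.
    completed_at="$(kubectl get cassandratask "$task_name" \
      -o jsonpath='{.status.completionTime}' 2>/dev/null || true)"
    if [ -n "$completed_at" ]; then
      echo "completed at $completed_at"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done

  echo "gave up waiting for $task_name" >&2
  return 1
}
```

For example, wait_for_task replace-dc1 90 20 checks up to 90 times at 20-second intervals. The sketch only detects that the task has finished; it does not tell a successful run from a failed one, so review the conditions and the succeeded count in the full status afterwards. Since the sample output lists a Complete condition, kubectl wait --for=condition=Complete cassandratask/replace-dc1 may serve the same purpose, although that has not been verified here.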