Replace a node
Replacing a node destroys the node and its data, then creates a clean, empty replacement node.
Run this operation when a node is defective and you need to create a new node that is identical to the node being replaced.
Mission Control detects the replacenode CassandraTask custom resource definition (CRD), iterates one rack at a time, and triggers and monitors replacement operations one pod at a time. Mission Control reports task progress and status.
Performance impact
This operation completely replaces a node with a new, empty node. The new node contains no data, but it retains the same token range as the node it replaces. The new node then bootstraps, rebuilding its data from the remaining replicas in the cluster. Expect some disk pressure while the replacement node bootstraps.
Prerequisites
-
A prepared environment on either bare-metal/VM or an existing Kubernetes cluster.
Replace a defective node
Choose User Interface (UI) or Command Line Interface (CLI) steps.
UI steps
-
In the Home Clusters dialog, click the target cluster namespace.
-
In the Nodes section of the Overview tab, click the row checkbox for your target node in its datacenter.
-
Click the overflow menu icon (3 dots) on your target node.
-
Click Replace.
The replace activity starts immediately.
To view notifications from the replace operation, see Monitor replace activity status.
CLI steps
-
Modify the replace-node-task.cassandratask.yaml file to define a replacenode CassandraTask. Here is a sample:
apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: replace-dc1
spec:
  datacenter:
    name: dc1
    namespace: demo
  jobs:
    - name: replace-dc1
      command: replacenode
      args:
        keyspace_name: KEYSPACE_NAME
Replace KEYSPACE_NAME with the name of the keyspace to rebuild.
Key options:
-
metadata.name: a unique identifier within the Kubernetes namespace where the task is submitted. While the name can be any value, consider including the cluster name to prevent collisions with other tasks.
-
spec.datacenter: a unique namespace and name combination that determines which datacenter this operation targets.
-
spec.jobs[0].command: must be replacenode for this operation.
-
spec.jobs[0].args.keyspace_name (optional): restricts this operation to a particular keyspace. If you omit this value, all keyspaces are rebuilt.
-
Submit the replacenode CassandraTask custom resource definition to the Data Plane Kubernetes cluster where the target datacenter and its node are deployed:
kubectl apply -f replace-node-task.cassandratask.yaml
Mission Control detects and manages the modified CassandraTask custom resource definition (CRD). Mission Control stops the DSE or Cassandra node if it is running and then deletes its Persistent Volume(s) (PV). It then deletes the node (pod) where the DSE or Cassandra database is running. Mission Control deploys a new replacement node and starts it normally. The replacement node picks up the same token range as the deleted node.
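The two CLI steps above can also be scripted. The following is a minimal sketch, not an official Mission Control tool: the function name render_replace_task and its argument order are hypothetical. The manifest fields are copied from the sample above. The function prints the manifest to standard output and leaves out the args block when no keyspace is given, which matches the documented default of rebuilding all keyspaces.

```shell
# Hypothetical helper: print a replacenode CassandraTask manifest to stdout.
# Arguments: task name, datacenter name, datacenter namespace, optional keyspace.
render_replace_task() {
  rt_name="$1"; rt_dc="$2"; rt_ns="$3"; rt_keyspace="${4:-}"
  cat <<EOF
apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: ${rt_name}
spec:
  datacenter:
    name: ${rt_dc}
    namespace: ${rt_ns}
  jobs:
    - name: ${rt_name}
      command: replacenode
EOF
  # Emit the optional args block only when a keyspace was supplied.
  if [ -n "$rt_keyspace" ]; then
    cat <<EOF
      args:
        keyspace_name: ${rt_keyspace}
EOF
  fi
}

# Example use (requires access to the Data Plane cluster):
#   render_replace_task replace-dc1 dc1 demo | kubectl apply -f -
```

Piping the output to kubectl apply -f - submits the task without an intermediate file. Redirect the output to replace-node-task.cassandratask.yaml instead if you want to review the manifest first.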
Monitor replace activity status
Choose User Interface (UI) or Command Line Interface (CLI) steps.
UI steps
-
In the main navigation, click Activities.
-
See Status notifications regarding the progress of the replace activity.
A status of SUCCESS indicates the replace operation completed without issue. Timestamps are issued for the Start and End of the replace activity.
The Activities pane refreshes often and automatically.
CLI steps
-
Monitor the progress and view the status of the CassandraTask object by issuing this command in the Control Plane cluster, using the metadata.name of your task (replace-dc1 in the sample):
kubectl get cassandratask replace-dc1 -o yaml
Sample results
...
status:
  completionTime: "2022-11-01T03:28:33Z"
  conditions:
  - lastTransitionTime: "2022-11-01T03:28:12Z"
    status: "True"
    type: Running
  - lastTransitionTime: "2022-11-01T03:28:34Z"
    status: "False"
    type: Complete
  startTime: "2022-11-01T03:28:12Z"
  succeeded: 1
Mission Control sets the startTime field prior to starting the replacenode operation. It updates the completionTime field when the replacenode operation is completed.
The status field starts as "True" and is set to "False" when the replacenode operation completes. The type field changes from ReplacingNodes to Complete when the replacenode operation completes.
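Because completionTime is populated only when the operation finishes, a script can poll that field instead of reading the full YAML by hand. The following is a minimal sketch under that assumption: the function name wait_for_task, the polling interval, and the attempt limit are all hypothetical, and the function relies only on kubectl get with a jsonpath output.

```shell
# Hypothetical helper: poll a CassandraTask until status.completionTime is set.
# Arguments: task name, seconds between checks (default 10), max checks (default 60).
wait_for_task() {
  wt_task="$1"; wt_interval="${2:-10}"; wt_attempts="${3:-60}"
  wt_i=0
  while [ "$wt_i" -lt "$wt_attempts" ]; do
    # The jsonpath query prints an empty string while the field is unset.
    wt_done="$(kubectl get cassandratask "$wt_task" -o jsonpath='{.status.completionTime}')"
    if [ -n "$wt_done" ]; then
      echo "task $wt_task completed at $wt_done"
      return 0
    fi
    sleep "$wt_interval"
    wt_i=$((wt_i + 1))
  done
  echo "task $wt_task did not complete after $wt_attempts checks" >&2
  return 1
}

# Example use (requires access to the cluster that holds the task):
#   wait_for_task replace-dc1 10 60
```

A populated completionTime shows only that the task finished. Inspect the full status output, including the succeeded count shown in the sample results, to confirm the outcome.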