DataStax Enterprise Node Cleanup
DataStax Mission Control is currently in Public Preview. DataStax Mission Control is not intended for production use, has not been certified for production workloads, and might contain bugs and other functional issues. There is no guarantee that DataStax Mission Control will ever become generally available. DataStax Mission Control is provided on an “AS IS” basis, without warranty or indemnity of any kind.
If you are interested in trying out DataStax Mission Control, please join the Public Preview.
The cleanup operation runs nodetool cleanup for either all or specific keyspaces on all nodes in the specified datacenter. Create the CassandraTask that defines a cleanup operation in the same Kubernetes cluster where the target CassandraDatacenter is deployed.
DataStax Enterprise does not automatically remove data from nodes that lose part of their partition range to a newly added node. After adding a node, run nodetool cleanup on the source node and on neighboring nodes that shared the same subrange. This prevents the database from including the old data when rebalancing the load on that node.
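This task automates the same command that you can run manually with nodetool on each affected node. A minimal sketch, where my_keyspace is a hypothetical keyspace name:

# Clean up a single keyspace on this node; omit the argument to clean up all keyspaces
nodetool cleanup my_keyspace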
nodetool cleanup temporarily increases disk space use, proportional to the size of the largest SSTable, and triggers disk I/O. Failure to run nodetool cleanup after adding a node may result in data inconsistencies, including the resurrection of previously deleted data.
This operation forces all SSTables on a node to compact, evicting data that is no longer replicated to the node. As with all compactions, this leads to an increase in disk operations and potential latency. Depending on the amount of data present on the node and the query workload, you may want to schedule this cleanup operation during off-peak hours.
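To gauge the impact while a cleanup runs, one option is to watch compaction activity and free disk space from inside a node. A minimal sketch, where demo-dc1-rack1-sts-0 is a hypothetical pod name and the DSE container is assumed to be named cassandra:

# Show active compactions, including those triggered by the cleanup
kubectl exec -n demo demo-dc1-rack1-sts-0 -c cassandra -- nodetool compactionstats
# Check free space on the data volume (the mount path is an assumption)
kubectl exec -n demo demo-dc1-rack1-sts-0 -c cassandra -- df -h /var/lib/cassandra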
You need a kubeconfig file or context pointing to a Control Plane Kubernetes cluster.
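For example, you can list the available contexts and switch to the one for the Control Plane cluster before submitting the task. Here, control-plane is a hypothetical context name:

kubectl config get-contexts
kubectl config use-context control-plane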
In this example, an existing Kubernetes cluster with one datacenter has 9 nodes (pods) distributed across 3 racks. The workflow is:
1. User defines a cleanup CassandraTask.
2. DC-operator detects the new task custom resource definition (CRD).
3. DC-operator iterates one rack at a time.
4. DC-operator triggers and monitors cleanup operations one pod at a time.
5. DC-operator reports task progress and status.
6. User requests a status report of the cleanup task with a kubectl command, as shown below, and views the status response.
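One way to follow the operators' progress through steps 5 and 6 is to watch the task resource until its status shows completion:

kubectl get cassandratask cleanup-dc1 --watch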
Here is a sample CassandraTask definition:
apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: cleanup-dc1
spec:
  datacenter:
    name: dc1
    namespace: demo
  jobs:
    - name: cleanup-dc1
      command: cleanup
      args:
        keyspace_name: my_keyspace
metadata.name: a unique identifier within the Kubernetes namespace where the task is submitted. While the name can be any value, consider including the cluster name to prevent collisions with other operations.
spec.datacenter: a unique namespace and name combination used to determine which datacenter to target with this operation.
spec.jobs.command: MUST be cleanup for this operation.
spec.jobs.args.keyspace_name: an optional parameter that restricts this operation to a particular keyspace. Omitting this value results in ALL keyspaces being cleaned up (see the sketch after this list).
Although the jobs parameter is an array, only one entry is permitted at this time. Specifying more than one job results in the task automatically failing.
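For comparison, here is a sketch of the same task with keyspace_name omitted, so that ALL keyspaces in dc1 are cleaned up; the name cleanup-dc1-all is a hypothetical value:

apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: cleanup-dc1-all
spec:
  datacenter:
    name: dc1
    namespace: demo
  jobs:
    - name: cleanup-dc1-all
      command: cleanup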
Submit the task:
kubectl apply -f cleanup-dc1.cassandratask.yaml
This submits the CassandraTask object to the Kubernetes cluster where the specified datacenter is deployed.
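To confirm that the task was accepted, you can list the CassandraTask objects in the namespace where you applied the manifest:

kubectl get cassandratasks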
The DC-level operators perform a rolling cleanup operation, one node at a time. The order is determined lexicographically (dictionary order), starting with rack names and then continuing with node (pod) names.
If a node is in the process of being terminated and recreated for any reason when the cleanup operation begins, the operation fails. In that event, the DC-level operators retry the cleanup operation.
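To preview that order, one option is to list the datacenter's pods sorted by name, since pod names typically embed the rack name; the label selector shown here is an assumption based on the labels that cass-operator applies:

kubectl get pods -n demo -l cassandra.datastax.com/datacenter=dc1 --sort-by=.metadata.name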
Check the progress and status of the cleanup task with this command:
kubectl get cassandratask cleanup-dc1 -o yaml | yq .status
...
status:
  completionTime: "2022-10-13T21:06:55Z"
  conditions:
  - lastTransitionTime: "2022-10-13T21:05:23Z"
    status: "True"
    type: Running
  - lastTransitionTime: "2022-10-13T21:06:55Z"
    status: "True"
    type: Complete
  startTime: "2022-10-13T21:05:23Z"
  succeeded: 9
The DC-level operators set the startTime field prior to starting the cleanup operation. They update the completionTime field when the cleanup operation is completed.
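If you only need the timing fields, one option is to query them directly with jsonpath:

kubectl get cassandratask cleanup-dc1 -o jsonpath='{.status.startTime}{"\n"}{.status.completionTime}{"\n"}'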
The sample output indicates that the task is completed, with the type: Complete status condition set to "True". The succeeded: 9 field indicates that nine (9) nodes (or pods) completed the requested task successfully. A failed field tracks a running count of pods that failed the cleanup operation.
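Because completion is surfaced through the Complete condition, you can also block a script until the task finishes; the 30-minute timeout is a hypothetical value:

kubectl wait --for=condition=Complete cassandratask/cleanup-dc1 --timeout=30m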