Restart the DSE cluster
DataStax Mission Control is currently in Public Preview. DataStax Mission Control is not intended for production use, has not been certified for production workloads, and might contain bugs and other functional issues. There is no guarantee that DataStax Mission Control will ever become generally available. DataStax Mission Control is provided on an “AS IS” basis, without warranty or indemnity of any kind. If you are interested in trying out DataStax Mission Control, please join the Public Preview.
Restart all nodes in a cluster in a rolling fashion to provide zero downtime to applications. Restart nodes in a DataStax Enterprise (DSE) cluster after adding or terminating one or more nodes.
Performance Impact
Running a restart task has no significant impact on the performance or stability of the cluster as long as the replication factor (RF) is set to 3 or higher. With this RF setting, quorum read and write operations continue to succeed while one node is down.
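The quorum arithmetic behind this statement can be sketched as follows. This is a minimal illustration, assuming the usual definition of quorum as a majority of replicas (RF divided by 2, rounded down, plus 1); the function names are hypothetical and not part of any DataStax tool.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM read or write (a majority)."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_replicas_down(replication_factor: int) -> int:
    """Replicas that can be unavailable while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


# Compare a few replication factors side by side.
for rf in (1, 2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerated down={tolerated_replicas_down(rf)}")
```

With RF 2 the quorum is still 2, so no replica can be down without failing quorum operations. RF 3 is the smallest value that tolerates the one node that is down at any moment during a rolling restart.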
Prerequisites
- The kubectl CLI tool.
- A kubeconfig file or context pointing to a Control Plane Kubernetes cluster.
Workflow of user and operators
- User defines a restart CassandraTask.
- User submits the restart CassandraTask to the Data Plane Kubernetes cluster where the datacenter is deployed.
- DC-operator detects the new task custom resource.
- DC-operator iterates one rack at a time.
- DC-operator triggers and monitors restart operations one pod at a time.
- DC-operator reports task progress and status.
- User requests a status report of the rolling restart CassandraTask with the kubectl command, and views the status response.
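The ordering in this workflow can be modeled with a short sketch. This is an illustrative model only, not the DC-operator's implementation: the rack and pod names are hypothetical, `restart_pod` stands in for whatever the operator does to restart and wait on a single pod, and stopping at the first failure is an assumption of this sketch.

```python
from typing import Callable, Dict, Iterator, List, Tuple


def rolling_order(racks: Dict[str, List[str]]) -> Iterator[Tuple[str, str]]:
    """Yield (rack, pod) pairs one rack at a time, one pod at a time."""
    for rack, pods in racks.items():
        for pod in pods:
            yield rack, pod


def rolling_restart(
    racks: Dict[str, List[str]], restart_pod: Callable[[str], bool]
) -> Dict[str, int]:
    """Restart pods sequentially and count the outcomes."""
    counts = {"succeeded": 0, "failed": 0}
    for _rack, pod in rolling_order(racks):
        if restart_pod(pod):
            counts["succeeded"] += 1
        else:
            counts["failed"] += 1
            break  # assumption: do not touch further pods after a failure
    return counts


# Hypothetical topology: two racks, three pods in total.
example_racks = {"rack1": ["dc1-rack1-0", "dc1-rack1-1"], "rack2": ["dc1-rack2-0"]}
print(list(rolling_order(example_racks)))
```

Because only one pod is restarted at a time, at most one replica of any token range is unavailable at any moment.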
Procedure
- Define the restart CassandraTask. The task manifest is a YAML file whose kind is CassandraTask and whose spec.jobs entry names the command to run. The filename for this example is restart-task.cassandratask.yaml.

  To perform a rolling restart on an entire cluster, you must create a separate restart CassandraTask per datacenter (DC). Restart nodes one datacenter at a time.

  Here is a sample:

    apiVersion: control.k8ssandra.io/v1alpha1
    kind: CassandraTask
    metadata:
      name: restart-dc
    spec:
      datacenter:
        name: dc1
        namespace: demo
      jobs:
        - name: restart-dc1
          command: restart
  Key options:
  - metadata.name: a unique identifier within the Kubernetes namespace where the task is submitted. While the name can be any value, consider including the datacenter name to prevent collisions with other tasks.
  - spec.datacenter: a unique namespace and name combination used to determine which datacenter to target with this operation.
  - spec.jobs[0].command: MUST be restart for this operation.
  - Optional: spec.jobs[0].args.keyspace_name: restricts this operation to a particular keyspace. Omitting this value applies the operation to all keyspaces, which is the default.
- Submit the restart CassandraTask custom resource with this kubectl command:

    kubectl apply -f restart-task.cassandratask.yaml

  Submit the restart CassandraTask object to the Kubernetes cluster where the specified datacenter is deployed.
- Review the restart status with this command:

    kubectl get cassandratask restart-dc -o yaml | yq .status
  Upon completion of the restart task, the DC-level operators update or add the following fields in the CassandraTask resource to indicate the status:
  - Either a status.succeeded or status.failed field with a count of the pods that have been restarted and whether each restart was successful or unsuccessful. Issue K8SSAND-1837 records a problem with these status fields not being reliably updated.
  - status.completionTime field: timestamp
  - status.conditions[].status: True
  - status.conditions[].type: Complete
  Sample output:

    kind: CassandraTask
    metadata:
      annotations:
        kubectl.kubernetes.io/last-applied-configuration: |
          {"apiVersion":"control.k8ssandra.io/v1alpha1","kind":"CassandraTask","metadata":{"annotations":{},"name":"restart-dc","namespace":"mission-control"},"spec":{"datacenter":{"name":"dc1","namespace":"mission-control"},"jobs":[{"command":"restart","name":"restart-dc1"}]}}
      creationTimestamp: "2022-10-15T21:24:35Z"
      generation: 2
      labels:
        app.kubernetes.io/created-by: cass-operator
        app.kubernetes.io/instance: cassandra-demo
        app.kubernetes.io/managed-by: cass-operator
        app.kubernetes.io/name: cassandra
        app.kubernetes.io/version: 6.8.26
        cassandra.datastax.com/cluster: demo
        cassandra.datastax.com/datacenter: dc1
        control.k8ssandra.io/status: completed
      name: restart-dc
      namespace: mission-control
      ownerReferences:
        - apiVersion: cassandra.datastax.com/v1beta1
          blockOwnerDeletion: true
          controller: true
          kind: CassandraDatacenter
          name: dc1
          uid: 51561d2f-2d20-4a16-b90c-fe9b0655f1ba
      resourceVersion: "780428"
      uid: 2202880f-16f0-4b83-a5ce-ab3057cd6d70
    spec:
      concurrencyPolicy: Forbid
      datacenter:
        name: dc1
        namespace: mission-control
      jobs:
        - args: {}
          command: restart
          name: restart-dc1
      restartPolicy: Never
    status:
      completionTime: "2022-10-15T21:36:54Z"
      conditions:
        - lastTransitionTime: "2022-10-15T21:36:54Z"
          status: "True"
          type: Complete
      startTime: "2022-10-15T21:24:35Z"
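Because each datacenter needs its own restart CassandraTask, the manifests can be generated rather than written by hand. The following is a minimal sketch that renders one manifest per datacenter using the same fields as the sample manifest in the first procedure step. The helper names, the datacenter list, and the output filenames are hypothetical, and naming each task restart-<datacenter> is only one way to keep names unique.

```python
from pathlib import Path
from typing import Iterable, List

# Same fields as the sample manifest; only the datacenter and namespace vary.
TEMPLATE = """\
apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: restart-{dc}
spec:
  datacenter:
    name: {dc}
    namespace: {namespace}
  jobs:
    - name: restart-{dc}
      command: restart
"""


def render_restart_task(dc: str, namespace: str) -> str:
    """Return a restart CassandraTask manifest targeting one datacenter."""
    return TEMPLATE.format(dc=dc, namespace=namespace)


def write_restart_tasks(
    datacenters: Iterable[str], namespace: str, out_dir: str = "."
) -> List[Path]:
    """Write one manifest file per datacenter and return the paths in order."""
    paths = []
    for dc in datacenters:
        path = Path(out_dir) / f"restart-{dc}.cassandratask.yaml"
        path.write_text(render_restart_task(dc, namespace))
        paths.append(path)
    return paths


# Render the manifest for a hypothetical datacenter "dc1" in namespace "demo".
print(render_restart_task("dc1", "demo"))
```

Each generated file would then be submitted with kubectl apply -f, one datacenter at a time, waiting for the previous task to complete before applying the next.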
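To act on the task status in a script, the completion condition can be checked programmatically. The sketch below assumes the task was fetched as JSON (for example with kubectl get cassandratask restart-dc -o json) and that its status.conditions list has the shape shown in the sample output. The function name is hypothetical, and the embedded document is a trimmed copy of the sample status.

```python
import json

# Trimmed copy of the status block from the sample output, as JSON.
SAMPLE_TASK = json.loads("""
{
  "kind": "CassandraTask",
  "status": {
    "completionTime": "2022-10-15T21:36:54Z",
    "conditions": [
      {
        "lastTransitionTime": "2022-10-15T21:36:54Z",
        "status": "True",
        "type": "Complete"
      }
    ],
    "startTime": "2022-10-15T21:24:35Z"
  }
}
""")


def is_complete(task: dict) -> bool:
    """True when the task reports a Complete condition whose status is "True"."""
    status = task.get("status") or {}
    conditions = status.get("conditions") or []
    return any(
        c.get("type") == "Complete" and c.get("status") == "True"
        for c in conditions
    )


print(is_complete(SAMPLE_TASK))  # prints True for the sample status
```

A task that has not finished yet has no Complete condition, so the function returns False and the caller can poll again later.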