Restart the cluster
Restart all nodes in a cluster in a rolling fashion to provide zero downtime to applications. Restart nodes in a DataStax Enterprise (DSE), Hyper-Converged Database (HCD), or Apache Cassandra® cluster after adding or terminating one or more nodes.
Create the CassandraTask or K8ssandraTask custom resource (CR) that defines a restart operation in the data plane Kubernetes cluster where the target CassandraDatacenter or MissionControlCluster resource and the datacenter are deployed.
Mission Control detects the restart task custom resource, iterates one rack at a time, and triggers and monitors restart operations one pod at a time. Starting with version 1.18.0, you can configure parallel restarts to restart all nodes in a rack simultaneously for faster operations. Mission Control reports task progress and status.
Performance impact
Running a restart task has no significant impact on the performance or stability of the cluster as long as the replication factor (RF) is set to 3 or higher.
This RF setting tolerates one node being down while maintaining quorum read and write operations.
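The arithmetic behind this statement can be written out as follows. This is a sketch that assumes reads and writes use the QUORUM consistency level, which the text implies but does not state.

```latex
\[
\text{quorum} = \left\lfloor \frac{RF}{2} \right\rfloor + 1,
\qquad
\text{replicas that can be down} = RF - \text{quorum}
\]
\[
RF = 3:\quad
\text{quorum} = \left\lfloor \tfrac{3}{2} \right\rfloor + 1 = 2,
\qquad
3 - 2 = 1
\]
```

With RF 3, two of the three replicas still satisfy quorum while one node restarts.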
Restart the cluster
Decide whether to restart one datacenter, multiple datacenters, or all datacenters in the cluster. You have granular control over the target datacenters and their assigned racks.
You can use the Mission Control UI or CLI to restart datacenters:
- Use the UI to restart one or more DCs
  1. In the Home Clusters dialog, click the target cluster namespace.
  2. In the Nodes section of the Overview tab:
     a. Click the row checkbox for your target datacenter.
     b. Click More Options on your target datacenter.
     c. Click Restart.
     d. To restart more than one datacenter, repeat substeps a, b, and c for each target datacenter.
  The restart activity starts immediately.
  To view notifications from the restart operation, see Monitor restart activity status.
- Use the UI to restart all DCs
  1. In the Home Clusters dialog, click the target cluster namespace.
  2. In the Nodes section of the Overview tab, click the Name checkbox. This selects all datacenter row checkboxes.
  3. Click Bulk Actions.
  4. In the Bulk Actions dialog:
     a. Make sure that the Action type is Restart.
     b. Make sure that All is your target datacenter selection.
        In the Racks field, All is the default and only selection.
     c. Click Run.
  The restart activity starts immediately.
  To view notifications from the restart operation, see Monitor restart activity status.
- Use the CLI to restart clusters
  Create the restart task manifest using either CassandraTask or K8ssandraTask. This is a YAML configuration file that defines the task type in the kind field and the restart command in the job definition.
  For CassandraTask: You must create a separate restart task per datacenter.
  For K8ssandraTask: You can specify multiple datacenters in a single task using the spec.datacenters list.

  - CassandraTask example
    The filename for this example is restart-task.cassandratask.yaml.

        apiVersion: control.k8ssandra.io/v1alpha1
        kind: CassandraTask
        metadata:
          name: restart-dc
        spec:
          datacenter:
            name: dc1
            namespace: demo
          jobs:
            - name: restart-dc1
              command: restart

    Key options:
    - metadata.name: A unique identifier within the Kubernetes namespace where the task is submitted. While the name can be any value, consider including the cluster and datacenter names to prevent collision with other actions.
    - spec.datacenter: A unique namespace and name combination used to determine which datacenter to target with this operation.
    - spec.jobs[0].command: Must be restart for this operation.

    Submit the restart CassandraTask custom resource with this kubectl command:

        kubectl apply -f restart-task.cassandratask.yaml
  - K8ssandraTask example
    The filename for this example is restart-task.k8ssandratask.yaml.

        apiVersion: control.k8ssandra.io/v1alpha1
        kind: K8ssandraTask
        metadata:
          name: restart-dc
        spec:
          cluster:
            name: CLUSTER_NAME
            namespace: CLUSTER_NAMESPACE
          datacenters:
            - dc1
          jobs:
            - name: restart-dc1
              command: restart

    Key options:
    - metadata.name: A unique identifier within the Kubernetes namespace where the task is submitted. While the name can be any value, consider including the cluster and datacenter names to prevent collision with other actions.
    - spec.cluster: A unique namespace and name combination used to determine which cluster to target with this operation.
    - spec.datacenters: A list of datacenter names to restart within the specified cluster.
    - spec.jobs[0].command: Must be restart for this operation.
    - Optional: spec.maxConcurrentPods: Controls the number of pods that can restart concurrently. Set to a positive integer to limit concurrent restarts. Omit for sequential restarts (default).

    Submit the restart K8ssandraTask custom resource with this kubectl command:

        kubectl apply -f restart-task.k8ssandratask.yaml

  Submit the restart task object to the Kubernetes cluster where the specified datacenter is deployed.
  To view notifications from the restart operation, see Monitor restart activity status.
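Because a CassandraTask targets a single datacenter, restarting several datacenters with that kind means writing one manifest per datacenter. The following is a minimal sketch of how that could be scripted. The helper name write_restart_task, the generated filenames, and the dc1, dc2, and demo values are assumptions for illustration and are not part of Mission Control.

```shell
#!/bin/sh
# Sketch: write one CassandraTask restart manifest per datacenter.
# write_restart_task is a hypothetical helper, not a Mission Control command.
write_restart_task() {
  dc="$1"          # datacenter name, for example dc1
  namespace="$2"   # namespace of the datacenter, for example demo
  out="restart-${dc}.cassandratask.yaml"
  # Same fields as the CassandraTask example, with the datacenter name substituted.
  cat > "$out" <<EOF
apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: restart-${dc}
spec:
  datacenter:
    name: ${dc}
    namespace: ${namespace}
  jobs:
    - name: restart-${dc}
      command: restart
EOF
  # Print the filename so the caller can pass it to kubectl apply -f.
  echo "$out"
}

# Example usage (assumed datacenter names; uncomment to run against a cluster):
# for dc in dc1 dc2; do
#   kubectl apply -f "$(write_restart_task "$dc" demo)"
# done
```

Each generated task gets its own metadata.name, which avoids name collisions when several restart tasks are submitted to the same namespace.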
Configure parallel rack restarts
By default, Mission Control restarts nodes sequentially within each rack to maintain cluster stability. In Mission Control version 1.18.0 and later, you can configure parallel restarts so that all nodes in a rack restart at the same time, which shortens the overall operation.
Parallel rack restarts are useful when:
- You need to minimize downtime during maintenance windows.
- Your cluster has sufficient replication factor (RF ≥ 3) to tolerate multiple nodes being down simultaneously.
- You want to expedite configuration changes that require a restart.

Parallel rack restarts temporarily reduce cluster availability within the affected rack. Ensure your replication factor is set to 3 or higher before enabling parallel restarts.
To configure parallel rack restarts, use a K8ssandraTask manifest with the maxConcurrentPods parameter:
    apiVersion: control.k8ssandra.io/v1alpha1
    kind: K8ssandraTask
    metadata:
      name: restart-dc
    spec:
      cluster:
        name: CLUSTER_NAME
        namespace: CLUSTER_NAMESPACE
      datacenters:
        - dc1
      maxConcurrentPods: 0
      jobs:
        - name: restart-dc1
          command: restart

Set maxConcurrentPods to a positive integer to limit the number of pods that can restart concurrently.
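To try different concurrency limits without editing the file by hand, the manifest can be generated with the limit as a parameter. This is a sketch only. The helper name parallel_restart_manifest and the sample values are assumptions, and the field layout simply mirrors the manifest shown in this section.

```shell
#!/bin/sh
# Sketch: print a K8ssandraTask restart manifest with a caller-supplied
# maxConcurrentPods value. parallel_restart_manifest is a hypothetical helper.
parallel_restart_manifest() {
  cluster="$1"     # cluster name (CLUSTER_NAME in the manifest above)
  namespace="$2"   # cluster namespace (CLUSTER_NAMESPACE in the manifest above)
  dc="$3"          # datacenter to restart
  max_pods="$4"    # value written to spec.maxConcurrentPods
  cat <<EOF
apiVersion: control.k8ssandra.io/v1alpha1
kind: K8ssandraTask
metadata:
  name: restart-${dc}
spec:
  cluster:
    name: ${cluster}
    namespace: ${namespace}
  datacenters:
    - ${dc}
  maxConcurrentPods: ${max_pods}
  jobs:
    - name: restart-${dc}
      command: restart
EOF
}

# Example usage (assumed names; uncomment to submit the task):
# parallel_restart_manifest demo mission-control dc1 2 | kubectl apply -f -
```

Printing to standard output lets you review the manifest first, then pipe it to kubectl apply -f - once the limit looks right.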
Monitor restart activity status
You can use the Mission Control UI or CLI to monitor the status of the restart operation:
- Use the UI to monitor status
  1. In the Mission Control UI main navigation, click Activities.
  2. Review Status notifications for the progress of the restart activity.
     A status of SUCCESS indicates the restart operation completed without issue. Timestamps are issued for the Start and End of the restart activity.
     The Activities pane refreshes often and automatically.
- Use the CLI to monitor status
  Review the restart status with one of these commands:

  For CassandraTask:

      kubectl get cassandratask restart-dc -o yaml | yq .status

  For K8ssandraTask:

      kubectl get k8ssandratask restart-dc -o yaml | yq .status

  Upon completion of the restart task, Mission Control updates or adds the following fields in the task resource to indicate the status:
  - Either a status.succeeded or status.failed field with a count of the pods that have been restarted and whether each restart was successful or unsuccessful
  - status.completionTime: The timestamp at which the task finished
  - status.conditions: An entry with type Complete and status True

  Result

      kind: CassandraTask
      metadata:
        annotations:
          kubectl.kubernetes.io/last-applied-configuration: |
            {"apiVersion":"control.k8ssandra.io/v1alpha1","kind":"CassandraTask","metadata":{"annotations":{},"name":"restart-dc","namespace":"mission-control"},"spec":{"datacenter":{"name":"dc1","namespace":"mission-control"},"jobs":[{"command":"restart","name":"restart-dc1"}]}}
        creationTimestamp: "2022-10-15T21:24:35Z"
        generation: 2
        labels:
          app.kubernetes.io/created-by: cass-operator
          app.kubernetes.io/instance: cassandra-demo
          app.kubernetes.io/managed-by: cass-operator
          app.kubernetes.io/name: cassandra
          app.kubernetes.io/version: 6.8.26
          cassandra.datastax.com/cluster: demo
          cassandra.datastax.com/datacenter: dc1
          control.k8ssandra.io/status: completed
        name: restart-dc
        namespace: mission-control
        ownerReferences:
        - apiVersion: cassandra.datastax.com/v1beta1
          blockOwnerDeletion: true
          controller: true
          kind: CassandraDatacenter
          name: dc1
          uid: 51561d2f-2d20-4a16-b90c-fe9b0655f1ba
        resourceVersion: "780428"
        uid: 2202880f-16f0-4b83-a5ce-ab3057cd6d70
      spec:
        concurrencyPolicy: Forbid
        datacenter:
          name: dc1
          namespace: mission-control
        jobs:
        - args: {}
          command: restart
          name: restart-dc1
        restartPolicy: Never
      status:
        completionTime: "2022-10-15T21:36:54Z"
        conditions:
        - lastTransitionTime: "2022-10-15T21:36:54Z"
          status: "True"
          type: Complete
        startTime: "2022-10-15T21:24:35Z"
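Instead of polling the status by hand, a script can block until the task reports the Complete condition. The sketch below uses kubectl wait for that. The helper name wait_for_restart and the 30m default timeout are assumptions, and it presumes that both task kinds expose a Complete condition like the one in the CassandraTask result shown here.

```shell
#!/bin/sh
# Sketch: block until a restart task reports the Complete condition,
# then print its status. wait_for_restart is a hypothetical helper name.
wait_for_restart() {
  kind="$1"              # cassandratask or k8ssandratask
  name="$2"              # task name, for example restart-dc
  timeout="${3:-30m}"    # assumed default; size it to your cluster
  if kubectl wait --for=condition=Complete "${kind}/${name}" --timeout="$timeout"; then
    # Print the final status block, as in the manual commands.
    kubectl get "$kind" "$name" -o yaml | yq .status
  else
    echo "restart task ${name} did not report Complete within ${timeout}" >&2
    return 1
  fi
}

# Example usage (uncomment to run against a cluster):
# wait_for_restart cassandratask restart-dc 45m
```

A non-zero return only means the condition was not observed before the timeout, so inspect the status.failed field before assuming the restart is still in progress.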