Repair Cassandra clusters in Kubernetes

This topic describes repair options for Apache Cassandra® clusters that are deployed in Kubernetes.

Use Cassandra Reaper to repair Cassandra clusters

For Cassandra clusters deployed and managed by Cass Operator in Kubernetes, use Cassandra Reaper to schedule and orchestrate repairs. Reaper applies to Cassandra databases only.
Note: Reaper support started with the Cass Operator v1.3.0 release (30-June-2020). If you deployed the operator from an earlier release such as v1.2.0, apply the latest manifest YAML to your Kubernetes cluster from a local machine that has an established connection to that cluster. See Upgrade Cass Operator and related resources in Kubernetes.
Reaper improves the existing nodetool repair process for Cassandra by:
  • splitting repair jobs into smaller, tunable segments
  • handling back-pressure by monitoring running repairs and pending compactions
  • adding the ability to pause or cancel repairs and track progress precisely

Reaper provides a REST API, a command-line tool, and a web UI. See the Reaper documentation for instructions on how to build, install, and configure Reaper.

To enable Reaper for Cass Operator, update your datacenter configuration. You can edit a local copy and use kubectl apply, or invoke an editor to perform the changes directly in the Cassandra cluster that's running in Kubernetes. For the latter option, from your local machine where you've already established a connection with your Kubernetes cluster, enter a command such as:
kubectl -n cass-operator edit cassdc dc1
When the editor starts, add the following YAML under the spec: section:
(Recall that YAML indentation is important.)
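
A minimal sketch of such a fragment, assuming that the CassandraDatacenter spec in your Cass Operator release exposes a reaper.enabled field (check the installed CRD to confirm the field name). The shell snippet below only writes the fragment to a hypothetical scratch file, reaper-fragment.yaml, so that you can compare its indentation with what you type in the editor:

```shell
# Assumption: spec.reaper.enabled is the field that turns on the Reaper sidecar.
# reaper-fragment.yaml is a hypothetical scratch file; Cass Operator does not read it.
cat > reaper-fragment.yaml <<'EOF'
spec:
  reaper:
    enabled: true
EOF

# Print the fragment so the two-space indentation is visible.
cat reaper-fragment.yaml
```

In the editor, the spec: line already exists, so only the two lines beneath it would be added, indented one level under spec:.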

After applying or saving the edited datacenter definition, Cass Operator applies the updated configuration dynamically to the target Cassandra cluster.
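
The local-copy route mentioned earlier can be sketched as a small shell function. This is an illustration rather than an official procedure: the function name update_dc_from_local_copy and the file name are hypothetical, and the sketch assumes that kubectl is already connected to your cluster and that $EDITOR (falling back to vi) is available.

```shell
# Hypothetical helper: export a CassandraDatacenter, edit it locally, re-apply it.
# Arguments: namespace, datacenter name, local file path.
update_dc_from_local_copy() {
  ns="$1"
  dc="$2"
  file="$3"
  # Save the live datacenter definition to a local file.
  kubectl -n "$ns" get cassdc "$dc" -o yaml > "$file" || return 1
  # Make the change (for example, the Reaper settings) in your editor.
  ${EDITOR:-vi} "$file" || return 1
  # Send the edited definition back to Kubernetes.
  kubectl -n "$ns" apply -f "$file"
}
```

For the datacenter used in this topic, the call would be: update_dc_from_local_copy cass-operator dc1 dc1.yaml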

To check the pod Ready status after a few minutes:
kubectl -n cass-operator get pods
NAME                             READY   STATUS    RESTARTS   AGE
cass-operator-78884f4f84-25fx5   1/1     Running   0          25d
cluster1-dc1-default-sts-0       2/2     Running   0          21d
cluster1-dc1-default-sts-1       2/2     Running   0          21d
cluster1-dc1-default-sts-2       2/2     Running   0          2m48s
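
Instead of re-running the pod listing by hand, kubectl wait can block until a pod reports Ready. A minimal sketch, using a pod name from the sample output above; the function name and the 300-second default timeout are arbitrary choices, not values that Cass Operator requires:

```shell
# Hypothetical helper: block until the named pod reports the Ready condition.
# Arguments: namespace, pod name, optional timeout (defaults to 300s).
wait_for_pod_ready() {
  kubectl -n "$1" wait --for=condition=Ready "pod/$2" --timeout="${3:-300s}"
}
```

For example: wait_for_pod_ready cass-operator cluster1-dc1-default-sts-2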
To inspect the resulting pod definition as YAML, use a command such as:
kubectl -n cass-operator get pod/cluster1-dc1-default-sts-2 -o yaml
To view the Cassandra system log after a YAML configuration update, read the output of the pod's server-system-logger container with a command such as:
kubectl -n cass-operator logs cluster1-dc1-default-sts-2 -c server-system-logger

Use NodeSync to repair DSE clusters

For DSE clusters deployed and managed by Cass Operator in Kubernetes, DataStax provides NodeSync, a continuous background repair service that is declarative and self-orchestrating.

After deploying a DSE cluster in your Kubernetes environment, enable NodeSync on every new table by using the nodesync enable command and its options. The default setting is true.

For details, see nodesync enable.

What's next?

For information about upgrading Cass Operator and optionally using forceUpgradeRacks in the datacenter configuration, see the next topic.