Interact with local operators during a control plane outage

When the Mission Control control plane is unavailable, you can still interact with the data plane’s local operators to manage your database clusters.

The cass-operator manages individual database datacenters in the data plane. When the control plane is down, you can interact with this operator directly to troubleshoot and recover at the datacenter level.

Prerequisites

  • A backup of your current cluster configuration

  • Access to the data plane of the Mission Control cluster

  • Permissions to manage Kubernetes resources

  • kubectl version 1.20 or later installed on your local machine

Verify control plane availability

Before proceeding, confirm that the control plane is unavailable. Run the following command with your kubectl context set to the control plane cluster:

kubectl cluster-info
Sample result
Unable to connect to the server: dial tcp SERVER_IP:8443: i/o timeout
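
A plain cluster-info call can wait a long time before it reports a failure. The following is a minimal sketch that wraps the check in a shell function with an explicit request timeout. The function name control_plane_status and the CONTROL_PLANE_CONTEXT placeholder are illustrative and are not part of Mission Control.

```shell
# Sketch: report whether a cluster's API server answers within a time limit.
# CONTROL_PLANE_CONTEXT is a placeholder for your control plane context name.
control_plane_status() {
  local context="$1"
  local timeout="${2:-10s}"
  # --context and --request-timeout are global kubectl flags.
  if kubectl --context "$context" --request-timeout="$timeout" cluster-info >/dev/null 2>&1; then
    echo "reachable"
  else
    echo "unavailable"
  fi
}

# Usage: control_plane_status CONTROL_PLANE_CONTEXT 5s
```

If the function prints unavailable, continue with the data plane procedures in this topic.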

Manage custom resources in the data plane

Each data plane runs its own Kubernetes API server, independent of the Mission Control control plane. You can interact with this API directly using standard Kubernetes tools:

  • kubectl configured with the data plane’s kubeconfig file

  • Kubernetes API via REST calls

The Mission Control UI usually creates Custom Resources (CRs), and these resources reside in the data plane. When the control plane is unavailable, you must manage these resources directly. For more information on CRs, see the Mission Control Custom Resource Definition (CRD) reference.
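
As a sketch of the REST option, kubectl get --raw sends a request to an API path using the credentials in the active kubeconfig. This example assumes that CassandraDatacenter objects are served under the cassandra.datastax.com/v1beta1 API group and version; confirm this in your environment with kubectl api-resources. The helper names are hypothetical.

```shell
# Build the REST path for CassandraDatacenter objects in one namespace.
# Assumption: the CRD is served as cassandra.datastax.com/v1beta1.
cassdc_api_path() {
  local namespace="$1"
  printf '/apis/cassandra.datastax.com/v1beta1/namespaces/%s/cassandradatacenters\n' "$namespace"
}

# List the datacenters in a namespace through the raw API path.
list_cassdc_raw() {
  kubectl get --raw "$(cassdc_api_path "$1")"
}

# Usage: list_cassdc_raw NAMESPACE
```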

Change the context to the data plane

If the control plane is unavailable, switch your kubectl context to the data plane cluster defined in your kubeconfig file.

List all available contexts:

kubectl config get-contexts

Change the context to the data plane:

kubectl config use-context DATA_PLANE_CLUSTER_NAME

Replace DATA_PLANE_CLUSTER_NAME with the name of the data plane cluster.
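
If you want to return to your original context after the outage, it can help to record it before you switch. The following sketch does both steps in one function. The function name is illustrative.

```shell
# Switch to the data plane context and print the previous context,
# so that it can be restored after the outage.
switch_to_data_plane() {
  local target="$1"
  local previous
  previous="$(kubectl config current-context)" || return 1
  kubectl config use-context "$target" >/dev/null || return 1
  echo "$previous"
}

# Usage:
#   previous_context="$(switch_to_data_plane DATA_PLANE_CLUSTER_NAME)"
#   ... work on the data plane ...
#   kubectl config use-context "$previous_context"
```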

View details of a CR

Retrieve the full YAML definition of a specific CR:

kubectl get CUSTOM_RESOURCE_KIND RESOURCE_NAME -n NAMESPACE -o yaml

Replace the following:

  • CUSTOM_RESOURCE_KIND: The kind of custom resource

  • RESOURCE_NAME: The name of the resource

  • NAMESPACE: The namespace where the resource is deployed
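
The same command can produce the configuration backup listed in the prerequisites. The following sketch saves a CR to a timestamped file before you change it. The function name and the default cr-backups directory are assumptions for illustration.

```shell
# Save a custom resource to a timestamped YAML file before changing it.
backup_cr() {
  local kind="$1" name="$2" namespace="$3" dir="${4:-./cr-backups}"
  local file
  mkdir -p "$dir" || return 1
  file="$dir/${kind}-${name}-$(date +%Y%m%dT%H%M%S).yaml"
  # Remove the partial file if the resource cannot be read.
  kubectl get "$kind" "$name" -n "$namespace" -o yaml > "$file" || { rm -f "$file"; return 1; }
  echo "$file"
}

# Usage: backup_cr CUSTOM_RESOURCE_KIND RESOURCE_NAME NAMESPACE
```

The function prints the path of the saved file so that you can record it in your change log.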

Modify a CR

You can edit a CR to make changes:

kubectl edit CUSTOM_RESOURCE_KIND RESOURCE_NAME -n NAMESPACE

Replace the following:

  • CUSTOM_RESOURCE_KIND: The kind of custom resource

  • RESOURCE_NAME: The name of the resource

  • NAMESPACE: The namespace where the resource is deployed

Apply changes to a CR

To apply changes to a CR, use the kubectl apply command:

kubectl apply -f CUSTOM_RESOURCE.yaml

Replace CUSTOM_RESOURCE.yaml with the YAML file containing the changes.
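
Before you apply a manifest during an outage, you might want to preview its effect. The following sketch validates the file with a server-side dry run and then shows the differences from the live object. The function name is illustrative, and the sketch only previews the change; it does not apply it.

```shell
# Preview a manifest without changing the cluster.
# kubectl diff exits 0 when nothing would change, 1 when there are
# differences, and a higher code on error.
preview_manifest() {
  local manifest="$1"
  kubectl apply --dry-run=server -f "$manifest" || return 2
  kubectl diff -f "$manifest"
  case $? in
    0) echo "no changes" ;;
    1) echo "changes pending" ;;
    *) echo "diff failed" >&2; return 2 ;;
  esac
}

# Usage: preview_manifest CUSTOM_RESOURCE.yaml
```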

Delete a CR

To delete a CR, use the kubectl delete command:

kubectl delete CUSTOM_RESOURCE_KIND RESOURCE_NAME -n NAMESPACE

Replace the following:

  • CUSTOM_RESOURCE_KIND: The kind of custom resource

  • RESOURCE_NAME: The name of the resource

  • NAMESPACE: The namespace where the resource is deployed

Manage database datacenter resources

When the control plane is down, you can work only with the cass-operator in the data plane, which manages individual datacenters. You can’t access K8ssandraCluster-level functionality during control plane outages.

During a control plane outage:

  • Multi-datacenter operations are not available

  • Automated backup schedules may be affected

  • Monitoring and alerting capabilities may be limited

  • Changes may need manual reconciliation after recovery

The system ignores any changes to K8ssandraCluster objects when the control plane is down. Additionally, if your deployment uses a single Reaper installation managed by the control plane, you can’t access Reaper functionality during an outage.

Available cass-operator resources

The cass-operator manages the following resources at the datacenter level:

  • CassandraDatacenter: Defines individual datacenters and their configurations, including size, rack definitions, and storage

  • CassandraTask: Defines maintenance tasks for database datacenters

Update CassandraDatacenter resources

List all CassandraDatacenter resources:

kubectl get cassandradatacenter -A

View details of a CassandraDatacenter resource:

kubectl describe cassandradatacenter DATACENTER_NAME -n NAMESPACE

Replace the following:

  • DATACENTER_NAME: The name of the CassandraDatacenter

  • NAMESPACE: The namespace where the CassandraDatacenter is deployed
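
For scripted checks, a single status field is often easier to read than the full describe output. The following sketch assumes that the CassandraDatacenter status exposes a cassandraOperatorProgress field that reports Ready when reconciliation is complete; verify this against your operator version. The function names are illustrative.

```shell
# Print the operator's progress for one datacenter.
# Assumption: the status exposes a cassandraOperatorProgress field.
datacenter_progress() {
  local dc="$1" namespace="$2"
  kubectl get cassandradatacenter "$dc" -n "$namespace" \
    -o jsonpath='{.status.cassandraOperatorProgress}'
}

# Poll until the datacenter reports Ready or the attempts run out.
wait_for_datacenter() {
  local dc="$1" namespace="$2" attempts="${3:-30}" delay="${4:-10}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$(datacenter_progress "$dc" "$namespace")" = "Ready" ]; then
      echo "Ready"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out" >&2
  return 1
}

# Usage: wait_for_datacenter DATACENTER_NAME NAMESPACE
```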

When the control plane becomes available again, your direct changes to the CassandraDatacenter resource will remain active until either:

  1. A new version is created on the control plane, or

  2. An annotation is placed on the MissionControlCluster object that explicitly allows overwriting the local changes. The cassandra.datastax.com/autoupdate-spec annotation controls this behavior. Valid values are always and once.

Document any manual changes you make during the outage so that they can be incorporated when the control plane is restored.
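
After the control plane is back, the annotation described above could be set with kubectl annotate. The following is a sketch only: it assumes that the object can be addressed with the resource name missioncontrolcluster and that you run it against the control plane context. Check the resource name with kubectl api-resources before you use it. The function name is hypothetical.

```shell
# Allow the control plane to overwrite local CassandraDatacenter changes.
# Assumption: the MissionControlCluster object is addressable as
# "missioncontrolcluster" in the given namespace.
allow_spec_overwrite() {
  local cluster="$1" namespace="$2" mode="${3:-once}"
  case "$mode" in
    once|always) ;;
    *) echo "mode must be 'once' or 'always'" >&2; return 1 ;;
  esac
  kubectl annotate missioncontrolcluster "$cluster" -n "$namespace" \
    "cassandra.datastax.com/autoupdate-spec=$mode" --overwrite
}

# Usage: allow_spec_overwrite CLUSTER_NAME NAMESPACE once
```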

Modify a CassandraDatacenter resource:

kubectl edit cassandradatacenter DATACENTER_NAME -n NAMESPACE

Replace the following:

  • DATACENTER_NAME: The name of the CassandraDatacenter

  • NAMESPACE: The namespace where the CassandraDatacenter is deployed
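
For a small, repeatable change, kubectl patch can be easier to record than an interactive edit. The following sketch changes the datacenter size with a merge patch. It assumes that the size setting mentioned earlier is stored in spec.size; the function name is illustrative.

```shell
# Change the number of nodes in a datacenter without opening an editor.
# Assumption: spec.size holds the datacenter's node count.
set_datacenter_size() {
  local dc="$1" namespace="$2" size="$3"
  case "$size" in
    ''|*[!0-9]*) echo "size must be a whole number" >&2; return 1 ;;
  esac
  kubectl patch cassandradatacenter "$dc" -n "$namespace" \
    --type merge -p "{\"spec\":{\"size\":$size}}"
}

# Usage: set_datacenter_size DATACENTER_NAME NAMESPACE 4
```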

Trigger a rolling restart

To trigger a rolling restart, you must create a CassandraTask resource with the restart command. You can restart the entire datacenter or add an argument to restart a specific rack.

Example 1: Restart a datacenter
apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: restart-task
spec:
  datacenter:
    name: DATACENTER_NAME
    namespace: cass-operator
  jobs:
    - name: JOB_NAME
      command: restart

Replace the following:

  • DATACENTER_NAME: The name of the CassandraDatacenter to restart

  • JOB_NAME: The name of the job

Example 2: Restart a specific rack
apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: restart-task
spec:
  datacenter:
    name: DATACENTER_NAME
    namespace: cass-operator
  jobs:
    - name: JOB_NAME
      command: restart
      args:
        rack: RACK_NAME

Replace the following:

  • DATACENTER_NAME: The name of the CassandraDatacenter to restart

  • JOB_NAME: The name of the job

  • RACK_NAME: The name of the rack to restart

Apply the restart task:

kubectl apply -f RESTART_TASK_FILENAME.yaml

Replace RESTART_TASK_FILENAME.yaml with the name of the restart task file.
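
If you restart datacenters or racks often, you can generate the manifest instead of editing a file each time. The following sketch prints a restart task to standard output. The function name and the restart-DATACENTER job name are illustrative, and args is written as a map, which matches the replacenode example in this topic.

```shell
# Print a restart CassandraTask manifest; the rack argument is optional.
render_restart_task() {
  local task="$1" dc="$2" namespace="$3" rack="${4:-}"
  cat <<EOF
apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: $task
spec:
  datacenter:
    name: $dc
    namespace: $namespace
  jobs:
    - name: restart-$dc
      command: restart
EOF
  # Add the rack argument only when a rack name is given.
  if [ -n "$rack" ]; then
    printf '      args:\n        rack: %s\n' "$rack"
  fi
}

# Usage:
#   render_restart_task restart-task DATACENTER_NAME NAMESPACE RACK_NAME | kubectl apply -n NAMESPACE -f -
```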

For more information on CassandraDatacenter resources, see the CassandraDatacenter CRD reference in the K8ssandra documentation.

Create a CassandraTask

CassandraTask resources define maintenance tasks for database clusters, such as rebuilds and restarts. You can create these tasks directly. Supported tasks include:

  • rebuild: Rebuild a node

  • cleanup: Clean up a node

  • restart: Restart a node

  • replacenode: Replace a node

  • upgradesstables: Upgrade SSTables

  • scrub: Scrub a node

  • compaction: Compact a node

  • move: Move a node

  • flush: Flush a node

  • garbagecollect: Garbage collect a node

  • refresh: Refresh a node

For example, to create a task to replace a node, you can use the following YAML:

apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: REPLACE_TASK_NAME
spec:
  datacenter:
    name: DATACENTER_NAME
    namespace: cass-operator
  jobs:
    - name: JOB_NAME
      command: replacenode
      args:
        pod_name: POD_NAME

Replace the following:

  • REPLACE_TASK_NAME: The name of the CassandraTask resource

  • DATACENTER_NAME: The name of the CassandraDatacenter

  • JOB_NAME: The name of the job

  • POD_NAME: The name of the pod to replace

Apply the replace task:

kubectl apply -f REPLACE_TASK_FILENAME.yaml

Replace REPLACE_TASK_FILENAME.yaml with the name of the replace task file.

For more information on CassandraTask resources, see the CassandraTask CRD reference in the K8ssandra documentation.
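
After you apply a task, you can check its progress from the command line. The following sketch assumes that the operator records a completionTime in the task status when the task finishes; confirm the field names in the CassandraTask CRD reference for your version. The function names are illustrative.

```shell
# Print the status block of a CassandraTask.
task_status() {
  local task="$1" namespace="$2"
  kubectl get cassandratask "$task" -n "$namespace" -o jsonpath='{.status}'
}

# Succeed only when the task has a completion time recorded.
# Assumption: the operator sets status.completionTime when the task finishes.
task_finished() {
  local task="$1" namespace="$2" done_at
  done_at="$(kubectl get cassandratask "$task" -n "$namespace" \
    -o jsonpath='{.status.completionTime}')" || return 2
  [ -n "$done_at" ]
}

# Usage: task_finished TASK_NAME NAMESPACE && echo "task complete"
```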

Best practices

Follow these best practices to ensure a smooth recovery process:

Before making changes

  • Document all manual changes

  • Create backups of critical resources

  • Test changes in a non-production environment if possible

During the outage

  • Make only necessary changes

  • Keep detailed logs of all modifications

  • Coordinate changes with team members
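
One way to keep a detailed log is to run kubectl through a small wrapper that records each command. The following is a minimal sketch; the function name kubectl_logged and the OUTAGE_LOG variable are hypothetical, and the log records only the command line, not its output.

```shell
# Run a kubectl command and append a timestamped record of it to a log file.
# OUTAGE_LOG is a hypothetical variable; point it at any writable path.
kubectl_logged() {
  local log="${OUTAGE_LOG:-./outage-changes.log}"
  printf '%s %s kubectl %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "${USER:-unknown}" "$*" >> "$log"
  kubectl "$@"
}

# Usage: kubectl_logged edit cassandradatacenter DATACENTER_NAME -n NAMESPACE
```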

After control plane recovery

  • Verify all changes are properly synchronized

  • Update documentation

  • Review logs for any inconsistencies

Recovery procedures

After the control plane is restored, complete the following steps:

  • Verify control plane connectivity

  • Review the changes made during the outage

  • Synchronize configurations if needed

  • Test cluster functionality

  • Update documentation with any permanent changes
