Terminate a datacenter within an existing DSE cluster

Use Mission Control to terminate one or more datacenters in an existing cluster, one datacenter at a time. When terminating multiple datacenters, choose the order based on DataStax Enterprise (DSE) workload rules. The DSE upgrade planning guide details the priorities. For instance, terminate DSE Analytics datacenters first, taking into account whether the nodes use DSE Hadoop or Spark. Next, terminate DSE Graph or transactional datacenters, followed by datacenters running DSE Search nodes.
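
The ordering rule above can be sketched as a small sort. This is only an illustration: the workload labels and the TERMINATION_PRIORITY table are hypothetical names chosen for this example and are not part of Mission Control or DSE.

```python
# Hypothetical workload labels. The numbers only encode the order described
# above: Analytics first, then Graph or transactional, then Search.
TERMINATION_PRIORITY = {
    "analytics": 0,
    "graph": 1,
    "transactional": 1,
    "search": 2,
}


def termination_order(datacenters):
    """Return datacenter names sorted into the suggested termination order.

    `datacenters` maps a datacenter name to its workload label.
    Datacenters with the same priority keep their input order.
    """
    return sorted(
        datacenters,
        key=lambda name: TERMINATION_PRIORITY[datacenters[name]],
    )
```

For example, a mapping with a Search, a transactional, and an Analytics datacenter would come back with the Analytics datacenter first and the Search datacenter last.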

Mission Control manages replication of the system keyspaces during the termination process and does not alter user-defined keyspaces. For instance, if a datacenter selected for termination still has user keyspaces replicated to it, Mission Control blocks the termination until those keyspaces are manually altered. This is a safety measure to prevent unintended removal of data.
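
A pre-check along the same lines can be run before requesting a termination. The sketch below is an assumption-laden illustration, not Mission Control's implementation: blocking_keyspaces is a hypothetical helper, the replication settings are assumed to be already loaded into plain dictionaries, and the system keyspace names are the ones this page lists as updated automatically.

```python
# Keyspaces that this page lists as being updated automatically.
SYSTEM_KEYSPACES = {
    "system_traces",
    "system_distributed",
    "system_auth",
    "dse_leases",
    "dse_perf",
    "dse_security",
}


def blocking_keyspaces(replication, datacenter):
    """Return user-defined keyspaces that still replicate to `datacenter`.

    `replication` maps a keyspace name to its replication settings, for
    example {"class": "NetworkTopologyStrategy", "dc1": 3, "dc2": 3}.
    """
    return sorted(
        name
        for name, settings in replication.items()
        if name not in SYSTEM_KEYSPACES and datacenter in settings
    )
```

An empty result suggests that no user-defined keyspace in the supplied mapping would block the termination.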

Example

An existing multi-datacenter Kubernetes cluster contains two datacenters (DC1 in the west region and DC2 in the east region), each with 3 nodes. To reduce costs, the decision is made to run from a single datacenter in the west region.

Workflow of user and operators

  1. User updates the replication strategy on user-defined keyspaces to remove references to the east region datacenter being terminated. If a keyspace is still in use in the east region, the user must wait until that keyspace is dormant before updating it.

    By default, Mission Control operators automatically update all of the system keyspaces so that they no longer replicate to the east region datacenter, and then terminate its nodes (pods).

    Mission Control issues an error if there are keyspaces actively using the datacenter in the east region that is targeted for termination.

  2. User submits the updated MissionControlCluster to the Control Plane Kubernetes cluster.

  3. Cluster-level operator picks up the modification and automatically updates keyspace replication settings on system keyspaces.

  4. Cluster-level operator deletes datacenter-level resources in the Kubernetes cluster where the nodes are to be terminated.

  5. DC-level operator picks up datacenter-level resource changes and deletes native Kubernetes objects representing the DSE nodes.

    If a user-defined keyspace is still replicating to the DC that is targeted for termination, then the operation FAILS. By design, user-defined keyspaces MUST NOT reference the DC to be terminated.
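
For step 1 of the workflow, the CQL statement that removes a datacenter from a keyspace can be generated rather than typed by hand. The helper below is a hypothetical sketch: it assumes the keyspace uses NetworkTopologyStrategy, and it does no quoting or validation of identifiers.

```python
def alter_keyspace_statement(keyspace, replication, datacenter):
    """Build a CQL ALTER KEYSPACE statement that drops `datacenter`.

    `replication` maps datacenter name -> replication factor for a
    keyspace that uses NetworkTopologyStrategy.
    """
    remaining = {dc: rf for dc, rf in replication.items() if dc != datacenter}
    if not remaining:
        # Refuse to generate a statement that leaves the keyspace unreplicated.
        raise ValueError(f"{keyspace} would have no replicas left")
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in remaining.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )
```

Calling alter_keyspace_statement("ks1", {"west": 3, "east": 3}, "east") yields a statement that keeps only the west datacenter. Review any generated statement before running it in cqlsh.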

Terminate an existing cluster’s datacenter

  1. Modify the existing MissionControlCluster YAML (demo-dse.yaml) in the Control Plane Kubernetes cluster, updating the spec.k8ssandra.cassandra.datacenters list so that it no longer references the datacenter dc1 that is targeted for termination. In this example, the dc1 entry under datacenters is deleted:

    spec:
      k8ssandra:
        cassandra:
          datacenters:
          - metadata:
              name: dc1
            k8sContext: east
            size: 3
            racks:
            - name: rack1
              nodeAffinityLabels:
                topology.kubernetes.io/zone: us-east1-c
            - name: rack2
              nodeAffinityLabels:
                topology.kubernetes.io/zone: us-east1-b
            - name: rack3
              nodeAffinityLabels:
                topology.kubernetes.io/zone: us-east1-d

  2. Submit this modification to the Kubernetes Control Plane cluster with the following command:

    kubectl apply -f demo-dse.yaml

    Replication settings for the following system keyspaces are updated:

    • system_traces

    • system_distributed

    • system_auth

    • dse_leases

    • dse_perf

    • dse_security

  3. Monitor the termination operation progress in the Control Plane cluster with this command:

    kubectl get k8ssandracluster demo

    If any user-defined keyspaces are still replicating to the datacenter targeted for termination, the kubectl command returns an error such as:

    NAME   ERROR
    demo   cannot decommission DC dc1: keyspace ks1 still has replicas on it

    To rectify this error, the replication of all user-defined keyspaces must be manually updated to remove references to the datacenter being terminated.

    ALTER KEYSPACE ks1 WITH replication = {'class': 'NetworkTopologyStrategy', 'west': 3};

  4. Monitor the termination progress by checking the status of the datacenter to be terminated in the east cluster. This example uses the following command:

    kubectl get cassandradatacenter dc1 -o yaml

    Sample results

    status:
      cassandraOperatorProgress: Updating
      conditions:
      - lastTransitionTime: "2022-10-24T02:43:20Z"
        message: ""
        reason: ""
        status: "True"
        type: Healthy
      - lastTransitionTime: "2022-10-24T02:43:20Z"
        message: ""
        reason: ""
        status: "False"
        type: Stopped
    ...

    The sample output indicates that one DSE node is online and one is not at this point in the monitoring. The CassandraDatacenter dc1 is terminated when DC-level operators set all of the Decommission conditions:status to "False". The nodeStatuses map is also updated.

    The DC-level operators must terminate each node (pod) in the datacenter before the datacenter itself is terminated.

  5. Monitor the DSE logs with this command:

    kubectl logs demo-dc1-rack3-sts-0 -c server-system-logger

    where demo-dc1-rack3-sts-0 is the pod name, formed from the name of the rack's StatefulSet (demo-dc1-rack3-sts) and the ordinal index of the node within that rack (0).

    Sample results

    INFO  [pool-17-thread-1] 2022-10-27 17:13:09,717  StorageService.java:2143 - LEAVING: sleeping 30000 ms for batch processing and pending range setup
    ...
    INFO  [pool-17-thread-1] 2022-10-27 17:13:39,770  Gossiper.java:1301 - InetAddress /10.100.5.15 is now DOWN
    INFO  [pool-17-thread-1] 2022-10-27 17:13:39,788  StorageService.java:4968 - Announcing that I have left the ring for 30000ms
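
The manifest edit in step 1 of this procedure can also be scripted. The sketch below assumes the MissionControlCluster manifest has already been loaded into plain dictionaries and lists (for example by a YAML parser) with the spec.k8ssandra.cassandra.datacenters layout shown in step 1. The remove_datacenter helper is hypothetical and not part of any Mission Control tooling.

```python
import copy


def remove_datacenter(cluster, name):
    """Return a copy of the cluster manifest without the named datacenter.

    `cluster` is the MissionControlCluster manifest as nested dicts and
    lists. The input is left unmodified.
    """
    updated = copy.deepcopy(cluster)
    cassandra = updated["spec"]["k8ssandra"]["cassandra"]
    before = cassandra["datacenters"]
    after = [dc for dc in before if dc["metadata"]["name"] != name]
    if len(after) == len(before):
        # Nothing was removed, so the name did not match any datacenter.
        raise KeyError(f"datacenter {name!r} not found")
    cassandra["datacenters"] = after
    return updated
```

Returning a copy keeps the original manifest available for comparison before the updated version is written back to demo-dse.yaml and applied.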

Upon successful completion of the east datacenter termination operation, users now run only in the west region datacenter.
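
While monitoring in step 4, the conditions list in the CassandraDatacenter status can be easier to scan as a simple mapping. The helper below is a hypothetical sketch that assumes the status section has been loaded into a dictionary shaped like the sample results in step 4. It only reshapes the data and does not decide whether the termination has finished.

```python
def conditions_by_type(status):
    """Map each condition type to True or False.

    `status` is the `status` section of a CassandraDatacenter, where each
    condition carries a `type` and a string `status` of "True" or "False".
    """
    return {
        condition["type"]: condition["status"] == "True"
        for condition in status.get("conditions", [])
    }
```

Applied to the sample results in step 4, this reports Healthy as true and Stopped as false.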


© 2024 DataStax | Privacy policy | Terms of use
