Rebuild a Datacenter’s Replicas

DataStax Mission Control is currently in Private Preview. It is subject to the beta agreement executed between you and DataStax. DataStax Mission Control is not intended for production use, has not been certified for production workloads, and might contain bugs and other functional issues. There is no guarantee that DataStax Mission Control will ever become generally available. DataStax Mission Control is provided on an “AS IS” basis, without warranty or indemnity of any kind.

If you are interested in trying out DataStax Mission Control, please contact your DataStax account team.

A rebuild operation runs the nodetool rebuild command, which rebuilds the data on a single node by streaming it from another (source) datacenter (DC). Run this operation on each node of the local DC after defining the source datacenter from which to stream data.

Use a rebuild to redistribute data on the remaining nodes after one or more nodes or datacenters are terminated or added.

Performance Impact

Rebuilding nodes ensures high availability of data and avoids single points of failure.

Prerequisites

  • The kubectl CLI tool.

  • A kubeconfig file or context pointing to a Control Plane Kubernetes cluster.

Example

A single datacenter (dc1) is deployed on the Data Plane Kubernetes cluster. A Control Plane Kubernetes cluster exists.

Workflow of user and operators

  1. Perform a backup of the node.

  2. User defines the rebuild CassandraTask. Specify the affected node as offline.

  3. User submits a rebuild CassandraTask to the Data Plane Kubernetes cluster where the datacenter is deployed.

  4. DC-operator detects the new task custom resource (CR).

  5. DC-operator iterates over the racks one at a time.

  6. DC-operator triggers and monitors rebuild operations one pod at a time.

  7. DC-operator reports task progress and status.

  8. User requests a status report of the rebuild CassandraTask with the kubectl command, and views the status response.

Procedure

  1. Modify the rebuild-dc1.cassandratask.yaml file.

    Here is a sample:

    apiVersion: control.k8ssandra.io/v1alpha1
    kind: CassandraTask
    metadata:
      name: rebuild-dc1
    spec:
      datacenter:
        name: dc1
        namespace: demo
      jobs:
        - name: rebuild-dc1
          command: rebuild
          args:
            keyspace_name: my_keyspace

    Key options:

    • metadata.name: a unique identifier within the Kubernetes namespace where the task is submitted. While the name can be any value, consider including the cluster name to prevent collisions with other tasks.

    • spec.datacenter: a unique namespace and name combination used to determine which datacenter is the source for this operation.

    • spec.jobs[0].command: MUST be rebuild for this operation.

    • Optional: spec.jobs[0].args.keyspace_name: restricts this operation to a particular keyspace. When this value is omitted, all keyspaces are rebuilt.
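The key options above can be sketched as a small shell helper that renders the manifest from arguments. This is a minimal sketch, not part of DataStax Mission Control: the function name render_rebuild_task is hypothetical, and leaving out the whole args block when no keyspace is given is an assumption based on keyspace_name being optional.

```shell
#!/bin/sh
# Hypothetical helper: print a rebuild CassandraTask manifest to stdout.
# Arguments: 1 = task name, 2 = datacenter name, 3 = namespace,
#            4 = keyspace (optional; when absent, no args block is written).
render_rebuild_task() {
  cat <<EOF
apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: $1
spec:
  datacenter:
    name: $2
    namespace: $3
  jobs:
    - name: $1
      command: rebuild
EOF
  # Append the keyspace restriction only when a fourth argument is given.
  if [ -n "${4:-}" ]; then
    printf '      args:\n        keyspace_name: %s\n' "$4"
  fi
}
```

For example, `render_rebuild_task rebuild-dc1 dc1 demo my_keyspace > rebuild-dc1.cassandratask.yaml` reproduces the sample above, while dropping the last argument produces a task that is not restricted to one keyspace.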

  2. Submit the rebuild CassandraTask custom resource to the Data Plane Kubernetes cluster where the datacenter is deployed with this command:

    kubectl apply -f rebuild-dc1.cassandratask.yaml

    If nodetool rebuild is interrupted before completion, restart it by re-entering the command. The process resumes from the point at which it was interrupted.

    The DC-level operators perform a rolling rebuild operation, one node at a time. The order is lexicographic (dictionary order), sorted first by rack name and then by node (pod) name.
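To illustrate what lexicographic ordering means in practice, the following sketch sorts a few rack and pod names with the standard sort utility. The rack and pod names are hypothetical and only follow the naming pattern of the pod used later in this procedure. The sketch approximates the described ordering; it is not the operator's own code.

```shell
#!/bin/sh
# Illustrative only: each input line is "<rack> <pod>" with made-up names.
# Sort by rack name first, then by pod name, using byte-wise (C locale) order.
rebuild_order() {
  printf '%s\n' \
    'rack2 demo-dc1-rack2-sts-0' \
    'rack10 demo-dc1-rack10-sts-0' \
    'rack1 demo-dc1-rack1-sts-1' \
    'rack1 demo-dc1-rack1-sts-0' |
    LC_ALL=C sort -t ' ' -k1,1 -k2,2 |
    cut -d ' ' -f 2
}

rebuild_order
```

Note that dictionary order is not numeric order: in this sketch rack10 sorts before rack2, which is worth keeping in mind when naming racks.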

  3. Monitor the rebuild operation progress with this kubectl command:

    kubectl get cassandratask rebuild-dc1 -o yaml | yq .status

    Sample output:

    ...
    status:
      completionTime: "2022-10-23T23:34:38Z"
      conditions:
      - lastTransitionTime: "2022-10-23T23:34:08Z"
        status: "True"
        type: Running
      - lastTransitionTime: "2022-10-23T23:34:39Z"
        status: "True"
        type: Complete
      startTime: "2022-10-23T23:34:08Z"
      succeeded: 3

    The DC-level operators set the startTime field prior to starting the rebuild operation. They update the completionTime field when the rebuild operation is completed.

    The sample output indicates that the task is completed with the type: Complete status condition set to True. The succeeded: 3 field indicates that three (3) nodes (or pods) completed the requested task successfully. A failed field tracks a running count of pods that failed the rebuild operation.
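Rather than re-running the status command by hand, the Complete condition can be polled from a script. The sketch below is one possible approach using kubectl's JSONPath output. The function names, the 10-second default interval, and the 60-attempt limit are arbitrary assumptions, and the sketch does not inspect the failed field.

```shell
#!/bin/sh
# Print the status of the "Complete" condition for the named CassandraTask.
# Prints "True" once the task has completed; prints nothing if the
# condition is not present yet.
task_complete_status() {
  kubectl get cassandratask "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}'
}

# Poll until the task reports Complete=True or the attempts run out.
# Arguments: 1 = task name, 2 = seconds between polls, 3 = maximum attempts.
wait_for_task() {
  name=$1
  interval=${2:-10}
  attempts=${3:-60}
  while [ "$attempts" -gt 0 ]; do
    if [ "$(task_complete_status "$name")" = "True" ]; then
      echo "$name complete"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  echo "$name not complete yet" >&2
  return 1
}
```

A call such as `wait_for_task rebuild-dc1` returns a zero exit status once the task reports completion, so it can gate later steps in a script.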

  4. To verify that the datacenter is rebuilt, monitor the DSE node logs on the Data Plane cluster with this command:

    kubectl logs -f demo-dc1-rack3-sts-1 -c server-system-logger | grep "finished rebuild"

    Sample output:

    INFO  [pool-18-thread-1] 2022-10-23 23:32:18,088  StorageService.java:1895 - finished rebuild for (All keyspaces), (All tokens), 1 streaming connections, NORMAL,  included DCs: dc1 after 5 seconds receiving 3.52 MiB.
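The command above follows the log of one pod and keeps running until interrupted. For a one-off check across several pods, a loop like the following may be more convenient. It is a sketch under stated assumptions: the function names are hypothetical, the pod names are passed in by the caller, and the -f flag is dropped so that each check terminates.

```shell
#!/bin/sh
# Succeed if the pod's system log contains a "finished rebuild" line.
rebuild_finished() {
  kubectl logs "$1" -c server-system-logger | grep -q "finished rebuild"
}

# Print one line per pod name given as an argument.
report_rebuilds() {
  for pod in "$@"; do
    if rebuild_finished "$pod"; then
      echo "$pod: rebuilt"
    else
      echo "$pod: no rebuild message found"
    fi
  done
}
```

For example, `report_rebuilds demo-dc1-rack3-sts-1` checks the pod from the command above; additional pod names can be appended as further arguments.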