Restore a backup of data

The Mission Control package includes Medusa, a tool for restoring Apache Cassandra® and DataStax Enterprise (DSE) data in Kubernetes.

For a restore to be possible, a MedusaBackup object must exist.

Restore a backup

Using the UI:

  1. Access Mission Control’s UI.

  2. In the Home Clusters dialog, click the target cluster namespace.

  3. Click the Backups tab.

  4. In the Backup Activity section, click the overflow menu icon (3 dots) on the row of your target datacenter.

  5. Select Restore Backup.

  6. Review the warning in the confirmation dialog, and then click Restore Backup.

To view notifications from the restore activity, see Monitor restore activity status.

Using the CLI:

  1. Access Mission Control’s CLI.

  2. To restore an existing backup for an Apache Cassandra® or DataStax Enterprise (DSE) datacenter, create the MedusaRestoreJob Custom Resource (CR) for your release in the same namespace as the MissionControlCluster or cassandraDatacenter resource. This example targets the cassandraDatacenter named demo-dc1:

    apiVersion: medusa.missioncontrol.datastax.com/v1beta1
    kind: MedusaRestoreJob
    metadata:
      name: restore-backup1
      namespace: project-4vcx
    spec:
      cassandraDatacenter: demo-dc1
      backup: medusa-backup1

    The spec.backup value must match the metadata.name value of the MedusaBackup object. As soon as Mission Control detects the new MedusaRestoreJob object, it orchestrates the shutdown of all Cassandra or DSE pods, and the medusa-restore container restores the data when the pods restart. The restore runs on all nodes concurrently.
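To make the name matching concrete, the fragment below sketches a MedusaBackup object that the restore job above could reference. It is illustrative only and was not captured from a cluster: the apiVersion is assumed to match the other Medusa resources on this page, the namespace is assumed to match the restore job, and the spec is abbreviated.

```yaml
# Illustrative MedusaBackup object (abbreviated; not real cluster output).
# apiVersion is assumed to be the same group/version as MedusaRestoreJob.
apiVersion: medusa.missioncontrol.datastax.com/v1beta1
kind: MedusaBackup
metadata:
  name: medusa-backup1        # MedusaRestoreJob spec.backup must equal this value
  namespace: project-4vcx     # assumed: same namespace as the restore job
spec:
  cassandraDatacenter: demo-dc1
```

If the names differ, the restore job has no backup to resolve, so confirm the MedusaBackup name before creating the MedusaRestoreJob.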

Monitor restore activity status

Using the UI:

  1. Access Mission Control’s UI.

  2. In the main navigation, click Activities.

  3. See Status notifications regarding the progress of the restore activity.

    A status of SUCCESS indicates the restore completed without issue. The Start and End columns list the timestamps of the restore activity.

Using the CLI:

  1. Access Mission Control’s CLI.

  2. Check whether finishTime is set in the MedusaRestoreJob object status. The restore is complete once this field is populated.

    kubectl get medusarestorejob/restore-backup1 -n project-4vcx -o yaml

    Sample results:

    apiVersion: medusa.missioncontrol.datastax.com/v1beta1
    kind: MedusaRestoreJob
    metadata:
      name: restore-backup1
    spec:
      backup: medusa-backup1
      cassandraDatacenter: demo-dc1
    status:
      datacenterStopped: "2022-01-06T16:45:09Z"
      finishTime: "2022-01-06T16:48:23Z"
      restoreKey: ec5b35c1-f2fe-4465-a74f-e29aa1d467ff
      restorePrepared: true
      startTime: "2022-01-06T16:44:53Z"

Synchronize MedusaBackup objects with a Medusa storage backend

To restore a backup taken on a different Cassandra or DSE cluster, you must execute a synchronization task to create the corresponding MedusaBackup object locally.

  1. Choose a backup Custom Resource Definition (CRD) that matches your release.

  2. Create a MedusaTask object in the Kubernetes cluster and namespace where the referenced MissionControlCluster or cassandraDatacenter is deployed.

  3. Define a sync operation in the MedusaTask object:

    apiVersion: medusa.missioncontrol.datastax.com/v1beta1
    kind: MedusaTask
    metadata:
      name: sync-backups-1
      namespace: project-4vcx
    spec:
      cassandraDatacenter: demo-dc1
      operation: sync

Creating the MedusaTask object triggers a reconciliation that performs the following operations:

  • List the backups in the remote storage system.

  • Create MedusaBackup objects for backups that are missing locally.

  • Delete local MedusaBackup objects whose backups are missing from the remote storage system.

When reconciliation completes, the MedusaTask object status is updated with the finish time and the name of the pod that was used to communicate with the storage backend:

    ...
    status:
      finishTime: '2022-07-26T08:15:55Z'
      finished:
        - podName: demo-dc2-default-sts-0
      startTime: '2022-07-26T08:15:54Z'
    ...

Purge backups

The Medusa resource type has two storageProperties settings to control the retention of backups:

  • maxBackupAge

  • maxBackupCount
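As a rough sketch, the two settings might appear in the Medusa configuration as shown below. The values are arbitrary examples, the unit of maxBackupAge is assumed to be days, and the parent keys above the medusa block are omitted because they depend on the CRD for your release.

```yaml
# Fragment only: the keys that enclose "medusa" depend on your release's CRD.
medusa:
  storageProperties:
    maxBackupAge: 7      # example value; assumed to be expressed in days
    maxBackupCount: 5    # example value; assumed to cap the number of backups kept
```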

These settings are used by the medusa purge operation to determine which backups to delete. See the backup CRDs and their properties for your release.
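A purge can presumably be requested the same way as a sync, by creating a MedusaTask object. The manifest below is a sketch modeled on the sync example on this page: the task name is hypothetical, and the operation value purge is an assumption to verify against the MedusaTask CRD for your release.

```yaml
# Sketch modeled on the sync MedusaTask; verify "operation: purge" against
# the MedusaTask CRD for your release before using it.
apiVersion: medusa.missioncontrol.datastax.com/v1beta1
kind: MedusaTask
metadata:
  name: purge-backups-1       # hypothetical task name
  namespace: project-4vcx
spec:
  cassandraDatacenter: demo-dc1
  operation: purge
```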

Mission Control schedules the purge operation on all Cassandra or DSE nodes in the datacenter and applies Medusa’s purge rules on the data per node. Upon purge completion, the MedusaTask object status is updated with the finishTime and the purge statistics of each node:

    status:
      finishTime: '2022-07-26T08:42:33Z'
      finished:
        - nbBackupsPurged: 3
          nbObjectsPurged: 814
          podName: demo-dc2-default-sts-1
          totalObjectsWithinGcGrace: 542
          totalPurgedSize: 10770961
        - nbBackupsPurged: 3
          nbObjectsPurged: 852
          podName: demo-dc2-default-sts-2
          totalObjectsWithinGcGrace: 520
          totalPurgedSize: 10787447
        - nbBackupsPurged: 3
          nbObjectsPurged: 808
          podName: demo-dc2-default-sts-0
          totalObjectsWithinGcGrace: 444
          totalPurgedSize: 10903221
      startTime: '2022-07-26T08:37:48Z'

The purge task generates a sync task to delete the purged backups from the Kubernetes storage.

References

See the release-specific list of restore Custom Resource Definition (CRD) files and properties for:

  • MedusaRestoreJob

  • MedusaTask

Find additional documentation and source code in the Medusa GitHub repository.
