Backup Service Overview

The Backup Service allows you to back up and restore your cluster data.

Using OpsCenter, you can schedule and manage backups, and restore from those backups, across all registered DataStax Enterprise clusters. The Backup Service:

  • Performs all functions using the REST API or visually through the OpsCenter UI
  • Delivers smart backups that always ensure full data protection, including backups of commit logs
  • Backs up data to a local server, Amazon S3, or a custom location on the local filesystem
  • Compresses backup files to save storage
  • Allows specifying retention policies on backups
  • Lets admins carry out full, table-level, or point-in-time restores for a cluster
  • Notifies operations staff should backup or restore operations fail
  • Supports cloning database clusters (e.g., copy a production cluster to a development cluster)
  • Provides detailed backup and restore reports

A backup is a snapshot of all on-disk data files (SSTable files) stored in the data directory. Backups are stored locally on each node, and you can specify additional locations in cloud backup services like Amazon S3 where the snapshot data will be copied. Backups can be taken per keyspace, for selected multiple keyspaces, or for all keyspaces in the cluster while the system is online.

If your cluster includes DSE Search or DSE Analytics nodes, a backup job that includes keyspaces with DSE Search data or the cfs keyspace for Analytics nodes will save the Search and Analytics data. Any Solr indexes will be recreated on restore.

OpsCenter stores the backup data in a way that prevents duplication of files. A backup first flushes all in-memory writes to disk, then creates hard links to the SSTable files for each keyspace. Traditional backup systems take a full backup and then incremental backups with deltas based on the last full backup. The hard-link approach instead lets you fully recreate the state of the database at the time of each backup without duplicating files. If you have configured an additional S3 location, OpsCenter creates a manifest for each backup that lists the SSTables in that backup, and uploads only the SSTable files that are new.
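The deduplication described above can be illustrated with a short Python sketch. This is not OpsCenter's implementation: the function names are hypothetical, and a plain set stands in for the contents of the S3 location. It shows that a hard-linked snapshot is a second name for the same on-disk data, and that only manifest entries missing from the remote location need to be uploaded.

```python
import os


def snapshot_with_hard_links(data_dir, snapshot_dir):
    """Hard-link every regular file in data_dir into snapshot_dir.

    No file contents are copied: each link is a second name for the same
    on-disk data. Both directories must be on the same filesystem.
    Returns the list of linked file names, which acts as the manifest.
    """
    os.makedirs(snapshot_dir, exist_ok=True)
    manifest = []
    for name in sorted(os.listdir(data_dir)):
        source = os.path.join(data_dir, name)
        if os.path.isfile(source):
            os.link(source, os.path.join(snapshot_dir, name))
            manifest.append(name)
    return manifest


def files_to_upload(manifest, already_uploaded):
    """Return the manifest entries that the remote location does not hold yet."""
    return [name for name in manifest if name not in already_uploaded]
```

Because each snapshot carries its own complete manifest, any single backup can be restored on its own, even though files shared between backups exist only once.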

You can schedule backups to run automatically, or run one-off backups manually on an ad hoc basis.

There must be enough free disk space on the node to accommodate making snapshots of your data files. A single snapshot requires little disk space. However, snapshots will cause your disk usage to grow more quickly over time because a snapshot prevents obsolete data files from being deleted. You can specify how long the snapshot data should be retained by setting a retention policy for each backup location.
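As a rough illustration of checking free space before taking snapshots, the following Python sketch reports whether a filesystem has at least a given fraction of its capacity free. The function name and the 20% threshold are assumptions chosen for the example; they are not OpsCenter settings, and the right threshold depends on how quickly your data files turn over.

```python
import shutil


def has_free_space(path, min_free_fraction=0.2):
    """Return True if the filesystem holding path has enough free space.

    min_free_fraction is a hypothetical threshold: the share of total
    capacity that should remain free.
    """
    usage = shutil.disk_usage(path)
    return usage.free / usage.total >= min_free_fraction
```

A check like this would be pointed at the data directory on each node, because that is where the snapshot hard links are created.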

Note: OpsCenter Data Backups do not show or manage manual snapshots taken using the nodetool snapshot command.

In addition to keyspace backups, the Backup Service offers commitlog backups, which allow point-in-time restores for finer-grained control of the backed-up data. Point-in-time restores are available when you enable commitlog backups in conjunction with keyspace backups. Like keyspace backups, commitlogs are retained based on a configurable retention policy.

Note: Point-in-time restores are only supported if the cluster topology has not changed since the time you want to restore to.

Backing up to Amazon S3 

When you add an S3 bucket as an additional location for storing backup snapshots, the agent sends the snapshot files to the S3 bucket automatically. To optimize storage space, each SSTable for a particular node and table is stored only once in S3.

Important: The Backup Service requires control over the data and structure of its destination locations. The AWS S3 bucket and the local file system destinations must be dedicated for use only by OpsCenter. Any additional directories or files in those destinations can prevent the Backup Service from properly conducting a backup or restore operation.

The backup files are stored in S3 in the following hierarchy:

  mybucket/
    snapshots/
      node-id1/
        sstables/
          MyKeyspace-MyTable-ic-5-Data.db
          ...
          MyKeyspace-MyTable-ic-5-TOC.txt
          MyKeyspace-MyTable-ic-6-Data.db
          ...
        1234-ABCD-2014-10-01-01-00/
          backup.json
          MyKeyspace/schema.json
        1234-ABCD-2014-09-30-01-00/
          backup.json
          MyKeyspace/schema.json
      node-id2/
        sstables/
          MyKeyspace-MyTable-ic-1-Data.db
          ...
          MyKeyspace-MyTable-ic-2-Data.db
          ...
        1234-ABCD-2014-10-01-01-00/
          backup.json
          MyKeyspace/schema.json
        1234-ABCD-2014-09-30-01-00/
          backup.json
          MyKeyspace/schema.json
    commitlogs/
      node1/
        1435432324_Commitlog-3-1432320421.log
        1435433232_Commitlog-3-1432320422.log
        ...

The backup.json file contains metadata about which of the backed up SSTables are included in that backup.

If OpsCenter encounters an error when backing up to S3, it will retry the backup a user-configurable number of times (3 by default) unless it encounters an unrecoverable error such as invalid AWS credentials.
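The retry behavior can be sketched in Python as follows. This is an illustration, not OpsCenter's code: the exception class and function name are hypothetical, and the sketch assumes that "3 retries" means up to three further attempts after the first failure.

```python
class UnrecoverableBackupError(Exception):
    """Hypothetical marker for errors that retrying cannot fix."""


def run_with_retries(operation, max_retries=3):
    """Call operation(); on a recoverable failure, retry up to max_retries times."""
    retries_used = 0
    while True:
        try:
            return operation()
        except UnrecoverableBackupError:
            raise  # for example, invalid credentials: retrying cannot help
        except Exception:
            if retries_used >= max_retries:
                raise  # retries exhausted; surface the last error
            retries_used += 1
```

Separating unrecoverable errors from transient ones avoids repeating an upload that is certain to fail again, such as one rejected for bad credentials.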

The AWS credentials and bucket names are stored in cluster_name.conf. Be sure to use proper security precautions to ensure that this file isn't readable by unauthorized users.
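One way to apply this precaution on a POSIX system is to limit the file to its owner. The Python sketch below checks and tightens the permission bits; the function names are hypothetical, and it assumes the file is owned by the account that runs OpsCenter, so that OpsCenter can still read it afterwards.

```python
import os
import stat


def is_private(path):
    """Return True if neither group nor others have any permission on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def make_private(path):
    """Restrict path to read/write by its owner only (mode 0600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Calling make_private on the configuration file and then confirming with is_private gives a quick check that the stored credentials are not readable by other accounts.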

Backup retention policies 

Each scheduled backup has a retention policy that defines how OpsCenter handles the files for older backup data. The default policy is to retain backup files for 30 days. For each backup task, you can set the time period for which to retain the snapshot data. OpsCenter supports minutes, hours, days, and weeks for the retention time period. For example, you can define a retention policy that removes snapshot data older than 30 days, or 26 weeks, or 3 hours. If you want to keep all backups, OpsCenter has a Retain All policy that retains the backup files indefinitely.

When a backup that was configured with a time-limited retention policy completes, OpsCenter scans the snapshot data for outdated files that do not belong to other snapshots and removes them at the next scheduled backup.

For example, suppose a user configures a scheduled backup that sends the data to S3, runs every week, and has a retention policy that removes backups older than 3 days. The layout in the S3 bucket looks like this:

mybucket/
  snapshots/
    node-id1/
      sstables/
        MyKeyspace-MyTable-ic-4-Data.db
        MyKeyspace-MyTable-ic-5-Data.db
        MyKeyspace-MyTable-ic-6-Data.db
        MyKeyspace-MyTable-ic-7-Data.db
        ...
      1234-ABCD-2015-01-25-01-00/
        backup.json #includes 4-Data and 5-Data
        MyKeyspace/schema.json
      1234-ABCD-2015-02-01-01-00/
        backup.json #includes 5,6,7-Data
        MyKeyspace/schema.json
   

After the February 1 backup completes, OpsCenter scans the SSTables for outdated files according to the retention policy. The January 25 backup files can be removed. Because MyKeyspace-MyTable-ic-4-Data.db was in the January 25 backup but not in the February 1 backup, it will be removed. Even though MyKeyspace-MyTable-ic-5-Data.db was in the January 25 backup, it is also in the latest backup, so it will be retained.
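The decision in this example can be reproduced with a small Python sketch. It is an illustration of the rule, not OpsCenter's code: the function name is hypothetical, each backup's manifest is modeled as a set of SSTable names, and the scan time of 01:30 on February 1 is an assumed value.

```python
from datetime import datetime, timedelta


def find_removable(backups, now, retention):
    """Return (expired backup times, SSTables safe to delete).

    backups maps each backup's timestamp to the set of SSTable names listed
    in its manifest. An SSTable is removable only if every backup that
    references it has expired.
    """
    expired = {taken for taken in backups if now - taken > retention}
    still_needed = set()
    removable = set()
    for taken, sstables in backups.items():
        if taken in expired:
            removable |= sstables
        else:
            still_needed |= sstables
    return expired, removable - still_needed


backups = {
    datetime(2015, 1, 25, 1, 0): {
        "MyKeyspace-MyTable-ic-4-Data.db",
        "MyKeyspace-MyTable-ic-5-Data.db",
    },
    datetime(2015, 2, 1, 1, 0): {
        "MyKeyspace-MyTable-ic-5-Data.db",
        "MyKeyspace-MyTable-ic-6-Data.db",
        "MyKeyspace-MyTable-ic-7-Data.db",
    },
}
expired, removable = find_removable(
    backups, now=datetime(2015, 2, 1, 1, 30), retention=timedelta(days=3)
)
print(sorted(removable))  # ['MyKeyspace-MyTable-ic-4-Data.db']
```

Subtracting the files still referenced by unexpired backups is what keeps MyKeyspace-MyTable-ic-5-Data.db in place even though the January 25 backup itself has expired.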

Commitlog backups 

Commitlog backups allow you to perform point-in-time restores, where you can specify a particular date and time from which to restore the data. Commitlog backups are configured separately from snapshot backups.

cluster_name.conf 

The location of the cluster_name.conf file depends on the type of installation:

  • Package installations: /etc/opscenter/clusters/cluster_name.conf
  • Tarball installations: install_location/conf/clusters/cluster_name.conf
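The two locations above can be expressed as a small Python helper. The function name and the convention of passing install_location only for tarball installations are assumptions made for this sketch.

```python
from pathlib import PurePosixPath


def cluster_conf_path(cluster_name, install_location=None):
    """Build the path to cluster_name.conf for either installation type.

    Pass install_location for a tarball installation; leave it as None
    for a package installation.
    """
    filename = f"{cluster_name}.conf"
    if install_location is None:
        return PurePosixPath("/etc/opscenter/clusters") / filename
    return PurePosixPath(install_location) / "conf" / "clusters" / filename
```

For a hypothetical cluster named Test_Cluster installed from a package, the helper returns /etc/opscenter/clusters/Test_Cluster.conf.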