Backup Service overview
The Backup Service allows backing up and restoring DSE cluster data.
cluster_name.conf
The location of the cluster_name.conf file depends on the type of installation:
- Package installations: /etc/opscenter/clusters/cluster_name.conf
- Tarball installations: install_location/conf/clusters/cluster_name.conf
Use OpsCenter to schedule and manage backups, and restore from those backups, across all registered DataStax Enterprise clusters. The Backup Service:
- Performs all functions through the OpsCenter UI or programmatically through the REST API (see the sketch after this list)
- Delivers smart backups that always ensure full data protection, including backups of commit logs
- Backs up data to a local server (On Server), Amazon S3, or a custom location on the local filesystem
- Compresses backup files to save storage
- Allows specifying retention policies on scheduled backups
- Lets administrators easily perform full, table-level, or point-in-time restores for a cluster
- Notifies operations staff should backup or restore operations fail
- Supports cloning data between clusters (such as copying data from a production cluster to a development cluster) or from another defined location (Amazon S3 or Local FS)
- Provides detailed backup and restore reports and history
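For example, a one-off backup could be triggered programmatically. The sketch below is illustrative only: the endpoint path, port, and request body are assumptions made for the example, not confirmed OpsCenter API details; consult the OpsCenter API reference for the actual calls.

# Illustrative sketch only. The endpoint path, port, and payload are assumptions
# for demonstration purposes; check the OpsCenter API reference for actual details.
import requests

OPSCENTER_URL = "http://opscenter-host:8888"   # OpsCenter host and port (assumed)
CLUSTER_ID = "MyCluster"                       # hypothetical cluster name

# Hypothetical call to trigger an ad hoc backup of a single keyspace.
response = requests.post(
    f"{OPSCENTER_URL}/{CLUSTER_ID}/backups/run",
    json={"keyspaces": ["MyKeyspace"]},
    timeout=30,
)
response.raise_for_status()
print(response.json())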
A backup is a snapshot of all on-disk data files (SSTable files) stored in the data directory. Backups are stored locally on each node (On Server), and you can copy the snapshot data to additional locations, such as a path on the local filesystem (Local FS) or a cloud storage service like Amazon S3.
- Restoring a snapshot that contains only the system keyspace is not allowed. The snapshot you want to restore must contain either both system and non-system keyspaces, or only non-system keyspaces.
- Restoring a snapshot that does not contain a table definition is not allowed.
- Restoring from a backup while Kerberos is enabled is not currently supported by OpsCenter.
- Restoring a snapshot to a location with insufficient disk space fails. The Restore Report indicates which nodes do not have sufficient space and how much space is necessary for a successful restore. For more information and tips for preventative measures, see Monitoring sufficient disk space for restoring backups.
If a cluster includes DSE Search or DSE Analytics nodes, a backup job that includes keyspaces containing Search or Analytics data saves that data as well. Any Solr indexes are recreated upon restore.
OpsCenter intelligently stores the backup data to prevent duplication of files. A backup first flushes all in-memory writes to disk, then makes a hard link of the SSTable files for each keyspace. Unlike traditional backup systems that use full backups and then incremental backups with deltas based on the last full backup, the OpsCenter approach allows you to fully recreate the state of the database at the time of each backup without duplicating files. If you have configured an additional Local FS or S3 location, OpsCenter creates a manifest for each backup that contains a list of the SSTables in that backup, and only uploads new SSTable files.
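The following is a minimal sketch of that deduplication idea, assuming a destination laid out like the S3 hierarchy shown later in this topic; the function names and paths are hypothetical and are not OpsCenter code.

# Conceptual sketch of manifest-based deduplication; names and paths are hypothetical.
import pathlib

def sstables_already_uploaded(destination: pathlib.Path) -> set:
    # SSTable data files already present at the backup destination.
    return {p.name for p in (destination / "sstables").glob("*-Data.db")}

def plan_upload(snapshot_dir: pathlib.Path, destination: pathlib.Path):
    # The manifest lists every SSTable in this backup, but only files that are
    # not already at the destination need to be uploaded.
    snapshot_sstables = sorted(p.name for p in snapshot_dir.glob("*-Data.db"))
    existing = sstables_already_uploaded(destination)
    manifest = {"sstables": snapshot_sstables}
    to_upload = [name for name in snapshot_sstables if name not in existing]
    return manifest, to_upload

# Example usage with hypothetical paths:
# manifest, to_upload = plan_upload(pathlib.Path("/path/to/snapshot"),
#                                   pathlib.Path("/backups/node-id1"))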
You can schedule backups to run automatically on a recurring interval, or run one-off backups either immediately (ad hoc) or scheduled for a later time.
Backing up data using OpsCenter
The Backup Service provides a simple interface for scheduling regular or one-off backups of all or specific keyspaces in a DataStax Enterprise (DSE) cluster, and for recovering data from the stored backups.
The Backup Service is designed to manage enterprise-wide backup and restore operations for DSE clusters. Although some administrators and operations staff believe that backups are unnecessary because of the powerful and flexible replication capabilities in DSE, proper backup and restore procedures remain essential for production clusters.
While replication does keep copies of data in multiple locations, datacenters, and cloud availability zones, all operations performed in a cluster are replicated, including operations that result in lost or incorrect data. For example, if a table is mistakenly dropped, if data is accidentally deleted, or if cluster data becomes corrupted, those adverse events are replicated to all other copies of that data. In such cases, there is no way to recover the lost data, or an uncorrupted version of it, without a backup.
Commit log backups for point-in-time restores
In addition to keyspace backups, the Backup Service can also back up commit logs to facilitate point-in-time restores for finer-grained control over the backup data. Point-in-time restores become available after enabling commit log backups in conjunction with keyspace backups. Like keyspace backups, the commit log archives are retained based on a configurable retention policy.
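As a rough illustration of how a restore point maps onto the commit log archives, the sketch below assumes that the leading number in each archived file name (as in the S3 layout shown later in this topic) is the epoch time at which the segment was archived; that interpretation is an assumption made for the example.

# Sketch: pick archived commit logs at or before a restore point. Assumes the
# leading number in each archived file name is an epoch timestamp (an assumption).
from pathlib import Path

def commitlogs_up_to(archive_dir: Path, target_epoch: int) -> list:
    selected = []
    for log in archive_dir.glob("*_Commitlog-*.log"):
        archived_at = int(log.name.split("_", 1)[0])
        if archived_at <= target_epoch:
            selected.append((archived_at, log))
    # Chronological order, ready to be replayed up to the restore point.
    return [log for _, log in sorted(selected)]

# Example with a hypothetical archive directory and restore point:
# print(commitlogs_up_to(Path("/backups/commitlogs/node1"), 1435432324))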
Backup retention policies
Each scheduled backup has a retention policy that defines how OpsCenter handles the files for older backup data. The default policy is to retain On Server backup files for 30 days; the default for Amazon S3 and Local FS locations is Retain All. For each scheduled backup task and configured location, you can set the time period for which to retain the snapshot data, expressed in minutes, hours, days, or weeks. For example, you can define a retention policy that removes snapshot data older than 30 days, 26 weeks, or 3 hours. To keep all backups, use the Retain All policy, which retains the backup files indefinitely.
When a backup that was configured with a time-limited retention policy completes, OpsCenter scans the snapshot data for outdated files that do not belong to other snapshots and removes them at the next scheduled backup.
For example, a scheduled backup sends data to S3, runs weekly, and has a retention policy of removing backups older than 3 days. The layout in the S3 bucket is as follows:
mybucket/
  snapshots/
    node-id1/
      sstables/
        MyKeyspace-MyTable-ic-4-Data.db
        MyKeyspace-MyTable-ic-5-Data.db
        MyKeyspace-MyTable-ic-6-Data.db
        MyKeyspace-MyTable-ic-7-Data.db
        ...
      1234-ABCD-2015-01-25-01-00/
        backup.json             # includes 4-Data and 5-Data
        MyKeyspace/schema.json
      1234-ABCD-2015-02-01-01-00/
        backup.json             # includes 5-Data, 6-Data, and 7-Data
        MyKeyspace/schema.json
After the February 1 backup completes, OpsCenter scans the SSTables for files that fall outside the retention policy, which makes the January 25 backup eligible for removal. Because MyKeyspace-MyTable-ic-4-Data.db appears only in the January 25 backup and not in the February 1 backup, it is removed. MyKeyspace-MyTable-ic-5-Data.db also appears in the January 25 backup, but because it is part of the latest backup as well, it is retained until that backup falls outside its retention policy.
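The pruning rule amounts to a set difference: an SSTable is removed only when every backup that references it has expired. Here is a minimal sketch reproducing the example above; the function and data structure names are hypothetical.

# Minimal sketch of the pruning rule; names and data structures are hypothetical.
from datetime import datetime, timedelta, timezone

def sstables_to_remove(backups, retention, now):
    # backups maps a backup id to {"created": datetime, "sstables": set of file names}.
    expired = {bid for bid, b in backups.items() if now - b["created"] > retention}
    kept, removable = set(), set()
    for bid, b in backups.items():
        (removable if bid in expired else kept).update(b["sstables"])
    # Remove only files that no unexpired backup still references.
    return removable - kept

backups = {
    "2015-01-25-01-00": {"created": datetime(2015, 1, 25, 1, 0, tzinfo=timezone.utc),
                         "sstables": {"MyKeyspace-MyTable-ic-4-Data.db",
                                      "MyKeyspace-MyTable-ic-5-Data.db"}},
    "2015-02-01-01-00": {"created": datetime(2015, 2, 1, 1, 0, tzinfo=timezone.utc),
                         "sstables": {"MyKeyspace-MyTable-ic-5-Data.db",
                                      "MyKeyspace-MyTable-ic-6-Data.db",
                                      "MyKeyspace-MyTable-ic-7-Data.db"}},
}
now = datetime(2015, 2, 1, 1, 30, tzinfo=timezone.utc)
print(sstables_to_remove(backups, timedelta(days=3), now))
# {'MyKeyspace-MyTable-ic-4-Data.db'}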
Backing up to Amazon S3
When you add an Amazon S3 bucket as an additional location for storing backup snapshots, the DataStax Agent automatically sends the snapshot files to that bucket. To optimize storage space, the SSTables for a particular node and table are stored only once in Amazon S3.
The backup files are stored in S3 in the following hierarchy:
mybucket/
  snapshots/
    node-id1/
      sstables/
        MyKeyspace-MyTable-ic-5-Data.db ...
        MyKeyspace-MyTable-ic-5-TOC.txt
        MyKeyspace-MyTable-ic-6-Data.db ...
      1234-ABCD-2014-10-01-01-00/
        backup.json
        MyKeyspace/schema.json
      1234-ABCD-2014-09-30-01-00/
        backup.json
        MyKeyspace/schema.json
    node-id2/
      sstables/
        MyKeyspace-MyTable-ic-1-Data.db ...
        MyKeyspace-MyTable-ic-2-Data.db ...
      1234-ABCD-2014-10-01-01-00/
        backup.json
        MyKeyspace/schema.json
      1234-ABCD-2014-09-30-01-00/
        backup.json
        MyKeyspace/schema.json
  commitlogs/
    node1/
      1435432324_Commitlog-3-1432320421.log
      1435433232_Commitlog-3-1432320422.log
      ...
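To browse such a hierarchy, the prefixes can be listed per node, for example with boto3. The sketch below assumes AWS credentials are already configured; the bucket and node names come from the example layout above.

# Sketch: list the per-backup directories for one node in the example layout.
# Assumes AWS credentials are configured; bucket and node names are from the example.
import boto3

s3 = boto3.client("s3")

def list_backup_prefixes(bucket, node_id):
    resp = s3.list_objects_v2(Bucket=bucket,
                              Prefix=f"snapshots/{node_id}/",
                              Delimiter="/")
    return [p["Prefix"] for p in resp.get("CommonPrefixes", [])]

for prefix in list_backup_prefixes("mybucket", "node-id1"):
    print(prefix)   # the sstables/ prefix plus one prefix per backup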