Restoring a backup to a specific point-in-time
A point-in-time restore uses commit log archives to restore data from a backup to a specific date and time using OpsCenter.
For a point-in-time restore, OpsCenter intelligently chooses which snapshots and commit logs to restore from based on the date and time you are restoring the cluster to. If an acceptable combination of snapshots and commit logs cannot be found, the restore fails. A detailed error message is visible in the Activity section of the OpsCenter UI.
dse-env.sh
The default location of the dse-env.sh file depends on the type of installation:
Package installations | /etc/dse/dse-env.sh
Tarball installations | installation_location/bin/dse-env.sh
Prerequisites
- For point-in-time restores to work, you must have enabled commit log backups and performed at least one snapshot backup before the time to which you are restoring.
- The Restore feature of the Backup Service uses the sstableloader utility, which currently requires the thrift server. Before restoring, ensure the thrift server is enabled on all nodes.
Note: The thrift server is only required for DSE versions earlier than 5.0 (DSE 4.8.x versions).
- When performing a point-in-time restore, the cluster topology must not have changed since the backup. Attempting to perform a point-in-time restore on a cluster whose topology has changed results in a failure. DataStax strongly recommends performing a snapshot backup both before and after changing the cluster topology. After changing the topology, you can then restore the cluster based on that backup. If reverting to the previous topology, you can use the backup with the original topology to restore the cluster.
- Known limitations:
- In Cassandra 2.1 and later, and DataStax Enterprise 4.7 and later, point-in-time restore cannot restore commit logs for keyspaces or tables that would have to be recreated.
- Point-in-time restore fails if any tables were recreated during the time period of the actual point-in-time restore.
- Restoring a snapshot that contains only the system keyspace is not allowed. There must be both system and non-system keyspaces, or only non-system keyspaces in the snapshot you want to restore.
- Restoring a snapshot that does not contain a table definition is not allowed.
- Restoring from a backup while Kerberos is enabled is not currently supported by OpsCenter.
- Restoring a snapshot to a location with insufficient disk space fails. The Restore Report indicates which nodes do not have sufficient space and how much space is necessary for a successful restore. For more information and tips for preventative measures, see Monitoring sufficient disk space for restoring backups.
Procedure
- Click .
- Click the Details link for the Backup Service.
- Click Restore Backup.
The Restore from Backup, Step 1 of 2: Select Backup dialog appears.
- Click the Point In Time tab.
- Complete your selections:
- Complete your selections:
- Click Restore Backup.
The Confirm Restore dialog appears.
Warning: If a value was not set for throttling stream output, a warning message indicates the consequences of unthrottled restores. Take one of the following actions:
- Click Cancel and set the throttle value in the Restore from Backup dialog.
- Set the stream_throughput_outbound_megabits_per_sec and inter_dc_stream_throughput_outbound_megabits_per_sec values in cassandra.yaml.
- Proceed anyway at the risk of creating network bottlenecks.
Tip: If you are using LCM to manage DSE cluster configuration, update Cluster Communication settings in cassandra.yaml in the config profile for the cluster and run a configuration job. Stream throughput (not inter-dc) is already set to 200 in LCM defaults.
- Click Start Restore to confirm when prompted.
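Before starting a restore, it can help to check whether the two throttle values named in the warning are set in cassandra.yaml on a node. The following is a minimal sketch, not an OpsCenter or DSE command: show_stream_throttle is a hypothetical helper name, and it assumes the settings appear as uncommented top-level `key: value` lines.

```shell
# Sketch only: show_stream_throttle is a hypothetical helper, not part of OpsCenter or DSE.
# Usage: show_stream_throttle /path/to/cassandra.yaml
show_stream_throttle() {
    yaml_file="$1"
    for key in stream_throughput_outbound_megabits_per_sec \
               inter_dc_stream_throughput_outbound_megabits_per_sec; do
        # Read the value from uncommented "key: value" lines; commented lines are ignored.
        value=$(sed -n "s/^${key}:[[:space:]]*\([^#[:space:]]*\).*/\1/p" "$yaml_file" | tail -n 1)
        if [ -n "$value" ]; then
            echo "$key=$value"
        else
            echo "$key is not set"
        fi
    done
}
```

Call it with the path to cassandra.yaml for your installation, for example `show_stream_throttle /etc/dse/cassandra/cassandra.yaml` (that path is an assumption for package installations; adjust it to your environment).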
Results
OpsCenter retrieves the backup data and sends the data to the nodes in the cluster. A snapshot restore is completed first, following the same process as a normal snapshot restore. After the snapshot restore successfully completes, OpsCenter instructs all agents in parallel to download the necessary commit logs, followed by a rolling commit log replay across the cluster. Each node is configured for replay and restarted after the previous node finishes successfully.
If an error occurs during a point-in-time restore for a subset of tables, you might need to manually revert the changes made to some cluster nodes. To clean up a node, edit dse-env.sh and remove the last line that specifies JVM_OPTS. For example:
export JVM_OPTS="$JVM_OPTS -Dcassandra.replayList=Keyspace1.Standard1"
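The cleanup can also be scripted. The following is a minimal sketch under stated assumptions: remove_replay_list is a hypothetical helper name, and it deliberately removes only the last JVM_OPTS line that contains -Dcassandra.replayList (a narrower match than "the last line that specifies JVM_OPTS") so that unrelated JVM options are left alone. It keeps a .bak copy of the file before changing it.

```shell
# Sketch only: remove_replay_list is a hypothetical helper, not part of OpsCenter or DSE.
# Usage: remove_replay_list /path/to/dse-env.sh
remove_replay_list() {
    env_file="$1"

    # Line number of the last JVM_OPTS line that sets cassandra.replayList.
    line_no=$(grep -n 'JVM_OPTS.*-Dcassandra\.replayList' "$env_file" | tail -n 1 | cut -d: -f1)

    if [ -z "$line_no" ]; then
        echo "No replayList entry found in $env_file" >&2
        return 1
    fi

    # Keep a backup copy, then rewrite the file without that one line.
    cp -p "$env_file" "$env_file.bak" || return 1
    awk -v n="$line_no" 'NR != n' "$env_file.bak" > "$env_file"
}
```

Run it against the dse-env.sh location for your installation type, and review the resulting file before restarting the node.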