Restore from a snapshot

Restoring a keyspace from a snapshot requires all snapshot files for each table in the keyspace and, if you use incremental backups, any incremental backup files created after the snapshot was taken. Streamed SSTables (from repair, decommission, and so on) are also hardlinked and included.

Restoring from snapshots and incremental backups temporarily causes intensive CPU and I/O activity on the node being restored.

Restoring from local nodes

This method copies the SSTables from the snapshots directory into the correct data directories.

Make sure the table schema exists and is the same as when the snapshot was created.

The nodetool snapshot command creates a table schema in the resulting snapshot directory. If the table doesn’t exist, recreate it using the schema.cql file from the snapshot.

A snapshot is a set of hardlinks to the SSTable files in a table's data directory at the moment the snapshot is taken. The data is backed up as multiple .db files, and the table schema is saved to schema.cql.

If necessary, TRUNCATE the target table.

Truncating is usually necessary. For example, if there was an accidental deletion of data, the tombstone from that delete has a later write timestamp than the data in the snapshot. If you restore without truncating (removing the tombstone), the database continues to shadow the restored data. This behavior also occurs for other types of overwrites and causes the same problem.
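
The shadowing described above can be illustrated with a minimal sketch of timestamp-based reconciliation. It is a simplified model, not the database's actual code: the Cell type, the reconcile function, and the sample values are all hypothetical, and ties between equal timestamps are not modeled faithfully.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """Hypothetical stand-in for one stored value and its write timestamp."""
    value: Optional[str]  # None models a tombstone (a delete marker)
    timestamp: int        # write timestamp; larger means later


def reconcile(a: Cell, b: Cell) -> Cell:
    """Return the cell with the later write timestamp (simplified model)."""
    return a if a.timestamp >= b.timestamp else b


# Data captured in the snapshot, written before the accidental delete.
snapshot_cell = Cell("alice@example.com", timestamp=1_000)
# Tombstone written by the accidental delete, after the snapshot was taken.
tombstone = Cell(None, timestamp=2_000)

visible = reconcile(snapshot_cell, tombstone)
print(visible.value is None)  # the restored value stays hidden
```

In this model the restored cell becomes visible only once the tombstone is gone, which is the effect of truncating the table before the restore.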

You might not need to truncate the table under certain conditions. For example, if a node lost a disk, you might restart before restoring so that the node continues to receive new writes before starting the restore procedure.

Locate the snapshot directory. The path follows the pattern /var/lib/cassandra/data/<keyspace_name>/<table_name>-<UUID>/snapshots/<snapshot_name>.
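
As a small sketch of the path pattern, the helper below assembles a snapshot directory from its parts. The function name snapshot_dir and the snapshot name my_snapshot are hypothetical; the keyspace and table directory are borrowed from the SSTable example elsewhere on this page, and the data root is assumed to be the default shown in the pattern.

```python
from pathlib import PurePosixPath


def snapshot_dir(keyspace: str, table_dir: str, snapshot_name: str,
                 data_root: str = "/var/lib/cassandra/data") -> PurePosixPath:
    """Build <data_root>/<keyspace>/<table_name>-<UUID>/snapshots/<snapshot_name>.

    table_dir is the on-disk table directory, that is "<table_name>-<UUID>".
    """
    return PurePosixPath(data_root) / keyspace / table_dir / "snapshots" / snapshot_name


path = snapshot_dir(
    "cycling",
    "cyclist_expenses-e4f31e122bc511e8891b23da85222d3d",
    "my_snapshot",  # hypothetical snapshot name
)
print(path)
```

If the node uses a non-default data directory, pass it as data_root.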

Run nodetool import, specifying the data directory that contains the snapshot of the SSTable files.
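
The sketch below only assembles a nodetool import command line and prints it; it does not run anything. It assumes the positional form keyspace, table, directory, and the directory shown is a hypothetical placeholder. Check the nodetool import reference for your DSE version before using it.

```python
import shlex
from typing import List


def nodetool_import_command(keyspace: str, table: str, directory: str) -> List[str]:
    """Build (but do not execute) a nodetool import argument list."""
    return ["nodetool", "import", keyspace, table, directory]


cmd = nodetool_import_command(
    "cycling",
    "cyclist_expenses",
    "/path/to/snapshot/sstables",  # hypothetical placeholder directory
)
print(shlex.join(cmd))
```

Keeping the command as a list avoids quoting problems if it is later passed to a process launcher.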

Restoring from centralized backups

This method uses sstableloader to restore snapshots.

Verify that the SSTable version is compatible with the current version of DSE.
1. Locate the version in the file names.
  
  Use the version number and format in the SSTable file name to determine compatibility and upgrade requirements. The first two letters of the file name represent the version. The first letter indicates a major version, and the second letter indicates a minor version. For example, the following SSTable version is aa and the format is bti:
  data/cycling/cyclist_expenses-e4f31e122bc511e8891b23da85222d3d/aa-1-bti-Data.db
2. Use the correct DSE version of sstableupgrade to create a compatible version.
  
  For details on SSTable versions and compatibility, see DSE product compatibility.
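
To make the file name convention concrete, here is a minimal sketch that extracts the version and format from an SSTable file name like the one above. The function name parse_sstable_name, the regular expression, and the field labels id and component are assumptions made for illustration; only the version and format positions come from the description above.

```python
import re
from pathlib import PurePosixPath
from typing import Dict

# Assumed layout: <version>-<id>-<format>-<component>, for example aa-1-bti-Data.db
_SSTABLE_NAME = re.compile(
    r"^(?P<version>[a-z]{2})-(?P<id>[^-]+)-(?P<format>[a-z]+)-(?P<component>.+)$"
)


def parse_sstable_name(path: str) -> Dict[str, str]:
    """Split an SSTable file name into its dash-separated parts."""
    name = PurePosixPath(path).name
    match = _SSTABLE_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized SSTable file name: {name!r}")
    return match.groupdict()


info = parse_sstable_name(
    "data/cycling/cyclist_expenses-e4f31e122bc511e8891b23da85222d3d/aa-1-bti-Data.db"
)
major, minor = info["version"][0], info["version"][1]
print(info["version"], info["format"], major, minor)
```

File names that do not match the assumed layout raise ValueError instead of returning a guess.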

Make sure the table schema exists and is the same as when the snapshot was created.

The nodetool snapshot command creates a table schema in the resulting snapshot directory. If the table doesn’t exist, recreate it using the schema.cql file from the snapshot.

If necessary, TRUNCATE the target table.

Truncating is usually necessary. For example, if there was an accidental deletion of data, the tombstone from that delete has a later write timestamp than the data in the snapshot. If you restore without truncating (removing the tombstone), the database continues to shadow the restored data. This behavior also occurs for other types of overwrites and causes the same problem.

You might not need to truncate the table under certain conditions. For example, if a node lost a disk, you might restart before restoring so that the node continues to receive new writes before starting the restore procedure.

Restore the most recent snapshot using the sstableloader tool on the backed-up SSTables.

The sstableloader streams the SSTables to the correct nodes. You do not need to remove the commitlogs, or to drain or restart the nodes.
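
As a hedged sketch, the helper below builds an sstableloader command line and prints it without running it. It assumes the -d option for the list of initial contact nodes and a directory path that ends in the keyspace and table names; the host addresses and the backup path are hypothetical placeholders. Confirm the options against the sstableloader reference for your DSE version.

```python
import shlex
from typing import List, Sequence


def sstableloader_command(hosts: Sequence[str], sstable_dir: str) -> List[str]:
    """Build (but do not execute) an sstableloader argument list."""
    return ["sstableloader", "-d", ",".join(hosts), sstable_dir]


cmd = sstableloader_command(
    ["10.0.0.1", "10.0.0.2"],                   # hypothetical node addresses
    "/backups/restore/cycling/cyclist_expenses",  # hypothetical path: .../<keyspace>/<table>
)
print(shlex.join(cmd))
```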
