Restoring from a snapshot

Restoring a keyspace from a snapshot requires all snapshot files for each table in the keyspace and, if you use incremental backups, any incremental backup files created after the snapshot was taken. SSTables streamed to the node (from repair, decommission, and so on) are also hard-linked into the incremental backups and must be included.
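As a rough illustration of gathering those files for one table, the following sketch lists the contents of a snapshot directory and of the incremental backup directory. The function name collect_restore_files is hypothetical, and the sketch assumes that incremental backups are stored in a backups directory next to the snapshots directory. It does not filter backup files by age, so selecting only the files created after the snapshot is left to the operator.

```python
from pathlib import Path


def collect_restore_files(table_dir, snapshot_name):
    """Return (snapshot_files, backup_files) for one table directory.

    Assumes the table_name-UUID/snapshots/snapshot_name layout and a
    sibling backups directory for incremental backups.
    """
    table_dir = Path(table_dir)
    snapshot_dir = table_dir / "snapshots" / snapshot_name
    backups_dir = table_dir / "backups"

    if not snapshot_dir.is_dir():
        raise FileNotFoundError(f"snapshot directory not found: {snapshot_dir}")

    snapshot_files = sorted(p for p in snapshot_dir.iterdir() if p.is_file())

    # The backups directory is absent when incremental backups are not in use.
    if backups_dir.is_dir():
        backup_files = sorted(p for p in backups_dir.iterdir() if p.is_file())
    else:
        backup_files = []

    return snapshot_files, backup_files
```

Calling the function once per table directory in the keyspace gives the full set of candidate files for a keyspace-level restore.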

Restoring from snapshots and incremental backups temporarily causes intensive CPU and I/O activity on the node being restored.

Restoring from local nodes

This method copies the SSTables from the snapshots directory into the correct data directories.

  1. Make sure the table schema exists and is the same as when the snapshot was created.

    The nodetool snapshot command saves the table schema to a schema.cql file in the snapshot directory. If the table does not exist, recreate it using that file.

  2. If necessary, truncate the table.

    In some cases you do not need to truncate. For example, if a node lost a disk, you might restart the node so that it continues to receive new writes, and then begin the restore.

    Truncating is usually necessary. For example, if data was deleted accidentally, the tombstone from that delete has a later write timestamp than the data in the snapshot. If you restore without truncating, the tombstone remains in place and continues to shadow the restored data. Other types of overwrites cause the same problem.

  3. Locate the most recent snapshot folder. For example:

    /var/lib/cassandra/data/keyspace_name/table_name-UUID/snapshots/snapshot_name

  4. Copy the contents of the most recent snapshot directory to the /var/lib/cassandra/data/keyspace_name/table_name-UUID directory.

    For all installations, the default location of the data directory is /var/lib/cassandra/data.

  5. Run nodetool refresh for the restored keyspace and table.
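Steps 3 through 5 can be sketched roughly as follows. This is a minimal illustration, not a supported tool: the function names restore_snapshot_files and refresh_command are hypothetical, and the sketch assumes the snapshot layout shown in step 3. The second function only builds the nodetool refresh command line so that it can be reviewed before it is run.

```python
import shutil
from pathlib import Path


def restore_snapshot_files(table_dir, snapshot_name):
    """Copy every file from a snapshot into the table's data directory.

    table_dir is the table_name-UUID directory; returns the copied names.
    """
    table_dir = Path(table_dir)
    snapshot_dir = table_dir / "snapshots" / snapshot_name
    if not snapshot_dir.is_dir():
        raise FileNotFoundError(f"snapshot directory not found: {snapshot_dir}")

    copied = []
    for src in sorted(snapshot_dir.iterdir()):
        if src.is_file():
            # copy2 keeps file metadata such as modification times.
            shutil.copy2(src, table_dir / src.name)
            copied.append(src.name)
    return copied


def refresh_command(keyspace, table):
    """Build the nodetool refresh invocation for one table."""
    return ["nodetool", "refresh", keyspace, table]
```

After the copy completes, the returned command list can be passed to subprocess.run with check=True on the node being restored, for example refresh_command("cycling", "cyclist_expenses").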

Restoring from centralized backups

This method uses sstableloader to restore snapshots.

  1. Verify that the SSTable version is compatible with the current version of DataStax Enterprise (DSE):

    1. Locate the version in the file names.

      Use the version in the SSTable file name to determine compatibility and upgrade requirements. The first two letters of the file name are the version: the first letter indicates the major version and the second letter indicates the minor version. For example, the version of the following SSTable is mc:

      data/cycling/cyclist_expenses-2d955621194c11e7a38d9504a063a84e/mc-6-big-Data.db

    2. If the version is not compatible, use the matching DSE version of sstableupgrade to create a compatible version.

      For SSTable compatibility and upgrading, see SSTable compatibility.

  2. Make sure the table schema exists and is the same as when the snapshot was created.

    The nodetool snapshot command saves the table schema to a schema.cql file in the snapshot directory. If the table does not exist, recreate it using that file.

  3. If necessary, truncate the table.

    In some cases you do not need to truncate. For example, if a node lost a disk, you might restart the node so that it continues to receive new writes, and then begin the restore.

    Truncating is usually necessary. For example, if data was deleted accidentally, the tombstone from that delete has a later write timestamp than the data in the snapshot. If you restore without truncating, the tombstone remains in place and continues to shadow the restored data. Other types of overwrites cause the same problem.

  4. Restore the most recent snapshot using the sstableloader tool on the backed-up SSTables.

    The sstableloader tool streams the SSTables to the correct nodes. You do not need to remove the commit logs, drain the nodes, or restart the nodes.
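The version check in step 1 and the load in step 4 can be sketched as follows. Both function names are hypothetical. The sketch assumes SSTable file names that follow the pattern in the example above (version, generation, format, component) and a backup directory whose path ends in keyspace_name/table_name, which is the layout sstableloader expects. The -d option passes the initial contact hosts; the command is only built here, not run.

```python
from pathlib import Path


def sstable_version(file_name):
    """Return the two-letter version prefix of an SSTable file name."""
    base = Path(file_name).name
    version = base.split("-", 1)[0]
    if len(version) != 2 or not version.isalpha():
        raise ValueError(f"unexpected SSTable file name: {base}")
    return version


def sstableloader_command(hosts, sstable_dir):
    """Build an sstableloader invocation for one backed-up table directory."""
    # -d takes a comma-separated list of initial hosts to contact.
    return ["sstableloader", "-d", ",".join(hosts), str(sstable_dir)]


version = sstable_version(
    "data/cycling/cyclist_expenses-2d955621194c11e7a38d9504a063a84e/mc-6-big-Data.db"
)
print(version)  # prints "mc"
```

Whether a given version such as mc is compatible still has to be checked against the SSTable compatibility information for the installed DSE version; the sketch only extracts the value to compare.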
