nodetool snapshot

Backs up data and table schemas.

For all installations, the default location of the data directory is /var/lib/cassandra/data.

Warning: Always run nodetool cleanup before taking a snapshot for restore. Otherwise, invalid replicas (that is, replicas that have been superseded by new, valid replicas on newly added nodes) can get copied to the target when they should not be. This results in old data showing up on the target.
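
The ordering in the warning can be sketched as a small shell function. This is a minimal sketch rather than an official procedure: the function name cleanup_then_snapshot and the NODETOOL override variable are hypothetical names introduced here for illustration, and the sketch assumes nodetool is on the PATH of the node being backed up.

```shell
# Hypothetical helper: run cleanup first, and take the snapshot only if
# cleanup succeeded.
# NODETOOL is an assumed override so the sequence can be dry-run,
# for example NODETOOL="echo nodetool".
cleanup_then_snapshot() {
    keyspace="$1"
    tag="$2"
    # ${NODETOOL:-nodetool} is left unquoted on purpose so that an override
    # such as "echo nodetool" splits into a command and its first argument.
    ${NODETOOL:-nodetool} cleanup "$keyspace" &&
        ${NODETOOL:-nodetool} snapshot -t "$tag" "$keyspace"
}
```

For example, cleanup_then_snapshot cycling cycling_2017-3-9 would run cleanup on the cycling keyspace and then take the tagged snapshot. Because the two commands are chained with &&, a failed cleanup stops the sequence before any snapshot is taken.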

Synopsis

install_location/bin/nodetool options snapshot
     ( -cf table_name | --column-family table_name | --table table_name )
     ( -kc ktlist | --kc.list ktlist | -kt ktlist | --kt-list ktlist )
     ( -sf | --skip-flush )
     ( -t tag | --tag tag )
     -- ( keyspace_name | keyspace_name ... )

Table 1. Options
Short        Long                     Description
-h           --host                   Hostname or IP address.
-p           --port                   Port number.
-pwf         --password-file          Password file path.
-pw          --password               Password.
-u           --username               Remote JMX agent username.
-cf table    --column-family table    Name of the table to snapshot. You must specify one and only one keyspace.
             --table table            Name of the table to snapshot. You must specify one and only one keyspace.
-kc ktlist   --kc.list ktlist         Comma-separated list of keyspace_name.table_name with no spaces. For example, -kc cycling.cyclist,basketball.players
-kt ktlist   --kt-list ktlist         Comma-separated list of keyspace_name.table_name with no spaces. For example, -kt cycling.cyclist,basketball.players
-sf          --skip-flush             Executes the snapshot without flushing the tables first.
-t           --tag                    Name for the snapshot directory: installation_path/data/keyspace_name/table-UID/snapshots/snapshot_name
                                      Note: When not specified, the current time is used. For example, 1489076973698.
keyspace                              One or more optional keyspace names, separated by a space. Default: all keyspaces.
--                                    Separates an option from an argument that could be mistaken for an option.
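
Because the -kc and -kt options take a comma-separated list with no spaces, it can help to build the list programmatically. The following is a minimal sketch; join_ktlist is a hypothetical helper name introduced here, not part of nodetool, and its checks cover only the format rule stated in the table.

```shell
# Hypothetical helper: join keyspace_name.table_name arguments with commas
# and no spaces, the form described for -kc and -kt.
join_ktlist() {
    list=""
    for entry in "$@"; do
        case "$entry" in
            *" "*|*,*)
                echo "invalid entry: $entry" >&2
                return 1 ;;
            *.*) ;;   # looks like keyspace_name.table_name
            *)
                echo "expected keyspace_name.table_name: $entry" >&2
                return 1 ;;
        esac
        # Add a comma only when the list already has an entry.
        list="${list:+$list,}$entry"
    done
    printf '%s\n' "$list"
}

join_ktlist cycling.cyclist basketball.players   # prints cycling.cyclist,basketball.players
```

The result could then be passed to the command, for example nodetool snapshot -kt "$(join_ktlist cycling.cyclist basketball.players)".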

Description

A snapshot first flushes all in-memory writes to disk, then creates hard links to the SSTable files for each keyspace. You must have enough free disk space on the node to accommodate snapshots of your data files. A single snapshot requires little disk space. However, snapshots can cause disk usage to grow more quickly over time, because a snapshot prevents obsolete data files from being deleted. After the snapshot is complete, you can move the backup files to another location if needed, or leave them in place.
Note: Restoring from a snapshot requires the table schema.
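
The hard-link behavior described above can be illustrated without Cassandra. This is a sketch on a throwaway directory: the directory layout and the demo_tag name are made up for the demonstration, and the rm step only stands in for an obsolete data file being deleted.

```shell
# Stand-in for a table directory; names are illustrative only.
workdir=$(mktemp -d)
table_dir="$workdir/cyclist_name"
snapshot_dir="$table_dir/snapshots/demo_tag"
mkdir -p "$snapshot_dir"
printf 'row data\n' > "$table_dir/mc-1-big-Data.db"

# "Snapshot": a hard link to the same file, so no data is copied.
ln "$table_dir/mc-1-big-Data.db" "$snapshot_dir/mc-1-big-Data.db"

# Simulate the live file becoming obsolete and being deleted.
rm "$table_dir/mc-1-big-Data.db"

# The snapshot link still holds the content, and therefore the disk space.
snapshot_content=$(cat "$snapshot_dir/mc-1-big-Data.db")
echo "$snapshot_content"
```

The link costs almost nothing when it is created, but the file's space is not released until the snapshot link is removed as well, which is why old snapshots make disk usage grow over time.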

The snapshot directory path is: data/keyspace_name/table-UID/snapshots/snapshot_name. Data is backed up into multiple .db files and table schema is saved to schema.cql.
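
Since each table gets its own snapshot directory under this path, locating every directory for one snapshot means searching across tables. The following is a minimal sketch using find; list_snapshot_dirs is a hypothetical helper, and the demo layout (including the shortened table UIDs) is made up so the example can run without a Cassandra node.

```shell
# Hypothetical helper: print every snapshot directory with the given name
# under a data directory laid out as keyspace_name/table-UID/snapshots/name.
list_snapshot_dirs() {
    data_dir="$1"
    snapshot_name="$2"
    find "$data_dir" -type d -path "*/snapshots/$snapshot_name" | sort
}

# Throwaway layout standing in for a real data directory (UIDs are made up).
demo_data=$(mktemp -d)
mkdir -p "$demo_data/cycling/cyclist_name-aaaa/snapshots/demo_tag" \
         "$demo_data/cycling/upcoming_calendar-bbbb/snapshots/demo_tag" \
         "$demo_data/cycling/cyclist_name-aaaa/snapshots/other_tag"

# Prints the two demo_tag directories, one per table.
list_snapshot_dirs "$demo_data" demo_tag
```

On a real node the first argument would be the data directory, for example /var/lib/cassandra/data if the default location is in use.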

Note: Before upgrading, the DataStax Distribution of Apache Cassandra (DDAC) backs up all keyspaces. See taking a snapshot.

Example: All keyspaces

Take a snapshot of all keyspaces on the node:

nodetool snapshot

A message displays with the name of the snapshot directory:

Requested creating snapshot(s) for [all keyspaces] with snapshot name [1489076973698] and options {skipFlush=false}
Snapshot directory: 1489076973698
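
A default snapshot name such as 1489076973698 can be turned back into a readable date. This sketch assumes the name is the current time in milliseconds since the Unix epoch, and it assumes GNU date (the -d @seconds form), which differs on BSD and macOS. The function name snapshot_name_to_utc is hypothetical.

```shell
# Hypothetical helper: interpret a default snapshot name as milliseconds
# since the Unix epoch (an assumption) and print it as a UTC timestamp.
# Requires GNU date for the -d @seconds form.
snapshot_name_to_utc() {
    seconds=$(( $1 / 1000 ))   # drop the millisecond part
    date -u -d "@$seconds" +%Y-%m-%dT%H:%M:%SZ
}

snapshot_name_to_utc 1489076973698   # prints 2017-03-09T16:29:33Z
```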

Example: Single keyspace snapshot

Assuming you created the keyspace cycling, take a snapshot of the keyspace and name the snapshot cycling_2017-3-9:

nodetool snapshot -t cycling_2017-3-9 cycling

The following output appears:

Requested creating snapshot(s) for [cycling] with snapshot name [cycling_2017-3-9]
Snapshot directory: cycling_2017-3-9
Assuming the cycling keyspace contains two tables, cyclist_name and upcoming_calendar, taking a snapshot of the keyspace creates multiple snapshot directories named cycling_2017-3-9, one for each table. These directories contain a number of .db files holding the data, as well as the table schema. For example, from the installation directory:
ls -1 data/cycling/cyclist_name-9e516080f30811e689e40725f37c761d/snapshots/cycling_2017-3-9
manifest.json
mc-1-big-CompressionInfo.db
mc-1-big-Data.db
mc-1-big-Digest.crc32
mc-1-big-Filter.db
mc-1-big-Index.db
mc-1-big-Statistics.db
mc-1-big-Summary.db
mc-1-big-TOC.txt
schema.cql

Example: Multiple keyspaces snapshot

Assuming you created a keyspace named mykeyspace in addition to the cycling keyspace, take a snapshot of both keyspaces.

nodetool snapshot mykeyspace cycling

The following message appears:

Requested creating snapshot(s) for [mykeyspace, cycling] with snapshot name [1391460334889]
Snapshot directory: 1391460334889

Example: Single table snapshot

Take a snapshot of only the cyclist_name table in the cycling keyspace.

nodetool snapshot --table cyclist_name cycling
Requested creating snapshot(s) for [cycling] with snapshot name [1391461910600]
Snapshot directory: 1391461910600
Cassandra creates a snapshot directory named 1391461910600, which contains the data files and the schema of the cyclist_name table, in data/cycling/cyclist_name-a882dca02aaf11e58c7b8b496c707234/snapshots.

Example: Multiple tables in different keyspaces

Take a snapshot of several tables in different keyspaces, such as the cyclist_name table in the cycling keyspace and the sample_times table in the test keyspace. List the tables in a comma-separated list with no spaces.

nodetool snapshot -kt cycling.cyclist_name,test.sample_times
Requested creating snapshot(s) for [cycling.cyclist_name,test.sample_times] with snapshot name [1431045288401]
Snapshot directory: 1431045288401