nodetool snapshot

Creates a backup by taking a snapshot of table data.

data

For all installations, the default location of the data directory is:
  • /var/lib/cassandra/data

A snapshot is a set of hard links to the SSTable files in the data directory for a table, created at the moment the snapshot is taken. For more information, see Taking a snapshot.
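
Because a snapshot consists of hard links rather than copies, the snapshot files share storage with the live SSTable files and stay readable after the original file names are removed. The following is a minimal sketch of that file system behavior only. It uses throwaway files in a temporary directory, and the file and directory names are hypothetical stand-ins, not real SSTables:

```shell
#!/bin/sh
# Illustration only: these files are throwaway stand-ins, not real SSTables.
set -e

workdir=$(mktemp -d)
mkdir -p "$workdir/table/snapshots/demo"

# Stand-in for an SSTable data file.
echo "row data" > "$workdir/table/mc-1-big-Data.db"

# A hard link adds a second directory entry for the same file contents.
ln "$workdir/table/mc-1-big-Data.db" "$workdir/table/snapshots/demo/mc-1-big-Data.db"

# Both names refer to the same inode (the first field printed by ls -i).
ls -i "$workdir/table/mc-1-big-Data.db" "$workdir/table/snapshots/demo/mc-1-big-Data.db"

# Removing the original name leaves the data reachable through the link.
rm "$workdir/table/mc-1-big-Data.db"
cat "$workdir/table/snapshots/demo/mc-1-big-Data.db"
```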

The snapshot directory path is: data/keyspace_name/table-UUID/snapshots/snapshot_name. Data is backed up into multiple .db files, and the table schema is saved to schema.cql. The schema.cql file captures the structure of the table at the time of the snapshot. Restoring the snapshot requires the table to have the same structure. See the DataStax Support knowledge base article Manual Backup and Restore, with Point-in-time and table-level restore.

Warning: Always run nodetool cleanup before taking a snapshot that will be used for a restore. Otherwise, invalid replicas (replicas that have been superseded by new, valid replicas on newly added nodes) can be copied to the target when they should not be, which results in old data showing up on the target.
Note: Before upgrading DataStax Enterprise, be sure to create a backup of all keyspaces. See Taking a snapshot.
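
One way to follow the warning above in a script is to chain the two commands so that the snapshot is taken only when cleanup succeeds. This is a minimal sketch, not a recommended procedure. The function name cleanup_then_snapshot is hypothetical, and the sketch assumes nodetool is on the PATH and can reach the node with default connection options:

```shell
#!/usr/bin/env bash
# Hypothetical helper: run cleanup first, then snapshot only if cleanup succeeded.
# Assumes nodetool is on the PATH and default connection options are sufficient.
cleanup_then_snapshot() {
  local tag=$1 keyspace=$2
  # && stops the sequence when cleanup fails, so no snapshot is taken in that case.
  nodetool cleanup && nodetool snapshot -t "$tag" "$keyspace"
}

# Usage (not executed here):
#   cleanup_then_snapshot cycling_2017-3-9 cycling
```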

Synopsis

nodetool [connection_options] snapshot 
[--table table_name | -kt keyspace_name.table_name,...] 
[-sf] [-t snapshotname] [--] 
[keyspace_name [keyspace_name...]]
Table 1. Legend
Syntax conventions             Description
UPPERCASE                      Literal keyword.
Lowercase                      Not literal.
Italics                        Variable value. Replace with a valid option or user-defined value.
[ ]                            Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.
( )                            Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.
|                              Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.
...                            Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.
'Literal string'               Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case.
{ key:value }                  Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value.
<datatype1,datatype2>          Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.
cql_statement;                 End CQL statement. A semicolon ( ; ) terminates all CQL statements.
[ -- ]                         Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.
' <schema> ... </schema> '     Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.
@xml_entity='xml_entity_type'  Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files.

Definition

The short form and long form parameters are comma-separated.

Connection options

-h, --host hostname
The hostname or IP address of a remote node or nodes. When omitted, the default is the local machine.
-p, --port jmx_port
The JMX port number.
-pw, --password jmxpassword
The JMX password for authenticating with secure JMX. If a password is not provided, you are prompted to enter one.
-pwf, --password-file jmx_password_filepath
The filepath to the file that stores JMX authentication credentials.
-u, --username jmx_username
The username for authenticating with secure JMX.

Command arguments

--
Separates an option from an argument that could be mistaken for an option.
--table, -cf, --column-family table_name
Table name in the specified keyspace.
-kt, --kt-list, -kc, --kc.list keyspace_name.table_name,...
Comma-separated list of keyspace_name.table_name with no spaces after the comma.

Example: cycling.cyclist,basketball.players

-sf, --skip_flush
Do not flush tables before creating the snapshot.
CAUTION: Snapshot will not contain unflushed data.
-t snapshotname, --tag snapshotname
The name of the snapshot, which is also used as the name of the snapshot directory. When not specified, the current time (a Unix timestamp in milliseconds) is used for the directory name. For example, 1489076973698.
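
A default snapshot name such as 1489076973698 is a Unix timestamp in milliseconds, so it can be converted back to a readable date when identifying an untagged snapshot. The following is a minimal sketch. The function name snapshot_name_to_date is hypothetical, and the sketch assumes GNU date, because the -d @SECONDS form is not supported by BSD or macOS date:

```shell
#!/usr/bin/env bash
# Convert a default snapshot name (Unix time in milliseconds) to a UTC timestamp.
# Assumes GNU date; the -d @SECONDS form is not available in BSD/macOS date.
snapshot_name_to_date() {
  local millis=$1
  # Integer division drops the milliseconds, leaving whole seconds since the epoch.
  date -u -d "@$((millis / 1000))" '+%Y-%m-%dT%H:%M:%SZ'
}

snapshot_name_to_date 1489076973698
```

For the example name above, this prints 2017-03-09T16:29:33Z.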

Examples

Take snapshot of all keyspaces on the node

Run nodetool cleanup before taking the snapshot:
nodetool cleanup
Take a snapshot of all keyspaces:
nodetool snapshot

Results include the name of the snapshot directory:

Requested creating snapshot(s) for [all keyspaces] with snapshot name [1489076973698] and options {skipFlush=false}
Snapshot directory: 1489076973698
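
When the snapshot is taken from a script, the generated name can be captured from the "Snapshot directory:" line shown above. The following is a minimal sketch that assumes the output format matches this example. The function name snapshot_name_from_output is hypothetical:

```shell
#!/usr/bin/env bash
# Extract the snapshot name from the "Snapshot directory: ..." line.
# Reads nodetool snapshot output on standard input; prints nothing if the line is absent.
snapshot_name_from_output() {
  sed -n 's/^Snapshot directory: //p'
}

# Usage (not executed here):
#   name=$(nodetool snapshot | snapshot_name_from_output)
```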

Create tagged snapshot of keyspace

Run nodetool cleanup before taking the snapshot:
nodetool cleanup
Take a tagged snapshot of the cycling keyspace in the cycling_2017-3-9 directory:
nodetool snapshot -t cycling_2017-3-9 cycling

Results:

Requested creating snapshot(s) for [cycling] with snapshot name [cycling_2017-3-9] and options {skipFlush=false}
Snapshot directory: cycling_2017-3-9

The cycling keyspace contains two tables, cyclist_name and upcoming_calendar. The snapshot creates one snapshot directory named cycling_2017-3-9 for each table. Each directory contains the .db files that hold the table data, along with the table schema.

For example, from the DSE installation directory:

ls -1 data/cycling/cyclist_name-9e516080f30811e689e40725f37c761d/snapshots/cycling_2017-3-9
manifest.json
mc-1-big-CompressionInfo.db
mc-1-big-Data.db
mc-1-big-Digest.crc32
mc-1-big-Filter.db
mc-1-big-Index.db
mc-1-big-Statistics.db
mc-1-big-Summary.db
mc-1-big-TOC.txt
schema.cql
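
Because each table gets its own snapshot directory, locating everything that belongs to one tag means searching across the table directories. The following is a minimal sketch based on the data/keyspace_name/table-UUID/snapshots/snapshot_name layout. The function name find_snapshot_dirs is hypothetical, and the sketch assumes a find that supports -mindepth and -maxdepth (GNU and BSD find do) and a tag without glob characters:

```shell
#!/usr/bin/env bash
# List every per-table directory that belongs to one snapshot tag.
# data_dir is the data directory; tag is the snapshot name.
find_snapshot_dirs() {
  local data_dir=$1 tag=$2
  # Layout: data_dir/keyspace_name/table-UUID/snapshots/snapshot_name (depth 4).
  find "$data_dir" -mindepth 4 -maxdepth 4 -type d -path "*/snapshots/$tag" | sort
}

# Usage (not executed here):
#   find_snapshot_dirs /var/lib/cassandra/data cycling_2017-3-9
```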

Take snapshot of multiple keyspaces

Run nodetool cleanup before taking the snapshot:
nodetool cleanup
Take a snapshot of the mykeyspace and cycling keyspaces:
nodetool snapshot mykeyspace cycling
Results:
Requested creating snapshot(s) for [mykeyspace, cycling] with snapshot name [1391460334889] and options {skipFlush=false}
Snapshot directory: 1391460334889

Take snapshot of single table

Run nodetool cleanup before taking the snapshot:
nodetool cleanup

Take a snapshot of the cyclist_name table in the cycling keyspace:

nodetool snapshot --table cyclist_name cycling
Results:
Requested creating snapshot(s) for [cycling] with snapshot name [1391461910600] and options {skipFlush=false}
Snapshot directory: 1391461910600

The resulting snapshot directory, 1391461910600, is created in data/cycling/cyclist_name-a882dca02aaf11e58c7b8b496c707234/snapshots and contains the data files and the schema of the cyclist_name table.

Take snapshot of multiple tables in different keyspaces

Run nodetool cleanup before taking the snapshot:
nodetool cleanup

Take a snapshot of the cyclist_name table in the cycling keyspace and the sample_times table in the test keyspace:

nodetool snapshot -kt cycling.cyclist_name,test.sample_times
Results:
Requested creating snapshot(s) for [cycling.cyclist_name,test.sample_times] with snapshot name [1431045288401] and options {skipFlush=false}
Snapshot directory: 1431045288401
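
The -kt value must be a comma-separated list with no spaces, which is easy to get wrong when the table names come from a script. The following is a minimal sketch that builds the list from a bash array. The function name join_kt_list is hypothetical:

```shell
#!/usr/bin/env bash
# Join keyspace_name.table_name entries with commas and no spaces, as -kt expects.
join_kt_list() {
  # "$*" joins the arguments with the first character of IFS.
  local IFS=,
  echo "$*"
}

tables=(cycling.cyclist_name test.sample_times)
join_kt_list "${tables[@]}"

# Usage (not executed here):
#   nodetool snapshot -kt "$(join_kt_list "${tables[@]}")"
```

For the two tables above, the function prints cycling.cyclist_name,test.sample_times.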

Create snapshot of keyspace without flushing tables

Run nodetool cleanup before taking the snapshot:
nodetool cleanup
Take a snapshot of the cycling keyspace with the -sf option:
nodetool snapshot cycling -sf

Results:

Requested creating snapshot(s) for [cycling] with snapshot name [1431045288401] and options {skipFlush=true}
Snapshot directory: 1431045288401