nodetool snapshot

Creates a backup by taking a snapshot of table data.

Data directory

  • For all installations, the default location of the data directory is /var/lib/cassandra/data.

A snapshot is a set of hard links to the SSTable files in the data directory for a table, created at the moment the snapshot is executed.
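The hard-link behavior can be illustrated without a running node. The following is a minimal sketch that uses a scratch directory rather than a real data directory; the file name and content are stand-ins chosen for illustration, not output from DSE.

```shell
# Illustration only: mimic a snapshot's hard link in a scratch directory.
set -eu

work=$(mktemp -d)
mkdir -p "$work/table/snapshots/demo"

# Stand-in for an SSTable data file (hypothetical name and content).
echo "row data" > "$work/table/mc-1-big-Data.db"

# A snapshot entry is a hard link: a second name for the same file on disk.
ln "$work/table/mc-1-big-Data.db" "$work/table/snapshots/demo/mc-1-big-Data.db"

# -ef is true when both paths refer to the same underlying file.
if [ "$work/table/mc-1-big-Data.db" -ef "$work/table/snapshots/demo/mc-1-big-Data.db" ]; then
  same_file=yes
else
  same_file=no
fi
echo "same file: $same_file"

# Removing the live file leaves the snapshot link readable.
rm "$work/table/mc-1-big-Data.db"
snapshot_content=$(cat "$work/table/snapshots/demo/mc-1-big-Data.db")
echo "snapshot still holds: $snapshot_content"

rm -rf "$work"
```

Because the link shares storage with the original file, creating it is fast and takes no extra disk space until the original SSTable is removed.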

The snapshot directory path is: data/keyspace_name/table-UID/snapshots/snapshot_name. Data is backed up into multiple .db files and the table schema is saved to schema.cql. The schema.cql file captures the structure of the table at the time of the snapshot, because restoring the snapshot requires the table to have the same structure. See the DataStax Support knowledge base article Manual Backup and Restore, with Point-in-time and table-level restore.

Warning: Always run nodetool cleanup before taking a snapshot that will be used for a restore. Otherwise, invalid replicas (replicas that have been superseded by new, valid replicas on newly added nodes) can be copied to the target when they should not be. This results in old data showing up on the target.
Note: Before upgrading DataStax Enterprise, be sure to create a backup of all keyspaces. See taking a snapshot.
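The cleanup-then-snapshot order described in the warning can be sketched as a small wrapper. The helper name cleanup_then_snapshot and the NODETOOL override variable are assumptions made for this example and are not part of nodetool.

```shell
# NODETOOL can be overridden, for example NODETOOL=echo for a dry run.
NODETOOL="${NODETOOL:-nodetool}"

# cleanup_then_snapshot TAG KEYSPACE
# Hypothetical helper: runs cleanup on the keyspace, then takes the
# snapshot only if cleanup succeeded.
cleanup_then_snapshot() {
  tag=$1
  keyspace=$2
  $NODETOOL cleanup "$keyspace" && $NODETOOL snapshot -t "$tag" "$keyspace"
}
```

For example, cleanup_then_snapshot cycling_2017-3-9 cycling runs nodetool cleanup cycling followed by nodetool snapshot -t cycling_2017-3-9 cycling. Setting NODETOOL=echo first prints the two commands instead of running them.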

Synopsis

nodetool [connection_options] snapshot 
[--table table_name | -kt keyspace_name.table_name,...] 
[-sf] [-t snapshotname] [--] 
[keyspace_name [keyspace_name...]]
Table 1. Legend
Syntax conventions Description
UPPERCASE Literal keyword.
Lowercase Not literal.
Italics Variable value. Replace with a valid option or user-defined value.
[ ] Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.
( ) Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.
| Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.
... Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.
'Literal string' Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case.
{ key:value } Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value.
<datatype1,datatype2> Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.
cql_statement; End CQL statement. A semicolon ( ; ) terminates all CQL statements.
[ -- ] Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.
' <schema> ... </schema> ' Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.
@xml_entity='xml_entity_type' Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files.

Definition

The short form and long form parameters are comma-separated.

Connection options

-h, --host hostname
The hostname or IP address of a remote node or nodes. When omitted, the default is the local machine.
-p, --port jmx_port
The JMX port number.
-pw, --password jmxpassword
The JMX password for authenticating with secure JMX. If a password is not provided, you are prompted to enter one.
-pwf, --password-file jmx_password_filepath
The filepath to the file that stores JMX authentication credentials.
-u, --username jmx_username
The user name for authenticating with secure JMX.

Command arguments

--
Separates an option from an argument that could be mistaken for an option.
--table, -cf, --column-family table_name
Table name in the specified keyspace.
-kt, --kt-list, -kc, --kc.list keyspace_name.table_name,...
Comma-separated list of keyspace_name.table_name with no spaces after the comma. For example,

cycling.cyclist,basketball.players

-sf, --skip_flush
Do not flush tables before creating the snapshot.
CAUTION: Snapshot will not contain unflushed data.
-t snapshotname, --tag snapshotname
The snapshot name, which is also used as the name of the snapshot directory. When not specified, the current time (a timestamp in milliseconds) is used for the directory name. For example, 1489076973698.
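A timestamp name is hard to recognize later, so a descriptive tag is often easier to work with. The following sketch builds a tag from a keyspace name and the current date; the keyspace_date naming scheme is an assumption for illustration, not a convention that nodetool requires.

```shell
# Build a tag such as cycling_2017-03-09 from the keyspace name and today's date.
keyspace=cycling                      # example keyspace used on this page
tag="${keyspace}_$(date +%Y-%m-%d)"

# Print the command rather than running it, so it can be reviewed first.
echo "nodetool snapshot -t $tag $keyspace"
```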

Examples

Take snapshot of all keyspaces on the node

nodetool snapshot

A message displays with the name of the snapshot directory:

Requested creating snapshot(s) for [all keyspaces] with snapshot name [1489076973698] and options {skipFlush=false}
Snapshot directory: 1489076973698

Take snapshot of single keyspace with the snapshot name cycling_2017-3-9

nodetool snapshot -t cycling_2017-3-9 cycling

The following output appears:

Requested creating snapshot(s) for [cycling] with snapshot name [cycling_2017-3-9]
Snapshot directory: cycling_2017-3-9

Take snapshot of single keyspace with two tables

The cycling keyspace contains two tables, cyclist_name and upcoming_calendar. The snapshot creates one snapshot directory named cycling_2017-3-9 under each table directory. Each of these directories contains a number of .db files with the data, along with the table schema. For example, from the DSE installation directory:
ls -1 data/cycling/cyclist_name-9e516080f30811e689e40725f37c761d/snapshots/cycling_2017-3-9
manifest.json
mc-1-big-CompressionInfo.db
mc-1-big-Data.db
mc-1-big-Digest.crc32
mc-1-big-Filter.db
mc-1-big-Index.db
mc-1-big-Statistics.db
mc-1-big-Summary.db
mc-1-big-TOC.txt
schema.cql
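Because each table gets its own snapshot directory, it can help to list every directory that belongs to one snapshot name. The following is a minimal sketch that relies only on the data/keyspace_name/table-UID/snapshots/snapshot_name layout; the function name snapshot_dirs is hypothetical.

```shell
# snapshot_dirs DATA_DIR KEYSPACE TAG
# Prints every table-level snapshot directory for TAG in KEYSPACE, following
# the layout data/keyspace_name/table-UID/snapshots/snapshot_name.
snapshot_dirs() {
  data_dir=$1
  keyspace=$2
  tag=$3
  for dir in "$data_dir/$keyspace"/*/snapshots/"$tag"; do
    # Skip the unexpanded pattern when nothing matches.
    [ -d "$dir" ] && printf '%s\n' "$dir"
  done
  return 0
}
```

For example, snapshot_dirs data cycling cycling_2017-3-9 prints one line per table that has a snapshot with that name.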

Take snapshot of multiple (mykeyspace and cycling) keyspaces

nodetool snapshot mykeyspace cycling
Requested creating snapshot(s) for [mykeyspace, cycling] with snapshot name [1391460334889]
Snapshot directory: 1391460334889

Take snapshot of single table

Take a snapshot of only the cyclist_name table in the cycling keyspace.

nodetool snapshot --table cyclist_name cycling
Requested creating snapshot(s) for [cycling] with snapshot name [1391461910600]
Snapshot directory: 1391461910600

The resulting snapshot directory, 1391461910600, is located in data/cycling/cyclist_name-a882dca02aaf11e58c7b8b496c707234/snapshots and contains the data files and the schema of the cyclist_name table.

Take snapshot of multiple tables in different keyspaces

Take a snapshot of the cyclist_name table in the cycling keyspace and the sample_times table in the test keyspace. For the -kt command argument, list tables in a comma-separated list with no spaces.

nodetool snapshot -kt cycling.cyclist_name,test.sample_times
Requested creating snapshot(s) for [cycling.cyclist_name,test.sample_times] with snapshot name [1431045288401]
Snapshot directory: 1431045288401
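When the table list is assembled by a script, a stray space after a comma is an easy mistake to make. The following sketch joins its arguments with commas and no spaces; the helper name join_kt_list is hypothetical.

```shell
# join_kt_list KEYSPACE.TABLE...
# Joins its arguments with commas and no spaces, the form -kt expects.
join_kt_list() {
  # "$*" joins the arguments using the first character of IFS; the
  # subshell keeps the IFS change from leaking into the caller.
  ( IFS=,; printf '%s\n' "$*" )
}
```

The result can be passed directly to the command, for example: nodetool snapshot -kt "$(join_kt_list cycling.cyclist_name test.sample_times)".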