Taking a snapshot
Steps for taking a snapshot on a single node or globally across all nodes.
Snapshots are taken per node using the nodetool snapshot command. To take a
global snapshot, run the nodetool snapshot command with a parallel ssh utility,
such as pssh.
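As a rough sketch of the global case, the helper below builds the per-node command and hands it to pssh. The function names and the hosts.txt file (one node address per line) are hypothetical, and on some distributions the pssh binary is installed as parallel-ssh, so adjust to your environment.

```shell
#!/bin/sh
# Build the per-node snapshot command string.
# The tag and keyspace arguments are placeholders chosen by the caller.
snapshot_cmd() {
    tag=$1
    keyspace=$2
    printf 'nodetool snapshot -t %s %s\n' "$tag" "$keyspace"
}

# Run the same snapshot command on every node listed in a hosts file.
# pssh -h names the host list; -i prints each node's output inline.
run_global_snapshot() {
    hosts_file=$1
    tag=$2
    keyspace=$3
    pssh -h "$hosts_file" -i "$(snapshot_cmd "$tag" "$keyspace")"
}

# Example (assumes pssh is installed and SSH access to every node):
# run_global_snapshot hosts.txt cycling_2017-3-9 cycling
```

Using the same snapshot name on every node makes it easier to locate the matching directories later.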
A snapshot first flushes all in-memory
writes to disk, then creates hard links to the SSTable files for each keyspace. You
must have enough free disk space on the node to accommodate making snapshots of your
data files. A single snapshot requires little disk space because the hard links share
the existing data files rather than copying them. However, snapshots can
cause your disk usage to grow more quickly over time because a snapshot prevents
obsolete data files from being deleted. After the snapshot is complete, you can move
the backup files to another location if needed, or you can leave them in
place.
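The hard-link behavior can be illustrated without Cassandra. The following sketch uses throwaway files in a temporary directory (all paths and file contents are made up for the demonstration) to show that a linked file shares its inode with the original and survives the original's deletion, which is why a snapshot keeps obsolete SSTables on disk.

```shell
#!/bin/sh
# Demonstrate hard-link behavior with stand-in files (no Cassandra required).
workdir=$(mktemp -d)
mkdir -p "$workdir/table/snapshots/demo"

# Stand-in for a live SSTable data file.
printf 'sstable bytes\n' > "$workdir/table/mc-1-big-Data.db"

# A snapshot hard-links the file instead of copying it.
ln "$workdir/table/mc-1-big-Data.db" "$workdir/table/snapshots/demo/mc-1-big-Data.db"

# Both names refer to the same inode, so no extra data blocks are used yet.
live_inode=$(ls -i "$workdir/table/mc-1-big-Data.db" | awk '{print $1}')
snap_inode=$(ls -i "$workdir/table/snapshots/demo/mc-1-big-Data.db" | awk '{print $1}')

# Simulate the live file becoming obsolete and being deleted.
rm "$workdir/table/mc-1-big-Data.db"

# The snapshot link still holds the data, so the disk space is not reclaimed.
snap_content=$(cat "$workdir/table/snapshots/demo/mc-1-big-Data.db")

if [ "$live_inode" = "$snap_inode" ]; then
    echo "live file and snapshot shared one inode"
fi
echo "snapshot content after delete: $snap_content"

rm -rf "$workdir"
```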
Note: Restoring from a snapshot requires the table schema.
Procedure
-
Run nodetool cleanup to ensure that invalid replicas are
removed.
installation_location/bin/nodetool cleanup cycling
-
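Run the nodetool snapshot command, specifying the keyspace and, when connecting
to a remote node, the hostname and JMX port. For example, to snapshot the cycling
keyspace on the local node under the name cycling_2017-3-9: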
nodetool snapshot -t cycling_2017-3-9 cycling
Results
The name of the snapshot directory appears:
Requested creating snapshot(s) for [cycling] with snapshot name [cycling_2017-3-9]
Snapshot directory: cycling_2017-3-9
The snapshot files are created in the
data/keyspace_name/table_name-UUID/snapshots/snapshot_name
directory.
The default location of the data directory is installation_location/data/data.
Data files have the .db extension, and the full CQL statement to create the
table is in the schema.cql file.
ls -1 data/cycling/cyclist_name-9e516080f30811e689e40725f37c761d/snapshots/cycling_2017-3-9
manifest.json
mc-1-big-CompressionInfo.db
mc-1-big-Data.db
mc-1-big-Digest.crc32
mc-1-big-Filter.db
mc-1-big-Index.db
mc-1-big-Statistics.db
mc-1-big-Summary.db
mc-1-big-TOC.txt
schema.cql
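Because each table keeps its own snapshots directory, a named snapshot is spread across several paths. A minimal sketch for locating them, for example before moving the backup files elsewhere, is shown below; the function name list_snapshot_dirs is hypothetical, and the data directory argument should be whatever your installation actually uses.

```shell
#!/bin/sh
# Print every directory belonging to one named snapshot under a data directory.
# Matches paths of the form keyspace_name/table_name-UUID/snapshots/snapshot_name.
list_snapshot_dirs() {
    data_dir=$1
    snapshot_name=$2
    find "$data_dir" -type d -path "*/snapshots/$snapshot_name" | sort
}

# Example (path assumes the default data directory location):
# list_snapshot_dirs installation_location/data/data cycling_2017-3-9
```

Each printed directory can then be copied or archived with the tool of your choice.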