Migrating to incremental repairs
To start using incremental repairs, migrate the SSTables on each node.
Repairing SSTables using anti-entropy repair is a necessary part of Cassandra maintenance. A full repair of all SSTables on a node takes a lot of time and is resource-intensive. You can manage repairs with less service disruption using incremental repairs. Incremental repair consumes less time and fewer resources because it skips SSTables that are already marked as repaired.
Incremental repair works equally well with any compaction scheme: Size-Tiered Compaction (STCS), Date-Tiered Compaction (DTCS), or Leveled Compaction (LCS).
However, Cassandra's default is full repair: a new SSTable is created without metadata that identifies its repaired state. Before you can start using incremental repairs, you must add this marker to each SSTable on each node in the cluster. Follow these instructions to migrate the cluster to incremental repair gradually, one node at a time.
Overview of the procedure
- Disable autocompaction on the node.
- Run the default full, sequential repair.
- Stop the node.
- Mark as repaired all the SSTables that existed before you disabled compaction.
- Restart Cassandra on the node.
- Re-enable autocompaction on the node.
Prerequisites
Before you run a full repair on the node, list its SSTables. The existing SSTables may not be changed by the repair process, and the incremental repair process you run later will not process these SSTables unless you mark each one as repaired (see Step 4 below).
You can find the node's SSTables in one of the following locations:
- Package installations: /var/lib/cassandra
- Tarball installations: install_location/data/data
The SSTable data files are named in this format:
<version_code>-<generation>-<format>-Data.db
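The list of existing SSTables can be saved to a text file, which is also the input that the batch form of sstablerepairedset expects. The following is a minimal sketch rather than part of the documented procedure: the function name list_sstables is hypothetical, and skipping the snapshots and backups subdirectories is an assumption made here to avoid listing duplicate copies of the same SSTable.

```shell
#!/bin/sh
# list_sstables DATA_DIR OUT_FILE   (hypothetical helper, not a Cassandra tool)
# Writes the path of every SSTable data file found under DATA_DIR to OUT_FILE,
# one path per line, sorted.
list_sstables() {
    data_dir=$1
    out_file=$2
    # Match only the *-Data.db component of each SSTable.
    # Assumption: files under snapshots/ and backups/ are copies, so skip them.
    find "$data_dir" -type f -name '*-Data.db' \
        ! -path '*/snapshots/*' ! -path '*/backups/*' | sort > "$out_file"
}
```

For a package installation this might be called as list_sstables /var/lib/cassandra SSTable-names.txt. Run it before disabling autocompaction and starting the repair, so the file records only the SSTables that existed beforehand.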
Migrating the node to incremental repair
- Disable autocompaction on the node.
From the install_directory:
$ bin/nodetool disableautocompaction
Running this command without parameters disables autocompaction for all keyspaces. For details, see nodetool disableautocompaction.
- Run the default full, sequential repair.
From the install_directory:
$ bin/nodetool repair
Running this command without parameters starts a full sequential repair of all SSTables on the node. This may take a substantial amount of time. For details, see nodetool repair.
- Stop the node.
- Mark as repaired all the SSTables that were created before you disabled compaction.
Use sstablerepairedset. The tool makes no change unless --really-set is given as the first argument. To mark a single SSTable SSTable-example-Data.db:
$ sudo bin/sstablerepairedset --really-set --is-repaired SSTable-example-Data.db
To do this as a batch process using a text file of SSTable names:
$ sudo bin/sstablerepairedset --really-set --is-repaired -f SSTable-names.txt
Note: The value of the repaired metadata is the timestamp of the last repair. The sstablerepairedset command applies the current date/time. To check the value of the repaired metadata for an SSTable, use:
$ bin/sstablemetadata example-keyspace-SSTable-example-Data.db | grep "Repaired at"
- Restart the node.
- Re-enable autocompaction on the node.
From the install_directory:
$ bin/nodetool enableautocompaction
Running this command without parameters enables autocompaction for all keyspaces and tables. For details, see nodetool enableautocompaction.
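Before moving on to the next node, it can be useful to confirm that every SSTable in the text file was actually marked. The sketch below loops over the file and runs the sstablemetadata check from Step 4 on each entry. It assumes that sstablemetadata reports an unmarked SSTable as "Repaired at: 0". The function name print_unrepaired and the SSTABLEMETADATA variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Hypothetical override for the tool location; defaults to the path used above.
SSTABLEMETADATA=${SSTABLEMETADATA:-bin/sstablemetadata}

# print_unrepaired NAMES_FILE   (hypothetical helper, not a Cassandra tool)
# Prints each SSTable listed in NAMES_FILE whose "Repaired at" value is 0.
# Assumption: a value of 0 means the SSTable has not been marked as repaired.
print_unrepaired() {
    names_file=$1
    while IFS= read -r sstable; do
        # Skip blank lines in the list.
        [ -n "$sstable" ] || continue
        # Take the last field of the "Repaired at" line; redirect stdin so the
        # tool cannot consume the rest of the list.
        repaired_at=$("$SSTABLEMETADATA" "$sstable" < /dev/null |
            awk '/Repaired at/ {print $NF}')
        if [ "$repaired_at" = "0" ]; then
            printf '%s\n' "$sstable"
        fi
    done < "$names_file"
}
```

Calling print_unrepaired SSTable-names.txt from the install_directory should print nothing if every listed SSTable carries a repair timestamp.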
What's next
After you have migrated all nodes, you will be able to run incremental repairs using nodetool repair with the -inc parameter. For details, see https://www.datastax.com/blog/2014/02/more-efficient-repairs-21.