Migrating to incremental repairs
To start using incremental repairs, migrate the SSTables on each node.
Repairing SSTables using anti-entropy repair is a necessary part of Cassandra maintenance. A full repair of all SSTables on a node takes a lot of time and is resource-intensive. You can manage repairs with less service disruption using incremental repair. Incremental repair consumes less time and resources because it skips SSTables that are already marked as repaired.
Incremental repair works equally well with any compaction scheme: Size-Tiered Compaction (STCS), Date-Tiered Compaction (DTCS), Time-Window Compaction (TWCS), or Leveled Compaction (LCS).
In Cassandra 3.0 and later, switching from full repair to incremental repair is easier than before. However, the first system-wide incremental repair can take a long time, as Cassandra recompacts all SSTables according to the chosen compaction scheme. You can make this process less disruptive by migrating the cluster to incremental repair one node at a time.
Overview of the procedure
- Disable autocompaction on the node.
- Run a full, sequential repair.
- Stop the node.
- Set the repairedAt metadata value for each SSTable that existed before you disabled compaction.
- Restart Cassandra on the node.
- Re-enable autocompaction on the node.
Prerequisites
Before you run a full repair on the node, list its SSTables. The existing SSTables may not be changed by the repair process, and the incremental repair process you run later will not process these SSTables unless you set the repairedAt value for each SSTable (see Step 4 below).
You can find the node's SSTables in one of the following locations:
- Cassandra package installations: /var/lib/cassandra/data
- Cassandra tarball installations: install_location/data/data
The name of each SSTable data file has the following format:
<version_code>-<generation>-<format>-Data.db
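One way to capture the list of existing SSTables before the repair is a small shell helper like the sketch below. The function name list_sstables and the output file name are illustrative assumptions, not part of Cassandra. The sketch also assumes that snapshot and incremental-backup copies live in snapshots and backups subdirectories and should be left out of the list.

```shell
#!/bin/sh
# Sketch only: list_sstables is a hypothetical helper, not a Cassandra tool.
# It writes the path of every SSTable data file under a data directory to a
# text file, one path per line.
list_sstables() {
    data_dir="$1"   # for example /var/lib/cassandra/data (package installations)
    out_file="$2"   # for example SSTable-names.txt

    # SSTable data files end in -Data.db. Files under snapshots/ and backups/
    # are skipped (assumption: those copies should not be marked as repaired).
    find "$data_dir" -type f -name '*-Data.db' \
        ! -path '*/snapshots/*' ! -path '*/backups/*' | sort > "$out_file"
}

# Example call (paths are assumptions; adjust them to your installation):
# list_sstables /var/lib/cassandra/data SSTable-names.txt
```

A file produced this way could later be passed to sstablerepairedset with the -f option described in Step 4.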
Migrating the node to incremental repair
- Disable autocompaction on the node. From the install_directory:
bin/nodetool disableautocompaction
Running this command without parameters disables autocompaction for all keyspaces. For details, see nodetool disableautocompaction.
- Run the default full, sequential repair. From the install_directory:
bin/nodetool repair
Running this command without parameters starts a full sequential repair of all SSTables on the node. This may take a substantial amount of time. For details, see nodetool repair.
- Stop the node.
- Set the repairedAt metadata value for each SSTable that existed before you disabled compaction. Use sstablerepairedset. To mark a single SSTable, SSTable-example-Data.db:
sudo bin/sstablerepairedset --really-set --is-repaired SSTable-example-Data.db
To do this as a batch process using a text file of SSTable names:
sudo bin/sstablerepairedset --really-set --is-repaired -f SSTable-names.txt
Note: The value of the repairedAt metadata is the timestamp of the last repair. The sstablerepairedset command applies the current date/time. To check the value of the repairedAt metadata for an SSTable, use:
bin/sstablemetadata example-keyspace-SSTable-example-Data.db | grep "Repaired at"
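- Restart the node.
- Re-enable autocompaction on the node. From the install_directory:
bin/nodetool enableautocompaction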
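To confirm that Step 4 took effect across many SSTables, the "Repaired at" check can be scripted. The sketch below is one possible approach, not an official tool: the helper names repaired_at_value and is_marked_repaired are hypothetical, and it assumes that sstablemetadata prints a line of the form "Repaired at: <number>" in which 0 means the SSTable is not marked as repaired.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and read sstablemetadata
# output on standard input.

# Print the numeric value from the first "Repaired at:" line, if any.
repaired_at_value() {
    sed -n 's/^ *Repaired at: *\([0-9][0-9]*\).*$/\1/p' | head -n 1
}

# Succeed only when a "Repaired at" value is present and is not 0
# (assumption: 0 means the SSTable is unrepaired).
is_marked_repaired() {
    value=$(repaired_at_value)
    [ -n "$value" ] && [ "$value" != "0" ]
}

# Example loop (assumes the tools are under bin/ and that SSTable-names.txt
# holds one SSTable data file path per line):
# while IFS= read -r sstable; do
#     bin/sstablemetadata "$sstable" | is_marked_repaired \
#         || echo "not marked as repaired: $sstable"
# done < SSTable-names.txt
```

Any file the loop reports can be marked again with sstablerepairedset while the node is stopped.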
What's next
After you have migrated all nodes, you will be able to run incremental repairs using nodetool repair with the -inc option. For details, see https://www.datastax.com/blog/2014/02/more-efficient-repairs-21.