When to run anti-entropy repair
Guidelines for when to run anti-entropy repair on nodes.
When to run anti-entropy repair depends on the characteristics of the cluster. General
guidelines are presented here, and should be tailored to each particular case.
Note: An understanding of how repair works is required to fully understand the
information presented on this page.
When is repair needed?
Run repair in these situations:
- Routinely to maintain node health. Note: Even if deletions never occur, schedule regular repairs. Setting a column to null is a delete.
- When bringing a node back into the cluster after a failure.
- To update data on a node that contains infrequently read data and therefore does not benefit from read repair.
- To update data on a downed node.
- When recovering missing data or corrupted SSTables. You must run non-incremental repair.
Guidelines for running routine node repair
- Run full repairs weekly to monthly. Monthly is generally sufficient, but run more
frequently if warranted. Important: Full repair is useful for maintaining data integrity, even if deletions never occur.
- Use the parallel and partitioner range options, unless precluded by the scope of the repair.
- Migrate off incremental repairs and then run a full repair to eliminate anti-compaction.
Anti-compaction is the process of splitting an SSTable into two SSTables, one with
repaired data and one with non-repaired data. This has compaction strategy implications. Note: If you are on DataStax Enterprise version 5.1.0-5.1.2, DataStax recommends upgrading to 5.1.3 or later.
- Run repair frequently enough that every node is repaired before reaching the time specified in the gc_grace_seconds setting. If this requirement is met, deleted data is properly handled in the cluster.
- Schedule routine node repair operations during low-usage hours and on one node at a time to minimize cluster disruption.
- Increase the value of gc_grace_seconds for tables whose data is seldom deleted or overwritten. For these tables, a larger value has little impact on disk space and allows a longer interval between repair operations.
- Mitigate heavy disk usage by configuring nodetool compaction throttling options (setcompactionthroughput and setcompactionthreshold) before running a repair.
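The relationship between repair frequency, gc_grace_seconds, and one-node-at-a-time scheduling comes down to simple arithmetic. The following Python sketch is illustrative only: the function names, the node names, and the per-node repair duration are hypothetical, and the only value taken from Cassandra itself is the default gc_grace_seconds of 864000 seconds (10 days). It makes a conservative estimate by measuring from the start of one repair on a node to the completion of the next repair on that node.

```python
from datetime import datetime, timedelta

# Cassandra's default gc_grace_seconds (10 days).
DEFAULT_GC_GRACE_SECONDS = 864_000


def repair_interval_is_safe(interval_days, hours_per_node,
                            gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return True if a node is repaired again before gc_grace_seconds elapses.

    Conservative estimate (an assumption of this sketch): the gap is measured
    from the start of one repair to the completion of the next one.
    """
    gap = timedelta(days=interval_days) + timedelta(hours=hours_per_node)
    return gap < timedelta(seconds=gc_grace_seconds)


def staggered_schedule(nodes, first_start, hours_per_node):
    """Give each node its own time slot so only one node repairs at a time."""
    slot = timedelta(hours=hours_per_node)
    return [(node, first_start + i * slot) for i, node in enumerate(nodes)]


if __name__ == "__main__":
    # Hypothetical three-node cluster, repairs starting at 02:00 (low-usage hours).
    for node, start in staggered_schedule(
            ["node1", "node2", "node3"], datetime(2024, 1, 1, 2, 0), 4):
        print(node, start.isoformat())
```

Under these assumptions, a weekly schedule fits within the default gc_grace_seconds, while a monthly schedule fits only if gc_grace_seconds is raised for the affected tables.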
Guidelines for running repair on a downed node
- Do not use partitioner range, -pr.
- Do not use incremental repair, -inc.
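A wrapper script that assembles repair options can enforce these two restrictions before anything is executed. The following Python sketch is a hypothetical helper, not part of nodetool: the function name is invented for illustration, and it recognizes only the short option forms named on this page (-pr and -inc).

```python
# Options this page says not to use when repairing a downed node.
FORBIDDEN_ON_DOWNED_NODE = {
    "-pr": "partitioner range",
    "-inc": "incremental repair",
}


def downed_node_option_problems(options):
    """Return one message per option that violates the downed-node guidelines.

    `options` is a list of repair option strings. Hypothetical helper: it
    checks only the short flags named in this document.
    """
    return [
        f"{opt}: do not use {FORBIDDEN_ON_DOWNED_NODE[opt]} on a downed node"
        for opt in options
        if opt in FORBIDDEN_ON_DOWNED_NODE
    ]


if __name__ == "__main__":
    for problem in downed_node_option_problems(["-pr", "-inc"]):
        print(problem)
```

An empty result means none of the listed options conflicts with the guidelines above; it does not validate the options in any other way.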