When to run anti-entropy repair

When to run anti-entropy repair depends on the characteristics of the cluster. The general guidelines presented here should be tailored to each particular case.

This page assumes an understanding of how repair works. See Anti-entropy repair.

On this page:

When is repair needed?
Guidelines for running routine node repair
Guideline for running repair on a downed node

When is repair needed?

Run repair in these situations:

Routinely to maintain node health.

Schedule regular repairs even if deletions never occur. Note that setting a column to null is a delete.
When bringing a node back into the cluster after a failure.
To update data on a node that holds infrequently read data and therefore does not get read repair.
To update data on a downed node.
To recover missing data or corrupted SSTables. In this case, you must run a non-incremental repair.

Guidelines for running routine node repair

Run full repairs weekly to monthly. Monthly is generally sufficient, but run more frequently if warranted.

Full repair is useful for maintaining data integrity, even if deletions never occur.
Use the parallel and partitioner range options, unless precluded by the scope of the repair.
Migrate off incremental repairs and then run a full repair to eliminate anti-compaction. Anti-compaction is the process of splitting an SSTable into two SSTables, one with repaired data and one with non-repaired data. Anti-compaction has implications for the compaction strategy.

If you are on DataStax Enterprise version 5.1.0-5.1.2, DataStax recommends upgrading to 5.1.3 or higher.
Run repair frequently enough that every node is repaired before reaching the time specified in the gc_grace_seconds setting. If this requirement is met, deleted data is properly handled in the cluster.
To minimize cluster disruption, schedule routine node repair operations during low-usage hours and on one node at a time.
Increase the value of gc_grace_seconds if data is seldom deleted or overwritten. For these tables, changing the setting minimizes impact to disk space and provides a longer interval between repair operations.
Mitigate heavy disk usage by configuring nodetool compaction throttling options (setcompactionthroughput and setcompactionthreshold) before running a repair.
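
The relationship between the repair schedule and gc_grace_seconds described above can be checked with simple arithmetic. The following Python sketch is illustrative only: the function names and the one-day safety margin are hypothetical, and the 864000-second value is assumed to be the gc_grace_seconds default, so substitute the value actually configured on your tables.

```python
from datetime import timedelta

# Assumption: 864000 seconds (10 days) is the default gc_grace_seconds.
# Replace it with the value actually configured on the table.
DEFAULT_GC_GRACE_SECONDS = 864_000


def repair_cycle_fits(cycle, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return True if every node is repaired before gc_grace_seconds elapses.

    `cycle` is the time needed to repair all nodes once, as a timedelta.
    """
    return cycle.total_seconds() < gc_grace_seconds


def min_gc_grace_seconds(cycle, margin=timedelta(days=1)):
    """Return a gc_grace_seconds value that covers the cycle plus a margin.

    The one-day default margin is a hypothetical choice, not a recommendation.
    """
    return int((cycle + margin).total_seconds())


if __name__ == "__main__":
    for days in (7, 30):
        cycle = timedelta(days=days)
        print(days, repair_cycle_fits(cycle), min_gc_grace_seconds(cycle))
```

Under these assumptions, a weekly cycle fits within the assumed default, while a monthly cycle does not unless gc_grace_seconds is raised on the affected tables.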

Guideline for running repair on a downed node

Do not use partitioner range, -pr.
Do not use incremental repair, -inc.
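
If repairs are launched from scripts, the two restrictions above can be enforced before the command runs. The following Python sketch is a hypothetical pre-flight check, not part of nodetool: the helper name is invented for illustration, and it matches only the short flags named on this page, so any long-form spellings of the same options are not covered.

```python
# Flags that the guideline above rules out when repairing a downed node.
# Only the short forms named on this page are listed.
FORBIDDEN_FLAGS = ("-pr", "-inc")


def forbidden_downed_node_flags(args):
    """Return the forbidden flags present in a repair argument list.

    An empty list means the arguments comply with the guideline.
    Hypothetical helper for illustration only.
    """
    return [arg for arg in args if arg in FORBIDDEN_FLAGS]


if __name__ == "__main__":
    # "my_keyspace" is a placeholder keyspace name.
    print(forbidden_downed_node_flags(["repair", "-pr", "my_keyspace"]))
    print(forbidden_downed_node_flags(["repair", "my_keyspace"]))
```

A wrapper script could refuse to start the repair whenever the returned list is not empty.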
