Incremental repairs overview
Summary of incremental repair behavior and available configuration options.
Incremental repairs
Incremental repairs repair only data that has not previously been repaired, and only on tables designated and configured for incremental repair.
After incremental repairs have completed for an entire cluster, the Repair Service sleeps for a configured interval. When the incremental threshold of unrepaired data is reached (DSE 5.1 and later), it triggers an incremental repair only on designated tables that meet the criteria. Repairing an entire cluster one time is referred to as a repair cycle.
Incremental repairs run sequentially, one at a time, and do not run in parallel. The Repair Service coordinates incremental repairs with subrange repairs. If the max_parallel_repairs option is set to 1, subrange repairs and incremental repairs alternate, running one task at a time: an incremental repair waits for the current subrange repair to complete before starting, and vice versa. This can be helpful for isolating repair issues.
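The alternation described above can be pictured with a minimal Python sketch. It is not the Repair Service implementation; the function name and the task labels are hypothetical, and the sketch assumes that when one queue runs out, the remaining tasks of the other simply continue one at a time.

```python
from collections import deque


def alternate_repairs(subrange_tasks, incremental_tasks):
    """Yield tasks one at a time, alternating between the two kinds.

    Hypothetical model of max_parallel_repairs = 1: only one task is
    in flight at any moment, and the two queues take turns.
    """
    queues = deque([deque(subrange_tasks), deque(incremental_tasks)])
    while queues:
        current = queues.popleft()
        if current:
            # Run (here: yield) a single task, then give the other queue a turn.
            yield current.popleft()
            queues.append(current)
        # An empty queue is dropped and never rescheduled.


if __name__ == "__main__":
    for task in alternate_repairs(["sub-1", "sub-2", "sub-3"], ["inc-1"]):
        print(task)
```

Because only one task is yielded at a time, a failure can be attributed to a single repair, which is the isolation benefit mentioned above.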
Restricting by datacenter and racks
Specify the datacenters and racks to which incremental repairs are restricted using the incremental_repair_datacenters option. Restricting repairs by datacenter or rack improves repair performance in a multi-DC cluster with keyspaces replicated across datacenters. Repairs complete faster because there are fewer repair tasks to process.
Including tables
Specify the tables to include in incremental repairs with the incremental_repair_tables option. The OpsCenter.settings and OpsCenter.backup_reports tables are included by default.
Threshold of unrepaired data
The Repair Service repairs a table designated as a candidate for incremental repair only if the amount of unrepaired data is above a threshold, which is 1 KB by default. Configure the threshold with the incremental_threshold option.
When the DSE version is 5.1 or later, the Repair Service takes the extra step of excluding any tables listed in the incremental_repair_tables option that do not meet the threshold criteria. When an incremental repair ends, the Repair Service checks every table in the incremental repair tables list against the threshold, and starts the next repair only on tables that qualify. The threshold option allows for more selective incremental repairs.
If the DSE version is earlier than 5.1, the Repair Service always repairs every table configured for incremental repairs.
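The version-dependent selection can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not OpsCenter code: the function name, the dictionary of per-table unrepaired sizes, and the tuple form of the DSE version are hypothetical, and the sketch assumes the 1 KB default corresponds to 1024 bytes and that "above" means strictly greater than.

```python
def tables_to_repair(unrepaired_bytes, threshold_bytes=1024, dse_version=(5, 1)):
    """Return the configured tables that would be repaired in the next pass.

    unrepaired_bytes: mapping of table name to its unrepaired data size in
    bytes (hypothetical input; the real service gathers this itself).
    """
    if dse_version < (5, 1):
        # Earlier than DSE 5.1: every configured table is always repaired.
        return list(unrepaired_bytes)
    # DSE 5.1 and later: keep only tables above the threshold.
    return [table for table, size in unrepaired_bytes.items() if size > threshold_bytes]


if __name__ == "__main__":
    sizes = {"OpsCenter.settings": 512, "OpsCenter.backup_reports": 4096}
    print(tables_to_repair(sizes))                      # threshold applied
    print(tables_to_repair(sizes, dse_version=(5, 0)))  # threshold ignored
```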
Ignore incremental errors threshold
The threshold for ignoring errors before alerting defaults to 20. Configure the threshold with the incremental_err_alert_threshold option to adjust the number of incremental repair errors tolerated before an alert is raised in your environment.
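As a rough illustration, the check might look like the following Python sketch. The function name is hypothetical, and the sketch assumes that errors up to and including the threshold are ignored and that an alert is raised only once the count exceeds it; the exact boundary behavior is not stated here.

```python
def should_alert(error_count, err_alert_threshold=20):
    """Return True once incremental repair errors exceed the tolerated count.

    Assumption: the threshold is the number of errors to ignore, so the
    alert fires on the first error beyond it.
    """
    return error_count > err_alert_threshold


if __name__ == "__main__":
    print(should_alert(20))  # still within the tolerated count
    print(should_alert(21))  # one error past the default threshold
```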
Sleep between incremental repair cycles
After completing all incremental repairs, the Repair Service suspends incremental repairs for a fixed interval (one hour by default) before starting again. Configure the sleep time with the incremental_sleep option.
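The cycle-then-sleep pattern can be modeled with a short Python sketch. It is a simplified illustration, not the Repair Service scheduler: the function and parameter names are hypothetical, the one-hour default is expressed as 3600 seconds on the assumption that the interval is measured in seconds, and the sleep function is injectable so the loop can be exercised without waiting.

```python
import time


def run_incremental_cycles(run_cycle, cycles, sleep_seconds=3600, sleep=time.sleep):
    """Run a fixed number of repair cycles, pausing between consecutive cycles.

    run_cycle: callable that performs one full pass of incremental repairs
    (hypothetical stand-in for the real work).
    """
    for n in range(cycles):
        run_cycle()
        if n < cycles - 1:
            # Suspend incremental repairs before the next cycle begins.
            sleep(sleep_seconds)


if __name__ == "__main__":
    run_incremental_cycles(lambda: print("repair cycle"), cycles=2, sleep_seconds=1)
```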
Incremental repair progress
Observe the progress of incremental repairs using the SSTable repaired metrics available in the dashboard graphs. See Tracking repaired SSTables for incremental repairs.
The Repair Status tab displays a progress bar when an incremental repair is running.
For more information and configuration examples, see Configuring incremental repairs.