Estimating remaining repair time
If the Repair Service anticipates that it cannot complete a repair cycle within the allotted time to completion due to throughput, it displays a warning message and a newly estimated time remaining to complete the repair cycle. The Repair Service does not adjust the configured time to completion; it reports the revised estimate for completion without stopping the repair in progress.
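As a rough illustration of the kind of estimate described above, the following sketch projects the remaining time from observed throughput. The function names, their parameters, and the linear projection are assumptions made for illustration; the Repair Service's actual calculation is not described here.

```python
def estimate_remaining_seconds(total_ranges, repaired_ranges, elapsed_seconds):
    """Project remaining time, assuming the observed throughput stays constant.

    All names here are hypothetical and are not OpsCenter APIs.
    """
    if repaired_ranges <= 0 or elapsed_seconds <= 0:
        return None  # no throughput observed yet, so no estimate is possible
    throughput = repaired_ranges / elapsed_seconds  # subranges per second
    return (total_ranges - repaired_ranges) / throughput


def will_miss_deadline(total_ranges, repaired_ranges, elapsed_seconds,
                       time_to_completion):
    """Return True if the projected finish exceeds time_to_completion (seconds)."""
    remaining = estimate_remaining_seconds(
        total_ranges, repaired_ranges, elapsed_seconds)
    if remaining is None:
        return False
    # The repair keeps running either way; this only decides whether to warn.
    return elapsed_seconds + remaining > time_to_completion
```

In this sketch a missed deadline only produces a warning and a revised estimate, mirroring the behavior described above: the configured time to completion is left unchanged and the repair is not stopped.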
When the Repair Service estimates that it will not finish a repair cycle within the configured time_to_completion, it triggers an ALERT in the OpsCenter Event Log.
The alert is also written to opscenterd.log and appears in the Event Log in the Activities section of the OpsCenter UI.
The location of the opscenterd.log file depends on the type of installation:
- Package installations: /var/log/opscenter/opscenterd.log
- Tarball installations: install_location/log/opscenterd.log
If email alerts or post-url alert notifications are configured, the alert notifications are emailed or posted.
The error_logging_window configuration property controls both how often to log the message and how often to fire the alert if the Repair Service continues to estimate that it will not finish a repair in time.
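The throttling behavior described above can be pictured with a minimal sketch. The AlertThrottle class is hypothetical and is not part of OpsCenter, and treating error_logging_window as a number of seconds is an assumption made for this example.

```python
class AlertThrottle:
    """Fire at most one log message and alert per window (hypothetical helper)."""

    def __init__(self, error_logging_window):
        # Assumed to be expressed in seconds for this sketch.
        self.window = error_logging_window
        self.last_fired = None

    def should_fire(self, now):
        """Return True if enough time has passed since the last alert."""
        if self.last_fired is None or now - self.last_fired >= self.window:
            self.last_fired = now
            return True
        return False
```

For example, with a window of 3600, a first late estimate at time 0 fires, a repeat at 1800 is suppressed, and a repeat at 3600 fires again.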
Parameters
The time_to_completion parameter is the maximum amount of time it takes to repair the entire cluster one time.
Typically, you should set the Time to Completion to a value lower than the lowest grace seconds before garbage collection setting.
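A quick way to sanity-check this guideline is to compare the planned time to completion against the lowest grace seconds value across tables. The sketch below assumes both values are expressed in seconds; the function name, the dictionary layout, and the table names are hypothetical.

```python
def check_time_to_completion(time_to_completion, grace_seconds_by_table):
    """Return (ok, lowest), where ok is True if time_to_completion < lowest.

    grace_seconds_by_table maps a table name to its grace seconds value.
    Both the mapping and this helper are illustrative, not OpsCenter APIs.
    """
    if not grace_seconds_by_table:
        raise ValueError("at least one table setting is required")
    lowest = min(grace_seconds_by_table.values())
    return time_to_completion < lowest, lowest


DAY = 86400  # seconds in a day
settings = {"ks.users": 10 * DAY, "ks.events": 5 * DAY}  # made-up tables
ok, lowest = check_time_to_completion(9 * DAY, settings)
# ok is False here because 9 days is not lower than the 5-day minimum
```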
The Repair Service might run multiple subrange repairs in parallel, but runs as few as needed to complete within the amount of time specified. The Repair Service always avoids running more than one repair within a single replica set; there is no overlap in repairs between replica sets.
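The constraint on parallel repairs can be sketched as a greedy selection. This is an illustration only: it interprets "no overlap" as meaning that concurrently repaired subranges share no replica nodes, and the function name, the input layout, and the max_parallel limit are all assumptions rather than the Repair Service's actual scheduler.

```python
def pick_parallel_repairs(pending, max_parallel):
    """Choose subranges to repair concurrently without sharing replica nodes.

    pending: list of (subrange_id, replica_nodes) tuples, in priority order.
    max_parallel: stand-in for "as few as needed" to finish on time.
    """
    busy_nodes = set()
    chosen = []
    for subrange_id, replica_nodes in pending:
        if len(chosen) >= max_parallel:
            break
        nodes = set(replica_nodes)
        # Skip any subrange whose replicas are already involved in a repair.
        if busy_nodes.isdisjoint(nodes):
            chosen.append(subrange_id)
            busy_nodes |= nodes
    return chosen


pending = [
    ("r1", ["a", "b", "c"]),
    ("r2", ["b", "c", "d"]),  # shares nodes b and c with r1
    ("r3", ["d", "e", "f"]),
    ("r4", ["g", "h", "i"]),
]
```

With this input, asking for two parallel repairs selects r1 and r3, because r2 shares replica nodes with r1.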