Estimating remaining repair time
The Repair Service revises its estimate of the completion time without stopping the repair in progress.
opscenterd.log
The location of the opscenterd.log file depends on the type of installation:
- Package installations: /var/log/opscenter/opscenterd.log
- Tarball installations: install_location/log/opscenterd.log
If the Repair Service anticipates that, at the current throughput, it cannot complete a repair cycle within the configured time to completion, it displays a warning message and a revised estimate of the time remaining to complete the repair cycle. The Repair Service does not adjust the configured time to completion; it reports the revised estimate without stopping the repair in progress.
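The idea of projecting a completion time from observed throughput can be illustrated with a minimal Python sketch. This is not the Repair Service's actual algorithm, which is not described here; it assumes simple linear extrapolation over subranges, and the function names estimate_remaining and will_miss_deadline are hypothetical.

```python
from datetime import timedelta


def estimate_remaining(total_subranges, done_subranges, elapsed):
    """Project the time still needed, assuming throughput stays constant.

    Hypothetical helper: linear extrapolation is an assumption, not the
    documented behavior of the Repair Service.
    """
    if done_subranges <= 0:
        return None  # no throughput observed yet, so no estimate
    per_subrange = elapsed / done_subranges
    return per_subrange * (total_subranges - done_subranges)


def will_miss_deadline(total_subranges, done_subranges, elapsed, time_to_completion):
    """Return True if the projected total time exceeds the configured target."""
    remaining = estimate_remaining(total_subranges, done_subranges, elapsed)
    if remaining is None:
        return False
    return elapsed + remaining > time_to_completion


if __name__ == "__main__":
    target = timedelta(days=9)
    elapsed = timedelta(days=3)
    # 250 of 1000 subranges repaired after 3 days of a 9-day target.
    print(estimate_remaining(1000, 250, elapsed))
    print(will_miss_deadline(1000, 250, elapsed, target))
```

In this sketch the configured target is only compared against, never modified, which mirrors the statement that the Repair Service reports a revised estimate rather than adjusting the time to completion.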
When the Repair Service estimates that it will not finish a repair cycle within the configured time_to_completion, it triggers an ALERT in the OpsCenter Event Log. The alert is also visible in opscenterd.log, as well as in the Event Log in the Activities section of the OpsCenter UI. If email alerts or post-url alert notifications are configured, the alert notifications are emailed or posted.
The error_logging_window configuration property controls both how often to log the message and how often to fire the alert if the Repair Service continues to estimate that it will not finish a repair in time.
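The effect of a logging window on a repeated warning can be sketched as follows. This is an illustration only, not OpsCenter code: the class name WindowedAlert is hypothetical, the window is assumed to be expressed in seconds, and the alert is simply printed.

```python
import time


class WindowedAlert:
    """Fire a repeated alert at most once per window (hypothetical sketch)."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self.window_seconds = window_seconds  # assumed unit: seconds
        self._clock = clock                   # injectable for testing
        self._last_fired = None

    def maybe_fire(self, message):
        """Log and alert if the window has elapsed since the last firing."""
        now = self._clock()
        if (self._last_fired is not None
                and now - self._last_fired < self.window_seconds):
            return False  # still inside the window: stay quiet
        self._last_fired = now
        print(f"ALERT: {message}")
        return True
```

Each time the estimate is recomputed and still exceeds the target, the caller would invoke maybe_fire; a single window value then governs both the log message and the alert, as the description of error_logging_window states.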
Parameters
The time_to_completion parameter is the maximum amount of time it takes to repair the entire cluster one time. Typically, set the time to completion to a value lower than the lowest grace seconds before garbage collection (gc_grace_seconds) on your tables. The default for gc_grace_seconds is 10 days (864000 seconds). OpsCenter provides an estimate by checking gc_grace_seconds across all tables and calculating 90% of the lowest value. The default estimate for the time to completion based on the typical grace seconds default is 9 days. For more information about configuring grace seconds, see gc_grace_seconds in the CQL documentation.

The Repair Service might run multiple subrange repairs in parallel, but runs as few as needed to complete within the amount of time specified. The Repair Service always avoids running more than one repair within a single replica set; there is no overlap in repairs between replica sets.
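The 90% calculation described under Parameters can be checked with a short Python sketch. The function name suggested_time_to_completion and the table names are hypothetical, and the dictionary stands in for whatever mechanism OpsCenter uses to read gc_grace_seconds from each table.

```python
SECONDS_PER_DAY = 86400


def suggested_time_to_completion(gc_grace_by_table, percent=90):
    """Return percent% of the lowest gc_grace_seconds, in whole seconds.

    gc_grace_by_table maps a table name to its gc_grace_seconds value.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    if not gc_grace_by_table:
        raise ValueError("at least one table is required")
    lowest = min(gc_grace_by_table.values())
    return lowest * percent // 100


if __name__ == "__main__":
    tables = {
        "ks.users": 864000,    # default gc_grace_seconds: 10 days
        "ks.events": 1728000,  # a table with a longer grace period
    }
    seconds = suggested_time_to_completion(tables)
    print(seconds, "seconds =", seconds // SECONDS_PER_DAY, "days")
```

With the default gc_grace_seconds of 864000 as the lowest value, the result is 777600 seconds, which matches the 9-day default estimate.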