sstablescrub
The sstablescrub utility is an offline version of nodetool scrub. It attempts to remove the corrupted parts while preserving non-corrupted data. Because sstablescrub runs offline, it can correct errors that nodetool scrub cannot. If an SSTable cannot be read due to corruption, it will be left on disk.
If scrubbing drops rows, the new SSTables are marked as unrepaired. However, if no bad rows are detected, the SSTable keeps its original repairedAt field, which denotes the time of the repair.
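One way to check whether an SSTable kept its repaired status after a scrub is to inspect its metadata. The following is a minimal sketch, not an official script. It assumes the metadata output contains a line of the form `Repaired at: <value>`, where `0` means unrepaired; that format may differ between versions. The function name `repaired_status` is hypothetical, and the sample input stands in for real tool output.

```shell
#!/bin/sh
# Hypothetical helper: reads metadata-style output on stdin and reports
# whether the SSTable is marked as repaired.
# Assumption: the output has a line "Repaired at: <millis>", 0 = unrepaired.
repaired_status() {
    # Extract the numeric value from the first "Repaired at:" line, if any.
    value=$(sed -n 's/^Repaired at: *\([0-9][0-9]*\).*/\1/p' | head -n 1)
    if [ -z "$value" ]; then
        echo "unknown"
    elif [ "$value" -eq 0 ]; then
        echo "unrepaired"
    else
        echo "repaired"
    fi
}

# Sample input standing in for real tool output:
printf 'SSTable: sample\nRepaired at: 0\n' | repaired_status
```

In practice, the input would come from a metadata tool such as sstablemetadata run against the SSTable's data file, assuming that tool is available in your installation.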
Procedure
- Before using sstablescrub, try rebuilding the tables using nodetool scrub. If nodetool scrub does not fix the problem, use sstablescrub.
- Shut down the node.
- Run the utility:
sstablescrub [--debug] [-e arg] [-h] [-j arg] [-m] [-n] [-r] [-s] [-v] keyspace_name table_name [--sstable-files arg]
- --debug
- Display stack traces.
- -e, --header-fix argument
- Check SSTable serialization-headers and repair issues. Takes the following arguments:
- validate-only
- Validate serialization-headers only. Do not attempt any repairs and do not continue with the scrub once the validation is complete.
- validate
- Validate serialization-headers and continue with the scrub once the validation is complete. (Default)
- fix-only
- Validate and repair only the serialization-headers. Do not continue with the scrub once serialization-header validation and repairs are complete.
- fix
- Validate and repair serialization-headers and perform a normal scrub. Do not repair and do not continue with the scrub if serialization-header validation encounters errors.
- off
- Do not perform serialization-header validation checks.
- -h, --help
- Display help.
- -j, --jobs
- Number of SSTables to scrub simultaneously. Defaults to the smaller of the number of available processors and 8.
- -m, --manifest-check
- Only check and repair the leveled manifest, without actually scrubbing the SSTables.
- --reinsert-overflowed-ttl
- Rewrites SSTables containing rows with an overflowed expiration time, using the maximum expiration date of 2038-01-19T03:14:06+00:00 and the original timestamp + 1 (ms).
- -s, --skip-corrupted
- Skip corrupt rows in counter tables.
- --sstable-files
- Instead of processing all SSTables in the default data directories, process only the SSTables specified via this option. If a single SSTable file is specified, only that SSTable is processed. If a directory is specified, all SSTables within that directory are processed. Snapshots and backups are not supported with this option.
- -v, --verbose
- Verbose output.
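The procedure above can be sketched as a small shell helper. This is a minimal sketch under stated assumptions, not an official script: the function names `default_jobs` and `scrub_offline_cmd` are hypothetical, and the keyspace and table names (`cycling`, `birthday_list`) are placeholders. How you stop the node depends on your installation, so that step appears only as a comment. The helper prints the command instead of running it, so you can review it first.

```shell
#!/bin/sh
# Hypothetical helper: mirrors the documented default for -j,
# that is, the smaller of the available processor count and 8.
default_jobs() {
    cpus=$1
    if [ "$cpus" -lt 8 ]; then
        echo "$cpus"
    else
        echo 8
    fi
}

# Hypothetical helper: print (do not run) an sstablescrub command line
# that uses only options described in this topic.
scrub_offline_cmd() {
    keyspace=$1
    table=$2
    jobs=$3
    echo "sstablescrub --debug -j $jobs $keyspace $table"
}

# 1. Try the online scrub first:  nodetool scrub cycling birthday_list
# 2. If that does not fix the problem, shut down the node.
# 3. Review, then run, the offline scrub command printed here
#    (4 is a placeholder processor count):
scrub_offline_cmd cycling birthday_list "$(default_jobs 4)"
```

Passing the processor count as an argument keeps the sketch portable; on a real host you would substitute the actual count for the placeholder value.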