sstablescrub

The sstablescrub utility is an offline version of nodetool scrub. It attempts to remove the corrupted parts of SSTables while preserving the uncorrupted data. Because sstablescrub runs offline, it can correct errors that nodetool scrub cannot. If an SSTable cannot be read because of corruption, it is left on disk.

If scrubbing drops rows, the new SSTables are marked as unrepaired. If no bad rows are detected, the SSTable keeps its original repairedAt field, which records the time of the repair.

Cassandra tools directory

The default location of the Cassandra tools depends on the type of installation:
  • Package installations: /usr/bin/
  • Tarball installations: installation_location/resources/cassandra/tools/bin
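
As a quick check of the two default locations listed above, the following minimal sketch looks for an executable sstablescrub in each. The function name find_sstablescrub and the example path $HOME/dse are hypothetical; substitute your actual tarball installation location.

```shell
# find_sstablescrub is a hypothetical helper, not part of Cassandra or DSE.
# Argument 1: root of a tarball installation (placeholder if omitted).
find_sstablescrub() {
    install_root="${1:-installation_location}"
    # Check the package location first, then the tarball location.
    for dir in /usr/bin "$install_root/resources/cassandra/tools/bin"; do
        if [ -x "$dir/sstablescrub" ]; then
            printf '%s\n' "$dir/sstablescrub"
            return 0
        fi
    done
    return 1
}

# Example call; $HOME/dse is an assumed installation location.
find_sstablescrub "$HOME/dse" || echo "sstablescrub not found in the default locations"
```

The function prints the first match and returns a non-zero status when neither location contains the tool.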

Procedure

  1. Before using sstablescrub, try rebuilding the tables using nodetool scrub.

    If nodetool scrub does not fix the problem, use this utility.

  2. Shut down the node.
  3. Run the utility:
    sstablescrub [options] keyspace table
    SSTable tools are located in the Cassandra tools directory.
    Tip: SSTable tools work offline from the DataStax Enterprise database. If you need to pass a JVM parameter, specify it on the command line. For example, to change the max heap size:
    MAX_HEAP=2g sstabletoolname
    Table 1. Options
    Flag  Option                     Description
          --debug                    Display stack traces.
    -h    --help                     Display help.
    -m    --manifest-check           Only check and repair the leveled manifest, without actually scrubbing the SSTables.
          --reinsert-overflowed-ttl  Rewrite SSTables containing rows with overflowed expiration time with the maximum expiration date of 2038-01-19T03:14:06+00:00 using the original timestamp + 1 (ms).
    -s    --skip-corrupted           Skip corrupt rows in counter tables.
    -v    --verbose                  Verbose output.
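
Step 3 of the procedure can be sketched as a small wrapper. This is an illustration only: run_scrub and the DRY_RUN variable are hypothetical names, my_keyspace and my_table are placeholders, and the sketch assumes that nodetool scrub has already been tried and the node is shut down. It uses only the MAX_HEAP setting and the options documented above.

```shell
# run_scrub is a hypothetical helper, not part of Cassandra or DSE.
# Usage: run_scrub keyspace table [sstablescrub options...]
# Set DRY_RUN=1 to print the command instead of executing it.
run_scrub() {
    if [ "$#" -lt 2 ]; then
        echo "usage: run_scrub keyspace table [options]" >&2
        return 2
    fi
    keyspace="$1"
    table="$2"
    shift 2                           # what remains are sstablescrub options
    set -- "$@" "$keyspace" "$table"  # options first, then keyspace and table
    heap="${MAX_HEAP:-2g}"            # 2g mirrors the tip; adjust as needed

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "MAX_HEAP=$heap sstablescrub $*"
    else
        MAX_HEAP="$heap" sstablescrub "$@"
    fi
}

# Preview the command for a placeholder table without running it.
DRY_RUN=1 run_scrub my_keyspace my_table --skip-corrupted --verbose
```

Previewing the command first makes it easier to confirm the keyspace, table, and options before running the utility against SSTables on disk.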