sstablescrub

The sstablescrub utility is an offline version of nodetool scrub. It attempts to remove the corrupted parts while preserving non-corrupted data. Because sstablescrub runs offline, it can correct errors that nodetool scrub cannot. If an SSTable cannot be read due to corruption, it will be left on disk.

If scrubbing drops any rows, the newly written SSTables are marked as unrepaired. If no bad rows are detected, the SSTable keeps its original repairedAt field, which records the time of the repair.
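
To see whether a scrub changed an SSTable's repair status, its metadata can be inspected with the sstablemetadata tool. The sketch below is a minimal helper, not part of sstablescrub. It assumes that sstablemetadata prints a line beginning with "Repaired at:" followed by a numeric value, and that a value of 0 means unrepaired. The function names repaired_at and repair_status are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers: read sstablemetadata output on stdin.
# Assumption: the output contains a line such as "Repaired at: 0".

# Print the numeric repairedAt value, or nothing if the line is absent.
repaired_at() {
    awk -F': *' '/^Repaired at:/ { split($2, parts, " "); print parts[1]; exit }'
}

# Print "repaired", "unrepaired", or "unknown" for the metadata on stdin.
repair_status() {
    value=$(repaired_at)
    if [ -z "$value" ]; then
        echo "unknown"        # no "Repaired at:" line was found
    elif [ "$value" = "0" ]; then
        echo "unrepaired"     # 0 is assumed to mean "never repaired"
    else
        echo "repaired"
    fi
}
```

A typical use would be to pipe the tool's output into the helper, for example `sstablemetadata path_to_sstable_data_file | repair_status`, where path_to_sstable_data_file is a placeholder for the Data.db file of the SSTable in question.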

Procedure

  1. Before using sstablescrub, try rebuilding the tables using nodetool scrub.

    If nodetool scrub does not fix the problem, use this utility.

  2. Shut down the node.
  3. Run the utility:
    • DataStax Enterprise 5.0 Installer Services and package installations:
      $ sstablescrub [options] keyspace table
    • DataStax Enterprise 5.0 Installer No-Services and tarball installations:
      $ cd install_location/resources/cassandra
      $ bin/sstablescrub [options] keyspace table
    • Cassandra package installations:
      $ sstablescrub [options] keyspace table
    • Cassandra tarball installations:
      $ cd install_location
      $ bin/sstablescrub [options] keyspace table
    Options:
    --debug
    Display stack traces.
    -h, --help 
    Display help.
    -m, --manifest-check 
    Only check and repair the leveled manifest, without actually scrubbing the SSTables.
    --reinsert-overflowed-ttl 
    Rewrite SSTables that contain rows with an overflowed expiration time, setting the expiration to the maximum supported date of 2038-01-19T03:14:06+00:00. Rows are rewritten with the original timestamp incremented by 1 ms.
    -s, --skip-corrupted 
    Skip corrupt rows in counter tables.
    -v, --verbose
    Verbose output.
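
The procedure above can be sketched as a small guard wrapper that refuses to scrub while the node appears to be running. This is an illustrative sketch only. The function names node_is_running and offline_scrub, the DRY_RUN variable, and the use of pgrep to look for a CassandraDaemon process are all assumptions and are not part of sstablescrub; the process check in particular may need adjusting for your installation. By default the wrapper only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical wrapper around sstablescrub; all names here are illustrative.

# Succeed if a Cassandra JVM appears to be running on this host.
# Assumption: the process command line contains "CassandraDaemon".
node_is_running() {
    pgrep -f CassandraDaemon >/dev/null 2>&1
}

# Usage: offline_scrub [options] keyspace table
# Prints the command when DRY_RUN is 1 (the default); runs it when DRY_RUN=0.
offline_scrub() {
    if node_is_running; then
        echo "Node still appears to be running; shut it down first." >&2
        return 1
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "sstablescrub $*"
    else
        sstablescrub "$@"
    fi
}
```

For example, after nodetool scrub has failed to fix the problem and the node has been shut down, `offline_scrub --verbose my_keyspace my_table` prints the command line for review, and `DRY_RUN=0 offline_scrub --verbose my_keyspace my_table` runs it. The names my_keyspace and my_table are placeholders.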