sstablescrub

The sstablescrub utility is an offline version of nodetool scrub. It attempts to remove the corrupted parts while preserving non-corrupted data. Because sstablescrub runs offline, it can correct errors that nodetool scrub cannot. If an SSTable cannot be read due to corruption, it is left on disk.

If scrubbing drops rows, the new SSTables are marked as unrepaired. If no bad rows are detected, the SSTable keeps its original repairedAt field, which records the time of the repair.
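One way to check the repaired state after a scrub is to inspect the SSTable metadata. The following is a minimal sketch, not part of the utility: `repair_status` is a hypothetical helper, and it assumes the metadata output (for example from `sstablemetadata`) contains a line of the form `Repaired at: <milliseconds>`, where 0 means unrepaired.

```shell
# Classify an SSTable as repaired or unrepaired from metadata text on stdin.
# ASSUMPTION: the input contains a line "Repaired at: <millis>" and a value
# of 0 means the SSTable is unrepaired. repair_status is a hypothetical name.
repair_status() {
    awk -F': *' '/^Repaired at:/ {
        # Keep only the first token after the label (the millisecond value).
        split($2, parts, " ")
        if (parts[1] + 0 == 0) print "unrepaired"; else print "repaired"
        found = 1
        exit
    }
    END { if (!found) print "unknown" }'
}
```

The helper reads from standard input, so the metadata output for a given Data.db file can be piped into it. The file path depends on the installation and is not shown here.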

Procedure

  1. Before using sstablescrub, try rebuilding the tables using nodetool scrub.

    If nodetool scrub does not fix the problem, use sstablescrub.

  2. Shut down the node.

  3. Run the utility:

    sstablescrub [--debug] [-e arg] [-h] [-j arg] [-m] [-n] [-r] [-s] [-t arg] [-v]
                 keyspace_name table_name

    --debug

    Display stack traces.

    -e, --header-fix argument

    Check SSTable serialization-headers and repair issues. Takes one of the following arguments:

    • validate-only

      Validate serialization-headers only. Do not attempt any repairs and do not continue with the scrub once the validation is complete.

    • validate

      Validate serialization-headers and continue with the scrub once the validation is complete. (Default)

    • fix-only

      Validate and repair only the serialization-headers. Do not continue with the scrub once serialization-header validation and repairs are complete.

    • fix

      Validate and repair serialization-headers, then perform a normal scrub. If serialization-header validation encounters errors, do not repair and do not continue with the scrub.

    • off

      Do not perform serialization-header validation checks.

    -h, --help

    Display help.

    -j, --jobs argument

    Number of SSTables to scrub simultaneously. Defaults to the smaller of the number of available processors and 8.

    -m, --manifest-check

    Only check and repair the leveled manifest, without actually scrubbing the SSTables.

    -n, --no-validate

    Do not validate columns using the column validator.

    -r, --reinsert-overflowed-ttl

    Rewrite SSTables that contain rows with an overflowed expiration time, setting the expiration to the maximum date of 2038-01-19T03:14:06+00:00 and using the original timestamp + 1 (ms).

    -s, --skip-corrupted

    Skip corrupt rows in counter tables.

    -t argument

    Number of days, from 1 to 1000. Examines all deletion times and, if any are at least that many days in the future, changes the timestamp and local-deletion-time to now.

    This is a destructive operation and should only be used under the guidance of DataStax support.

    -v, --verbose

    Verbose output.
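The procedure above can be sketched as a short script. This is an illustration under stated assumptions, not a supported tool: `my_keyspace` and `my_table` are placeholder names, `build_scrub_cmd` and `default_jobs` are hypothetical helpers, and the script only prints the commands instead of running them. The command for shutting down the node depends on the installation, so it is left as a comment.

```shell
#!/bin/sh
# Placeholder names (assumptions); replace with a real keyspace and table.
KEYSPACE="my_keyspace"
TABLE="my_table"

# Build an sstablescrub command line from the documented options.
# $1 = header-fix mode (validate-only, validate, fix-only, fix, off)
# $2 = number of SSTables to scrub simultaneously
build_scrub_cmd() {
    printf 'sstablescrub -e %s -j %s -v %s %s\n' "$1" "$2" "$KEYSPACE" "$TABLE"
}

# Mirror the documented -j default: the smaller of the CPU count and 8.
# $1 = number of available processors
default_jobs() {
    if [ "$1" -lt 8 ]; then echo "$1"; else echo 8; fi
}

# Detect the processor count, falling back to 1 if getconf is unavailable.
cpus="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

# Step 1: try the online scrub first, while the node is still running.
echo "nodetool scrub $KEYSPACE $TABLE"

# Step 2: shut down the node (installation-specific, not shown).

# Step 3: run the offline scrub with the default header-fix mode.
build_scrub_cmd validate "$(default_jobs "$cpus")"
```

Printing the commands first makes it possible to review the options before the node is taken offline. Passing `-j` explicitly is optional, since the utility applies the same default on its own.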
