nodetool scrub

Creates a snapshot and then rebuilds SSTables for one or more tables on a node. If possible, use nodetool upgradesstables instead of scrub.

Scrub automatically discards broken data and removes any tombstones that have exceeded the table's grace period (gc_grace_seconds). If partition key values do not match the column data type, the partition is considered corrupt and the process stops automatically.

The sstablescrub utility is an offline version of nodetool scrub.

For LeveledCompactionStrategy (LCS), nodetool scrub resets all SSTables back to Level 0 and requires recompaction of all SSTables.

Synopsis

nodetool [<connection_options>] scrub
[-j <num_jobs>] [-n] [-ns] [-o] [-r] [-s]
[--] [<keyspace_name> <table_name> [<table_name> ...]]

Syntax legend

Each syntax convention below is followed by its description.

Italic, bold, or < >

Syntax diagrams and code samples use one or more of these styles to mark placeholders for variable values. Replace placeholders with a valid option or your own user-defined value.

In CQL statements, it is a requirement to use angle brackets to enclose data types in a set, list, map, or tuple. Separate the data types with a comma. For example: '<<datatype1>,<datatype2>>'

In Search CQL statements, use angle brackets to identify the entity and literal value to use when overwriting the XML element in the schema and solrconfig files, such as @<xml_entity>='<xml_entity_type>'.

[ ]

Square brackets surround optional command arguments. Do not type the square brackets.

( )

Parentheses identify a group to choose from. Do not type the parentheses.

|

A pipe separates alternative elements. Type any one of the elements. Do not type the pipe.

...

Indicates that you can repeat the syntax element as often as required.

'

Use single quotation marks to surround literal strings in CQL statements. Use single quotation marks to preserve uppercase.

For Search CQL only: Single quotation marks surround an entire XML schema declaration, such as '<<schema> ... </schema>>'

{ }

Map collection. Curly braces enclose maps ({ <key_datatype>:<value_datatype> }) or key value pairs ({ <key>:<value> }). A colon separates the key and the value.

;

Ends a CQL statement.

--

Separate command line options from command arguments with two hyphens. This syntax is useful when arguments might be mistaken for command line options.

Options

If an option has a short and long form, both forms are given, separated by a comma.

-h, --host hostname

The hostname or IP address of a remote node or nodes. When omitted, the default is the local machine.

-p, --port jmx_port

The JMX port number.

-pw, --password jmxpassword

The JMX password for authenticating with secure JMX. If a password is not provided, you are prompted to enter one.

-pwf, --password-file jmx_password_filepath

The filepath to the file that stores JMX authentication credentials.

-u, --username jmx_username

The username for authenticating with secure JMX.

-j, --jobs num_jobs

The number of SSTables to scrub simultaneously. Set to 0 to use all available compaction threads.

Default: 2

keyspace_name

The keyspace name.

-n, --no-validate

Do not validate columns using the column validator.

-ns, --no-snapshot

Do not take a snapshot before scrubbing. When this option is omitted (the default), scrub first takes a snapshot of each table to be scrubbed.

Default: false

-o, --overwrite-ttl

Adjust the time-to-live setting while cleaning up SSTables.

Takes the following arguments:

  • NONE: Do not overwrite the TTL values in the data.

  • NO_TTL: Remove any and all time-to-live values from the data to scrub.

    Use this option in cases such as when the TTL:

    • is set for immediate expiration as you prepare to restore backed up SSTables.

    • is wrong after the insertion of data.

    For example:

    1. Disable compaction on the node with nodetool disableautocompaction.

      nodetool disableautocompaction

      This step is crucial because otherwise compaction might permanently remove the data.

    2. Copy the SSTables containing entries with overflowed expiration TTL to the data directory.

    3. Run nodetool import in a live node to load the recovered SSTables.

    4. Run nodetool scrub, substituting your keyspace and table names:

      nodetool scrub --overwrite-ttl NO_TTL KEYSPACE_NAME TABLE_NAME

    5. Verify that nodetool scrub removed TTL assignments.

    6. Verify that nodetool scrub recovered the missing entries.

    7. Re-enable compactions.

  • REINSERT_OVERFLOWED_TTL: This argument is functionally the same as the -r, --reinsert-overflowed-ttl option.

Default: NONE
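
The numbered NO_TTL steps above can be sketched as a short checklist script. This is a minimal sketch, not a supported tool: the function name no_ttl_plan and the cycling keyspace and cyclist_name table are hypothetical, and the function only prints the nodetool commands in order so that they can be reviewed and run by hand. The copy, import, and verification steps depend on local paths and data, so they appear only as reminders.

```shell
#!/bin/sh
# Hypothetical helper: print the nodetool commands for the NO_TTL recovery
# steps, in order. Nothing is executed against the node.
# Usage: no_ttl_plan <keyspace_name> <table_name>
no_ttl_plan() {
  cat <<EOF
nodetool disableautocompaction
# steps 2-3: copy the affected SSTables, then load them with nodetool import
nodetool scrub --overwrite-ttl NO_TTL $1 $2
# steps 5-6: verify the removed TTL assignments and the recovered entries
nodetool enableautocompaction
EOF
}

# Example with a hypothetical keyspace and table.
no_ttl_plan cycling cyclist_name
```

Compaction is re-enabled last, after verification, for the reason given in step 1: data that still carries an expired TTL can be removed permanently once compaction runs.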

-r, --reinsert-overflowed-ttl

Rewrite rows whose expiration date overflowed because of CASSANDRA-14092, using the maximum supported expiration date of 2038-01-19T03:14:06+00:00. Each row is rewritten with its original timestamp incremented by one millisecond so that it supersedes any tombstone that might have been generated during compaction of the affected rows.
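
The maximum expiration date quoted above is one second short of the largest value that fits in a signed 32-bit count of seconds since the Unix epoch. The following sketch checks that arithmetic. Treating the limit as 2^31 - 2 seconds is an assumption inferred from the date itself, and the date invocation uses GNU syntax.

```shell
#!/bin/sh
# Assumed limit: one second below the signed 32-bit maximum (2^31 - 1).
max_expiration=$(( 2147483647 - 1 ))

# Convert the epoch seconds to a UTC timestamp (GNU date syntax).
# On BSD or macOS, replace -d "@$max_expiration" with -r "$max_expiration".
date -u -d "@$max_expiration" '+%Y-%m-%dT%H:%M:%S+00:00'
# prints 2038-01-19T03:14:06+00:00
```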

-s, --skip-corrupted

Skip corrupted partitions even when scrubbing counter tables.

Default: false

table_name

One or more table names, separated by a space.
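
As an illustration of how the options and arguments fit together, the following sketch assembles a scrub command line from a job count, a keyspace, and one or more tables. The helper name scrub_cmd and the cycling keyspace and its table names are hypothetical. The function only prints the command, so the result can be checked before it is run on a node.

```shell
#!/bin/sh
# Hypothetical helper: print a nodetool scrub command line.
# Usage: scrub_cmd <num_jobs> <keyspace_name> <table_name> [<table_name> ...]
scrub_cmd() {
  num_jobs=$1
  shift
  # "--" separates the options from the keyspace and table arguments.
  echo "nodetool scrub -j $num_jobs -- $*"
}

# Scrub two tables in a hypothetical keyspace, one SSTable at a time.
scrub_cmd 1 cycling cyclist_name cyclist_category
# prints: nodetool scrub -j 1 -- cycling cyclist_name cyclist_category
```

Passing 0 as the job count produces a command that uses all available compaction threads, as described for -j.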

© Copyright IBM Corporation 2026

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries. Kubernetes is the registered trademark of the Linux Foundation.
