nodetool scrub

Creates a snapshot and then rebuilds SSTables on a node. If possible, use nodetool upgradesstables instead of scrub.

Scrub automatically discards broken data and removes any tombstoned rows that have exceeded the table's gc_grace_seconds period. If partition key values do not match the column data type, the partition is considered corrupt and the process stops, unless --skip-corrupted is specified.

When a table uses the leveled compaction strategy (LCS), scrub resets all of its SSTables to level 0, so every SSTable must be recompacted.

Synopsis

nodetool [<connection_options>] scrub [(-j <jobs> | --jobs <jobs>)] \
[(-n | --no-validate)] [(-ns | --no-snapshot)] \
[--reinsert-overflowed-ttl] \
[(-s | --skip-corrupted)] \
[--] [<keyspace> <tables>...]
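
As an illustration of how the synopsis maps to a concrete command line, the following sketch assembles a scrub command and prints it rather than running it. The helper name scrub_cmd, the keyspace cycling, and the tables cyclist_name and cyclist_races are hypothetical and used only for this example.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints a nodetool scrub command line
# instead of executing it, so it can be reviewed first.
scrub_cmd() {
    njobs=$1
    shift
    # "--" ends the option list; everything after it is a keyspace or table name.
    echo "nodetool scrub --jobs $njobs -- $*"
}

# One table in the hypothetical "cycling" keyspace, one SSTable at a time.
scrub_cmd 1 cycling cyclist_name

# Two tables in the same keyspace, given as a space-separated list.
scrub_cmd 1 cycling cyclist_name cyclist_races
```

Piping a printed line to sh would run it on the local node. Printing first is a deliberate choice here, because scrub rewrites SSTables and is worth reviewing before it runs.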

Connection options

Connection options specify how to connect and authenticate for all nodetool commands:

Short   Long              Description

-h      --host            Hostname or IP address.

-p      --port            Port number.

-pwf    --password-file   Password file path.

-pw     --password        Password.

-u      --username        Username.

--                        Separates command parameters from a list of options.

  • If a username and password for RMI authentication are set explicitly in the cassandra-env.sh file for the host, then you must specify credentials.

  • The repair and rebuild commands can affect multiple nodes in the cluster.

  • Most nodetool commands operate on a single node in the cluster if -h is not used to identify one or more other nodes. If the node from which you issue the command is the intended target, you do not need the -h option to identify the target; otherwise, for remote invocation, identify the target node, or nodes, using -h.
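
For remote invocation, the connection options are placed before the command name, as the synopsis shows. The sketch below prints such a command line for review. Every value in it is an assumption made for illustration: the helper name remote_scrub_cmd, the address 10.0.0.12, port 7199 (commonly the Cassandra JMX port, but check your node's configuration), the username jmx_admin, the password file path, and the keyspace cycling.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints a scrub command that targets a
# remote node, using a password file instead of a password on the command line.
remote_scrub_cmd() {
    host=$1; port=$2; user=$3; pwfile=$4
    shift 4
    # Connection options come before "scrub"; keyspace and tables come after it.
    echo "nodetool -h $host -p $port -u $user -pwf $pwfile scrub $*"
}

# All arguments below are placeholder values, not defaults taken from this page.
remote_scrub_cmd 10.0.0.12 7199 jmx_admin /path/to/jmx.password cycling
```

Using -pwf rather than -pw keeps the password out of the shell history and the process list, which is why the sketch prefers it.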

Scrub parameters

Use the following parameters with the scrub command:

-j | --jobs jobs

Number of SSTables to scrub simultaneously. Zero (0) uses all available compaction threads.

Default: 2.

-n | --no-validate

Suppresses validation of columns.

Default: Validate all columns.

-ns | --no-snapshot

Suppresses creation of snapshot.

Default: Create a snapshot before rebuilding SSTables.

--reinsert-overflowed-ttl

Rewrites SSTables that contain rows with an overflowed expiration time, setting the expiration to the maximum date of 2038-01-19T03:14:06+00:00 and using the original timestamp + 1 (ms).
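
The maximum expiration date quoted above is one second before the largest value that fits in a signed 32-bit count of seconds since the Unix epoch. That reading is inferred from the arithmetic, not stated on this page. A minimal sketch that reproduces the date, assuming GNU date is available (the function name ttl_cap_date is hypothetical):

```shell
#!/bin/sh
# Reproduces the documented maximum expiration date from epoch seconds.
ttl_cap_date() {
    # 2^31 - 1: largest signed 32-bit value, as seconds since the Unix epoch.
    max_int32=2147483647
    # The documented maximum is one second earlier.
    cap=$((max_int32 - 1))
    # "-d @SECONDS" and "%:z" are GNU date extensions.
    date -u -d "@$cap" '+%Y-%m-%dT%H:%M:%S%:z'
}

ttl_cap_date   # prints 2038-01-19T03:14:06+00:00
```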

-s | --skip-corrupted

Forces scrub to skip corrupt partitions and continue. Corrupt partitions have a column value that does not match the column data type. Logs skipped partitions in the system.log.

Default: Stop scrubbing if a corrupted partition is detected.

Skipping corrupted partitions on tables with counter columns results in under-counting.

keyspace_name table_name [...]

Identifies the keyspace and targets specific tables using a space separated list.

Default: Include all keyspaces and tables on the node when no arguments are specified.


© 2024 DataStax