nodetool scrub
Creates a snapshot and then rebuilds SSTables for one or more tables on a node.
If possible, use nodetool upgradesstables instead of scrub.
Scrub automatically discards broken data and removes any tombstone rows that have exceeded the grace period of the table. If partition key values do not match the column data type, then the partition is considered corrupt and the process automatically stops.
The sstablescrub utility is an offline version of nodetool scrub.
|
For LeveledCompactionStrategy (LCS), |
Synopsis
nodetool [<connection_options>] scrub
[-j <num_jobs>] [-n] [-ns] [-o] [-r] [-s]
[--] [<keyspace_name> <table_name> [<table_name> ...]]
Syntax legend
| Syntax conventions | Description |
|---|---|
| Italic, bold, or < > | Syntax diagrams and code samples use one or more of these styles to mark placeholders for variable values. Replace placeholders with a valid option or your own user-defined value. In CQL statements, angle brackets are required to enclose data types in a set, list, map, or tuple. Separate the data types with a comma. In Search CQL statements, use angle brackets to identify the entity and literal value to use when overwriting the XML element in the schema. |
| [ ] | Square brackets surround optional command arguments. Do not type the square brackets. |
| ( ) | Parentheses identify a group to choose from. Do not type the parentheses. |
| \| | A pipe separates alternative elements. Type any one of the elements. Do not type the pipe. |
| ... | Indicates that you can repeat the syntax element as often as required. |
| ' ' | Use single quotation marks to surround literal strings in CQL statements. Use single quotation marks to preserve uppercase. For Search CQL only: Single quotation marks surround an entire XML schema declaration. |
| { } | Map collection. Curly braces enclose maps. |
| ; | Ends a CQL statement. |
| -- | Separate command line options from command arguments with two hyphens. This syntax is useful when arguments might be mistaken for command line options. |
Options
If an option has a short and long form, both forms are given, separated by a comma.
- -h, --host hostname
-
The hostname or IP address of a remote node or nodes. When omitted, the default is the local machine.
- -p, --port jmx_port
-
The JMX port number.
- -pw, --password jmxpassword
-
The JMX password for authenticating with secure JMX. If a password is not provided, you are prompted to enter one.
- -pwf, --password-file jmx_password_filepath
-
The filepath to the file that stores JMX authentication credentials.
- -u, --username jmx_username
-
The username for authenticating with secure JMX.
- -j, --jobs
-
The number of SSTables to scrub simultaneously. Set to 0 to use all available compaction threads.
Default: 2
- keyspace_name
-
The keyspace name.
- -n, --no-validate
-
Do not validate columns using the column validator.
- -ns, --no-snapshot
-
Scrubbed tables (column families) are snapshotted first as long as the disableSnapshot option is set to false (default). Specify -ns to skip the snapshot.
Default: false
- -o, --overwrite-ttl
-
Adjust the time-to-live (TTL) setting while cleaning up SSTables.
Takes the following arguments:
- NONE: Do not overwrite the TTL values in the data.
- NO_TTL: Remove all TTL values from the data to scrub. Use this option in cases such as when the TTL:
  - is set for immediate expiration as you prepare to restore backed up SSTables.
  - is wrong after the insertion of data.
  For example:
  1. Disable compaction on the node with nodetool disableautocompaction. This step is crucial because otherwise the data may be removed permanently during compaction.
  2. Copy the SSTables containing entries with overflowed expiration TTL to the data directory.
  3. Run nodetool import in a live node to load the recovered SSTables.
  4. Run (with name value substitutions):
     nodetool scrub --overwrite-ttl NO_TTL KEYSPACE_NAME TABLE_NAME
  5. Verify that nodetool scrub removed TTL assignments.
  6. Verify that nodetool scrub recovered the missing entries.
  7. Re-enable compactions.
- REINSERT_OVERFLOWED_TTL: This argument is functionally the same as the -r, --reinsert-overflowed-ttl option.
Default: NONE
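The NO_TTL recovery steps can be sketched as a shell script. This is a minimal sketch under stated assumptions, not a tested procedure: the keyspace, table, and both directory paths are placeholders, run and recover_overflowed_ttl are hypothetical helper names, and the script only prints the commands unless DRY_RUN=0 is set. It also assumes that nodetool import takes a keyspace, a table, and a directory, and that nodetool enableautocompaction is the command that re-enables compaction; confirm both against your version.

```shell
#!/usr/bin/env bash
# Sketch of the NO_TTL recovery steps. All values below are placeholders.
KEYSPACE="${KEYSPACE:-cycling}"
TABLE="${TABLE:-cyclist_id}"
BACKUP_DIR="${BACKUP_DIR:-/path/to/backup}"   # hypothetical: recovered SSTables
IMPORT_DIR="${IMPORT_DIR:-/path/to/import}"   # hypothetical: directory to import from
DRY_RUN="${DRY_RUN:-1}"                       # 1 = print commands only

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

recover_overflowed_ttl() {
  # Step 1: stop compaction so expired data is not purged during recovery.
  run nodetool disableautocompaction || return 1
  # Step 2: copy the affected SSTables into place.
  run cp -R "$BACKUP_DIR/." "$IMPORT_DIR/" || return 1
  # Step 3: load the SSTables on the live node.
  run nodetool import "$KEYSPACE" "$TABLE" "$IMPORT_DIR" || return 1
  # Step 4: scrub and strip the TTL values.
  run nodetool scrub --overwrite-ttl NO_TTL "$KEYSPACE" "$TABLE" || return 1
  # Steps 5-7 are left manual so compaction is not re-enabled before verification.
  echo "# Verify the TTLs and recovered entries, then run: nodetool enableautocompaction"
}

recover_overflowed_ttl
```

The script stops before re-enabling compaction on purpose: the verification steps have to pass first, otherwise compaction may remove the data permanently.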
- -r, --reinsert-overflowed-ttl
-
Rewrite rows with an overflowed expiration date affected by CASSANDRA-14092 with the maximum supported expiration date of 2038-01-19T03:14:06+00:00. Rows are rewritten with the original timestamp incremented by one millisecond to override or supersede any potential tombstone that may have been generated during compaction of the affected rows.
- -s, --skip-corrupted
-
Skip corrupted partitions even when scrubbing counter tables.
Default: false
- table_name
-
One or more table names, separated by a space.
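As a quick check of the maximum expiration date given for -r, --reinsert-overflowed-ttl: the timestamp 2038-01-19T03:14:06+00:00 is 2147483646 seconds after the Unix epoch, one second short of the largest signed 32-bit value. The sketch below assumes GNU date, because the -d @SECONDS form is not available in every date implementation; the variable names are illustrative only.

```shell
# Largest signed 32-bit value minus one, in seconds since the Unix epoch.
max_expiry_epoch=$(( (1 << 31) - 2 ))

# Format it as a UTC timestamp; requires GNU date for the -d @SECONDS syntax.
max_expiry_date=$(date -u -d "@${max_expiry_epoch}" '+%Y-%m-%dT%H:%M:%S+00:00')

echo "$max_expiry_epoch"
echo "$max_expiry_date"
```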
Examples
Scrub a single table in a keyspace
nodetool scrub cycling cyclist_id
Scrub an entire keyspace while skipping corrupted partitions
nodetool scrub -s cycling
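To script a scrub of several tables, the command line from the Synopsis can be assembled in a small helper. This is a sketch only: build_scrub_cmd is a hypothetical function name, cyclist_name is an assumed second table, and the helper prints the command instead of running it so that it can be reviewed first.

```shell
# Hypothetical helper: print a nodetool scrub command for one keyspace and
# one or more tables, with -j for parallelism and -- to end the options.
build_scrub_cmd() {
  num_jobs="$1"
  keyspace="$2"
  shift 2
  printf 'nodetool scrub -j %s -- %s' "$num_jobs" "$keyspace"
  # Append each remaining argument as a table name.
  for table in "$@"; do
    printf ' %s' "$table"
  done
  printf '\n'
}

build_scrub_cmd 2 cycling cyclist_id cyclist_name
```

The two hyphens are optional here, but they keep a keyspace or table name from being mistaken for a command line option.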