sstablescrub
Scrubs the SSTables for the provided table.
The sstablescrub utility is an offline version of nodetool scrub.
It attempts to remove the corrupted parts while preserving non-corrupted data.
Because sstablescrub runs offline, it can correct errors that nodetool scrub cannot.
If an SSTable cannot be read due to corruption, it will be left on disk.
If scrubbing drops rows, the new SSTables are marked as unrepaired.
However, if no bad rows are detected, then the SSTable keeps its original repairedAt field, which denotes the time of the repair.
Stop DSE before you run this command.
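One way to see whether an SSTable kept its repairedAt value after a scrub is to read its metadata. The sketch below is illustrative only: it assumes the sstablemetadata tool is available and that its output contains a line of the form Repaired at: <milliseconds>, where 0 means unrepaired. The function name repair_state is hypothetical.

```shell
# Classify an SSTable as repaired or unrepaired from metadata text on stdin.
# ASSUMPTION: the input contains a line "Repaired at: <millis>", 0 = unrepaired.
repair_state() {
    awk -F': *' '
        /^Repaired at:/ {
            split($2, parts, " ")      # keep the number, drop any trailing date text
            print (parts[1] + 0 == 0 ? "unrepaired" : "repaired")
            found = 1
            exit
        }
        END { if (!found) print "unknown" }
    '
}

# Example call (the path is a placeholder):
#   sstablemetadata /path/to/table-dir/SSTABLE-Data.db | repair_state
```

Running this before and after a scrub shows whether the scrub moved the SSTable into the unrepaired set.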
The default location of this SSTable tool depends on the type of installation:
- Package installations: /usr/bin/
- Tarball installations: INSTALL_DIRECTORY/resources/cassandra/tools/bin
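If you are not sure which installation type a node uses, a small helper can probe both default locations. This is a minimal sketch: the function name find_sstablescrub is hypothetical, and INSTALL_DIRECTORY stands for your own tarball path.

```shell
# Print the path of the first executable sstablescrub found in the
# directories passed as arguments; return 1 if none of them has one.
find_sstablescrub() {
    for dir in "$@"; do
        if [ -x "$dir/sstablescrub" ]; then
            printf '%s\n' "$dir/sstablescrub"
            return 0
        fi
    done
    return 1
}

# Example call (set INSTALL_DIRECTORY to your tarball location first):
#   find_sstablescrub /usr/bin "$INSTALL_DIRECTORY/resources/cassandra/tools/bin"
```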
Synopsis
sstablescrub
[--debug] [-e <arg>] [-h] [-j <arg>] [-m] [-n] [-r] [-s] [-t <number of days>] [-v]
<keyspace_name> <table_name> [--sstable-files <arg>]
Syntax legend
| Syntax conventions | Description |
|---|---|
| Italic, bold, or | Syntax diagrams and code samples use one or more of these styles to mark placeholders for variable values. Replace placeholders with a valid option or your own user-defined value. In CQL statements, angle brackets are required to enclose data types in a set, list, map, or tuple. Separate the data types with a comma. In Search CQL statements, angle brackets are used to identify the entity and literal value to overwrite the XML element in the schema and |
| [ ] | Square brackets surround optional command arguments. Do not type the square brackets. |
| ( ) | Parentheses identify a group to choose from. Do not type the parentheses. |
| \| | A pipe separates alternative elements. Type any one of the elements. Do not type the pipe. |
| ... | Indicates that you can repeat the syntax element as often as required. |
| ' ' | Single quotation marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case. For Search CQL only: Single quotation marks surround an entire XML schema declaration. |
| { } | Map collection. Curly braces enclose maps. |
| ; | Ends a CQL statement. |
| -- | Separate command line options from command arguments with two hyphens. This syntax is useful when arguments might be mistaken for command line options. |
Options
If an option has a short and long form, both forms are given, separated by a comma.
- --debug
-
Display stack traces.
- -e, --header-fix argument
-
Check SSTable serialization-headers and repair issues:
  - validate-only: Validate serialization-headers only. Do not attempt any repairs and do not continue with the scrub once the validation is complete.
  - validate (default): Validate serialization-headers and continue with the scrub once the validation is complete.
  - fix-only: Validate and repair only the serialization-headers. Do not continue with the scrub once serialization-header validation and repairs are complete.
  - fix: Validate and repair serialization-headers and perform a normal scrub. Do not repair and do not continue with the scrub if serialization-header validation encounters errors.
  - off: Do not perform serialization-header validation checks.
- -h, --help
-
Display the usage and listing of the commands.
- -j, --jobs
-
Number of SSTables to scrub simultaneously. Defaults to the smaller of the number of available processors and 8.
- keyspace_name
-
Keyspace name. Required.
- -m, --manifest-check
-
Check and repair only the leveled manifest. Do not scrub the SSTables.
- -n, --no-validate
-
Do not validate columns using column validator.
- -r, --reinsert-overflowed-ttl
-
Rewrite rows with overflowed expiration date affected by CASSANDRA-14092 with the maximum supported expiration date of 2038-01-19T03:14:06+00:00. Rows are rewritten with the original timestamp incremented by one millisecond to override/supersede any potential tombstone that might have been generated during compaction of the affected rows. See Recovering expired data caused by TTL year 2038 problem.
- -s, --skip-corrupted
-
Skips corrupt rows in counter tables.
- --sstable-files
-
Instead of processing all SSTables in the default data directories, process only the SSTables specified via this option. If a single SSTable file is specified, only that SSTable is processed. If a directory is specified, all SSTables within that directory are processed. Snapshots and backups are not supported with this option.
- table_name
-
Table name. Required.
- -t
-
This is a destructive operation and should only be used under the guidance of DataStax Support.
The only time to use -t is when the system clock on a node is in the future, because that makes the tombstone unpurgeable.
Provide a number of days from 1 to 1000. sstablescrub examines all deletion times, and changes the value of timestamp and local-deletion-time to now if any deletion times are equal to or greater than the specified number of days in the future. All deletion times that extend into the future beyond the given number of days are reset to the current time.
Default: 0 (disabled)
For example:
sstablescrub -v -t 1 <keyspace_name> <table_name>
Recommended to use with -v for verbose logging so you can see which partition and cluster is updated.
- -v, --verbose
-
Enable verbose console output.
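Two of the rules in the option list are simple arithmetic, and the following sketch spells them out. It is not how sstablescrub is implemented. The function names are hypothetical, and for simplicity all times are treated as epoch seconds, which is an assumption about units rather than something the option text states.

```shell
# Default for -j: the smaller of the processor count and 8.
default_jobs() {
    cpus=$1
    if [ "$cpus" -lt 8 ]; then echo "$cpus"; else echo 8; fi
}

# Rule for -t: a deletion time that lies <days> days or more in the future
# is reset to "now"; anything else is left alone. 0 days disables the rule.
# ASSUMPTION: deletion and now are both epoch seconds.
deletion_time_after_scrub() {
    deletion=$1
    now=$2
    days=$3
    threshold=$(( now + days * 86400 ))
    if [ "$days" -gt 0 ] && [ "$deletion" -ge "$threshold" ]; then
        echo "$now"
    else
        echo "$deletion"
    fi
}

# Example call for the -j default on the current machine:
#   default_jobs "$(getconf _NPROCESSORS_ONLN)"
```

With -t 1, a deletion time two days ahead of the current time is reset, while one that is only an hour ahead is kept.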
Examples
Verify DataStax Enterprise is not running
nodetool status
Datacenter: Graph
================================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.200.177.92 265.04 KiB 1 ? 980cab6a-2e5d-44c6-b897-0733dde580ac rack1
DN 10.200.177.94 426.21 KiB 1 ? 7ecbbc0c-627d-403e-b8cc-a2daa93d9ad3 rack1
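The status check above can be scripted so that a scrub only proceeds when the target node reports DN. The sketch below is a hedged example that parses nodetool status output in the layout shown above. The function name node_state is hypothetical, and nodetool status is assumed to be run on a node that is still up.

```shell
# Print the status/state code (for example UN or DN) that nodetool status
# output on stdin reports for the given address; return 1 if not listed.
node_state() {
    awk -v addr="$1" '
        $2 == addr { print $1; found = 1; exit }
        END { if (!found) exit 1 }
    '
}

# Example call, using an address from the output above:
#   nodetool status | node_state 10.200.177.94
```

For the sample output above, the address 10.200.177.94 yields DN, which is the state to expect for the node you are about to scrub.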
Scrub all SSTables for the calendar table
sstablescrub cycling calendar
Scrub only particular SSTables for the calendar table
sstablescrub cycling calendar --sstable-files /var/lib/cassandra/data/cycling/calendar-eebb/ac-1-bti-Data.db \
/var/lib/cassandra/data/cycling/calendar-aacc/ac-2-bti-Data.db
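Because --sstable-files does not support snapshots and backups, it can help to screen paths before passing them to the tool. The following is a minimal sketch. It assumes the usual data directory layout in which snapshots and incremental backups sit in snapshots/ and backups/ subdirectories of the table directory, and the function name is hypothetical.

```shell
# Return 0 if the path looks usable with --sstable-files, or 1 if it points
# into (or at) a snapshots/ or backups/ directory.
# ASSUMPTION: snapshots and backups live under directories with those names.
is_live_sstable_path() {
    case "$1" in
        */snapshots/*|*/backups/*|*/snapshots|*/backups) return 1 ;;
        *) return 0 ;;
    esac
}

# Example call:
#   is_live_sstable_path /var/lib/cassandra/data/cycling/calendar-eebb/ac-1-bti-Data.db
```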