sstablescrub
Scrubs the SSTable for the provided table.
The sstablescrub utility is an offline version of nodetool scrub.
It attempts to remove the corrupted parts while preserving non-corrupted data.
Because sstablescrub runs offline, it can correct errors that nodetool scrub cannot.
If an SSTable cannot be read due to corruption, it will be left on disk.
If scrubbing drops any rows, the new SSTables are marked as unrepaired.
However, if no bad rows are detected, the SSTable keeps its original repairedAt field, which denotes the time of the repair.
Restriction: Stop DataStax Enterprise before you run this command.
The default location of this SSTable tool depends on the type of installation:
- Package installations: /usr/bin/
- Tarball installations: <installation_location>/resources/cassandra/tools/bin
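As a minimal sketch of how these default locations might be checked from a script, the helper below looks on the PATH first and then in the two directories listed above. The function name `find_sstablescrub` and the `DSE_HOME` variable (standing in for <installation_location>) are assumptions made for illustration; they are not part of DataStax Enterprise.

```shell
# Print the path of the first executable sstablescrub found, or return 1.
# DSE_HOME is an assumed variable standing in for <installation_location>.
find_sstablescrub() {
  local found dir
  # Prefer whatever is already on the PATH.
  if found=$(command -v sstablescrub 2>/dev/null); then
    printf '%s\n' "$found"
    return 0
  fi
  # Fall back to the package and tarball default locations.
  for dir in /usr/bin "${DSE_HOME:-}/resources/cassandra/tools/bin"; do
    if [ -x "$dir/sstablescrub" ]; then
      printf '%s\n' "$dir/sstablescrub"
      return 0
    fi
  done
  return 1
}
```

With `DSE_HOME` set to the tarball installation directory, `find_sstablescrub` prints the full path of the tool, or returns a non-zero status if it is not found in any of the checked locations.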
Synopsis
sstablescrub
[--debug] [-e <arg>] [-h] [-j <arg>] [-m] [-n] [-r] [-s] [-t <number of days>] [-v]
<keyspace_name> <table_name> [--sstable-files <arg>]
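Syntax conventions | Description |
---|---|
UPPERCASE | Literal keyword. |
Lowercase | Not literal. |
`<Italics>` | Variable value. Replace with a valid option or user-defined value. |
`[ ]` | Optional. Square brackets (`[ ]`) surround optional command arguments. |
`( )` | Group. Parentheses (`( )`) identify a group to choose from. |
`\|` | Or. A vertical bar (`\|`) separates alternative elements. |
`...` | Repeatable. An ellipsis (`...`) indicates that the element can be repeated. |
`'Literal string'` | Single quotation marks (`'`) surround literal strings in CQL statements. |
`{ key:value }` | Map collection. Braces (`{ }`) enclose map collections or key-value pairs. |
`<datatype1,datatype2>` | Set, list, map, or tuple. Angle brackets (`< >`) enclose data types in a set, list, map, or tuple. |
`cql_statement;` | End CQL statement. A semicolon (`;`) terminates a CQL statement. |
`[ -- ]` | Separate the command line options from the command arguments with two hyphens (`--`). |
`' <schema> ... </schema> '` | Search CQL only: Single quotation marks (`'`) surround an entire XML schema declaration. |
`@xml_entity='xml_entity_type'` | Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files. |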
Definition
The short form and long form parameters are comma-separated.
Command arguments
- --debug
  Display stack traces.
- -e, --header-fix argument
  Check SSTable serialization-headers and repair issues. Takes one of the following arguments:
  - validate-only
    Validate serialization-headers only. Do not attempt any repairs and do not continue with the scrub once the validation is complete.
  - validate
    Validate serialization-headers and continue with the scrub once the validation is complete. (Default)
  - fix-only
    Validate and repair only the serialization-headers. Do not continue with the scrub once serialization-header validation and repairs are complete.
  - fix
    Validate and repair serialization-headers and perform a normal scrub. Do not repair and do not continue with the scrub if serialization-header validation encounters errors.
  - off
    Do not perform serialization-header validation checks.
- -h, --help
  Display the usage and listing of the commands.
- -j, --jobs
  Number of SSTables to scrub simultaneously. Defaults to the smaller of the number of available processors and 8.
- keyspace_name
  Keyspace name. Required.
- -m, --manifest-check
  Check and repair only the leveled manifest. Do not scrub the SSTables.
- -n, --no-validate
  Do not validate columns using column validator.
- -r, --reinsert-overflowed-ttl
  Rewrite rows with overflowed expiration date affected by CASSANDRA-14092 with the maximum supported expiration date of 2038-01-19T03:14:06+00:00. Rows are rewritten with the original timestamp incremented by one millisecond to override/supersede any potential tombstone that might have been generated during compaction of the affected rows. See Recovering expired data caused by TTL year 2038 problem.
- -s, --skip-corrupted
  Skip corrupt rows in counter tables.
- --sstable-files
  Instead of processing all SSTables in the default data directories, process only the SSTables specified via this option. If a single SSTable file is specified, only that SSTable is processed. If a directory is specified, all SSTables within that directory are processed. Snapshots and backups are not supported with this option.
- table_name
  Table name. Required.
- -t number of days
  Given a number of days from 1 to 1000, examines all deletion times and changes the value of timestamp and of local-deletion-time to now if any deletion times are at least the specified number of days in the future. All deletion times that extend into the future beyond the given number of days are reset to the current time.
  Default: 0 (disables the flag).
  Command-line usage:
  sstablescrub -v -t 1 <keyspace_name> <table_name>
  The -v flag enables verbose logging so that you can see which partition and cluster are updated.
  This is a destructive operation and should only be used under the guidance of DataStax Support.
  The only time to use -t is when the system clock on a node is in the future, because that makes the tombstone unpurgeable.
- -v, --verbose
  Verbose output.
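The -j default and the -e argument values described above can be sketched as small shell helpers. This is only an illustration under stated assumptions: the function names are hypothetical, the processor count is passed in by the caller, and the last helper prints a command line for review rather than running sstablescrub.

```shell
# Default for -j as described above: the smaller of the processor count and 8.
default_scrub_jobs() {
  local cores="$1"
  if [ "$cores" -lt 8 ]; then
    echo "$cores"
  else
    echo 8
  fi
}

# Succeed only for the five --header-fix arguments listed above.
valid_header_fix() {
  case "$1" in
    validate-only|validate|fix-only|fix|off) return 0 ;;
    *) return 1 ;;
  esac
}

# Print (do not run) an sstablescrub command line for review.
# Arguments: header-fix argument, number of jobs, keyspace, table.
print_scrub_cmd() {
  local mode="$1" jobs="$2" keyspace="$3" table="$4"
  if ! valid_header_fix "$mode"; then
    echo "unknown --header-fix argument: $mode" >&2
    return 1
  fi
  echo "sstablescrub -e $mode -j $jobs $keyspace $table"
}
```

For example, `print_scrub_cmd validate-only 2 cycling calendar` prints `sstablescrub -e validate-only -j 2 cycling calendar`. On systems with GNU coreutils, `default_scrub_jobs "$(nproc)"` yields the job count that the tool would choose by default according to the description above.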
Examples
Verify DataStax Enterprise is not running
nodetool status
Datacenter: Graph
================================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.200.177.92 265.04 KiB 1 ? 980cab6a-2e5d-44c6-b897-0733dde580ac rack1
DN 10.200.177.94 426.21 KiB 1 ? 7ecbbc0c-627d-403e-b8cc-a2daa93d9ad3 rack1
Restriction: Stop DataStax Enterprise before you run this command.
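The status check above can be scripted. The sketch below reads `nodetool status` output on standard input and reports the two-letter status/state code for one address. It assumes the column layout shown in the sample output and that the status output comes from a node that is still up; the function names are hypothetical.

```shell
# Read `nodetool status` output on stdin and print the two-letter
# status/state code (for example UN or DN) for the given address.
# Returns non-zero if the address does not appear in the output.
node_state() {
  awk -v addr="$1" '$2 == addr { print $1; found = 1 } END { exit !found }'
}

# Succeed only if the given address is reported as Down/Normal (DN).
node_is_down() {
  [ "$(node_state "$1")" = "DN" ]
}
```

With the sample output above on standard input, `node_state 10.200.177.94` prints `DN`, so `node_is_down 10.200.177.94` succeeds, while the same check for 10.200.177.92 fails because that node is reported as `UN`.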
Scrub all SSTables for the calendar table
sstablescrub cycling calendar
Scrub only particular SSTables for the calendar table
sstablescrub cycling calendar --sstable-files /var/lib/cassandra/data/cycling/calendar-eebb/ac-1-bti-Data.db \
/var/lib/cassandra/data/cycling/calendar-aacc/ac-2-bti-Data.db
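When choosing particular SSTables, it can help to list the Data.db files in a table directory first. The following is a minimal sketch with a hypothetical helper name. It assumes that snapshots and backups live in subdirectories of the table directory, so it does not descend into them.

```shell
# Print the *-Data.db files directly inside one table directory, sorted.
# -maxdepth 1 keeps find out of subdirectories such as snapshots and backups.
list_data_files() {
  find "$1" -maxdepth 1 -type f -name '*-Data.db' | sort
}
```

For example, `list_data_files /var/lib/cassandra/data/cycling/calendar-eebb` prints one path per line. The paths can then be reviewed, filtered, and passed to --sstable-files as in the example above, provided they contain no whitespace.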