sstablescrub
Scrubs the SSTable for the provided table.
cassandra.yaml
The location of the cassandra.yaml file depends on the type of installation:
Installation type | Location |
---|---|
Package installations | /etc/dse/cassandra/cassandra.yaml |
Tarball installations | installation_location/resources/cassandra/conf/cassandra.yaml |
The sstablescrub utility is an offline version of nodetool scrub. It attempts to remove the corrupted parts while preserving non-corrupted data. Because sstablescrub runs offline, it can correct errors that nodetool scrub cannot. If an SSTable cannot be read due to corruption, it will be left on disk.
If scrubbing results in dropping rows, the new SSTables become unrepaired. However, if no bad rows are detected, the SSTable keeps its original repairedAt field, which denotes the time of the repair.
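One way to see whether a scrub changed the repair state is to inspect the SSTable metadata before and after. The sketch below is a hypothetical helper, not part of sstablescrub. It assumes the metadata dump (for example from the sstablemetadata tool) contains a line of the form `Repaired at: <millis>`, with 0 meaning unrepaired; that output format is an assumption and may differ between versions.

```shell
# Hypothetical helper: reads metadata text on stdin and prints
# "unrepaired", "repaired", or "unknown".
# ASSUMPTION: the input contains a line "Repaired at: <millis>",
# where 0 means the SSTable is unrepaired.
repair_state() {
  awk -F': *' '
    /^Repaired at:/ {
      # Force a numeric comparison on the value after the colon.
      print (($2 + 0) == 0 ? "unrepaired" : "repaired")
      found = 1
    }
    END { if (!found) print "unknown" }
  '
}

# Sample input only; on a real node the text would come from the metadata tool.
printf 'Repaired at: 0\n' | repair_state
```

Running the helper on metadata captured before and after a scrub shows whether the SSTable kept its repairedAt value.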
Synopsis
sstablescrub [--debug] [-e arg] [-h] [-j arg] [-m] [-n] [-r] [-s] [-v] keyspace_name table_name
Syntax conventions | Description |
---|---|
UPPERCASE | Literal keyword. |
Lowercase | Not literal. |
Italics | Variable value. Replace with a valid option or user-defined value. |
[ ] | Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets. |
( ) | Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses. |
\| | Or. A vertical bar ( \| ) separates alternative elements. Type any one of the elements. Do not type the vertical bar. |
... | Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required. |
'Literal string' | Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case. |
{ key:value } | Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value. |
<datatype1,datatype2> | Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma. |
cql_statement; | End CQL statement. A semicolon ( ; ) terminates all CQL statements. |
[ -- ] | Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options. |
' <schema> ... </schema> ' | Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration. |
@xml_entity='xml_entity_type' |
Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files. |
Definition
The short form and long form parameters are comma-separated.
Command arguments
- --debug
- Display stack traces.
- -e, --header-fix argument
- Check SSTable serialization-headers and repair issues. Takes the following arguments:
- validate-only
- Validate serialization-headers only. Do not attempt any repairs and do not continue with the scrub once the validation is complete.
- validate
- Validate serialization-headers and continue with the scrub once the validation is complete. (Default)
- fix-only
- Validate and repair only the serialization-headers. Do not continue with the scrub once serialization-header validation and repairs are complete.
- fix
- Validate and repair serialization-headers and perform a normal scrub. Do not repair and do not continue with the scrub if serialization-header validation encounters errors.
- off
- Do not perform serialization-header validation checks.
- -h, --help
- Display the usage and listing of the commands.
- -j, --jobs
- Number of SSTables to scrub simultaneously. Defaults to the smaller of the number of available processors and 8.
- keyspace_name
- Keyspace name. Required. Overrides the client_encryption_options in cassandra.yaml.
- -m, --manifest-check
- Check and repair only the leveled manifest. Do not scrub the SSTables.
- -n, --no-validate
- Do not validate columns using column validator.
- -r, --reinsert-overflowed-ttl
- Rewrite rows with overflowed expiration date affected by CASSANDRA-14092 with the maximum supported expiration date of 2038-01-19T03:14:06+00:00. Rows are rewritten with the original timestamp incremented by one millisecond to override/supersede any potential tombstone that might have been generated during compaction of the affected rows. See /en/dse-trblshoot/doc/troubleshooting/recoveringTtlYear2038Problem.html.
- -s, --skip-corrupted
- Skips corrupt rows in counter tables.
- table_name
- Table name. Required.
- -v, --verbose
- Verbose output.
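The option rules above can be sketched as a small shell wrapper. All three function names are hypothetical helpers introduced for illustration. The wrapper only prints a command line and does not run sstablescrub; it checks the -e argument against the documented modes and applies the documented -j default (the smaller of the processor count and 8).

```shell
# Hypothetical helper: succeeds only for the documented header-fix modes.
valid_header_fix_mode() {
  case "$1" in
    validate-only|validate|fix-only|fix|off) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: the documented default for -j/--jobs,
# i.e. the smaller of the given processor count and 8.
default_scrub_jobs() {
  if [ "$1" -lt 8 ]; then echo "$1"; else echo 8; fi
}

# Hypothetical helper: print (do not run) a sstablescrub command line.
# Arguments: header-fix mode, processor count, keyspace name, table name.
print_scrub_command() {
  valid_header_fix_mode "$1" || {
    echo "unknown header-fix mode: $1" >&2
    return 1
  }
  echo "sstablescrub -e $1 -j $(default_scrub_jobs "$2") $3 $4"
}

# Example with the keyspace and table used in this page.
print_scrub_command validate 4 cycling calendar
```

Printing the command first makes it easy to review the options before running the scrub on an offline node.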
Examples
Verify DataStax Enterprise is not running
nodetool status
Datacenter: Graph
================================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns  Host ID                               Rack
UN  10.200.177.92  265.04 KiB  1       ?     980cab6a-2e5d-44c6-b897-0733dde580ac  rack1
DN  10.200.177.94  426.21 KiB  1       ?     7ecbbc0c-627d-403e-b8cc-a2daa93d9ad3  rack1
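To pick out the stopped nodes from this output in a script, a filter along the following lines may help. It is a minimal sketch: down_nodes is a hypothetical helper, and it assumes node rows begin with a two-letter status/state code followed by the address, as in the output above.

```shell
# Hypothetical helper: reads nodetool status output on stdin and prints
# the address of every node whose status is Down (first letter D),
# in any of the states Normal, Leaving, Joining, or Moving.
down_nodes() {
  awk '$1 ~ /^D[NLJM]$/ { print $2 }'
}

# Sample rows copied from the output above; on a live cluster use:
#   nodetool status | down_nodes
down_nodes <<'EOF'
UN 10.200.177.92 265.04 KiB 1 ? 980cab6a-2e5d-44c6-b897-0733dde580ac rack1
DN 10.200.177.94 426.21 KiB 1 ? 7ecbbc0c-627d-403e-b8cc-a2daa93d9ad3 rack1
EOF
```

If the node to be scrubbed appears in the list, it is reported as down and sstablescrub can be run against it.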
Scrub all SSTables for the calendar table
sstablescrub cycling calendar
Scrub only particular SSTables for the calendar table
sstablescrub cycling calendar --stable-files /var/lib/cassandra/data/cycling/calendar-eebb/ac-1-bti-Data.db \
/var/lib/cassandra/data/cycling/calendar-aacc/ac-2-bti-Data.db