sstablesplit

Splits SSTable files into multiple SSTables of a maximum designated size while offline.

For example, if a major compaction with SizeTieredCompactionStrategy produced an excessively large SSTable, split that SSTable so that its data is compacted again before the next very large compaction.
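As a rough planning aid, the number of output SSTables can be estimated by dividing the size of the oversized SSTable by the maximum output size and rounding up. The sketch below is only an estimate under that assumption; the helper name estimate_split_count is hypothetical and not part of the tool, and actual counts can differ because output SSTables are not necessarily filled to exactly the maximum size.

```shell
# Hypothetical helper: estimate how many SSTables a split produces.
# $1 = size of the source SSTable in MB
# $2 = maximum output size in MB (defaults to 50, the tool's default)
estimate_split_count() {
    size_mb=$1
    max_mb=${2:-50}
    # Integer ceiling division: a partial remainder still needs one more file.
    echo $(( (size_mb + max_mb - 1) / max_mb ))
}

# A 1200 MB SSTable split at the default 50 MB
estimate_split_count 1200
```

With these inputs the estimate is 24 output SSTables.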

Restriction: Stop DataStax Enterprise before you run this command.
The default location of this SSTable tool depends on the type of installation:
  • Package installations: /usr/bin/
  • Tarball installations: installation_location/resources/cassandra/tools/bin

Synopsis

sstablesplit [--debug] [-h] [--no-snapshot] [-s max_size_in_MB] sstable_filepath [sstable_filepath ...]
Tip: SSTable tools work offline from the DataStax Enterprise database. To pass a JVM parameter, specify it on the command line. For example, to change the maximum heap size:
MAX_HEAP=2g sstabletoolname
Table 1. Legend
Syntax conventions               Description
UPPERCASE                        Literal keyword.
Lowercase                        Not literal.
Italics                          Variable value. Replace with a valid option or user-defined value.
[ ]                              Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.
( )                              Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.
|                                Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.
...                              Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.
'Literal string'                 Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case.
{ key:value }                    Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value.
<datatype1,datatype2>            Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.
cql_statement;                   End CQL statement. A semicolon ( ; ) terminates all CQL statements.
[ -- ]                           Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.
' <schema> ... </schema> '       Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.
@xml_entity='xml_entity_type'    Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files.

Definition

The short form and long form parameters are comma-separated.

Command arguments

--debug
Display stack traces.
-h, --help
Display the usage and listing of the commands.
--no-snapshot
Do not snapshot SSTables before splitting.
-s, --size max_size_in_MB
Maximum size in MB for output SSTables. Default: 50.
sstable_filepath
Filepath to an SSTable.
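
To show how the arguments above combine, the following sketch assembles an sstablesplit command line and prints it without running anything. The helper name build_split_cmd and its "yes"/"no" switch are hypothetical; only the -s and --no-snapshot options come from this page. Paths containing spaces are not handled.

```shell
# Hypothetical helper: print the sstablesplit command line without running it.
# $1 = maximum output size in MB
# $2 = "yes" to skip the pre-split snapshot, anything else to keep it
# remaining arguments = SSTable file paths
build_split_cmd() {
    max_mb=$1
    skip_snapshot=$2
    shift 2
    cmd="sstablesplit -s $max_mb"
    if [ "$skip_snapshot" = "yes" ]; then
        cmd="$cmd --no-snapshot"
    fi
    # Options come first, then every SSTable path.
    for path in "$@"; do
        cmd="$cmd $path"
    done
    echo "$cmd"
}

build_split_cmd 10 no /var/lib/cassandra/data/cycling/cyclist_category-e1f76e21ce4311e8949e33016bf887c0/aa-1-bti-Data.db
```

Printing the command first makes it easy to confirm that the size follows -s before running the tool against stopped nodes.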

Examples

Verify DataStax Enterprise is not running

nodetool status
Datacenter: Graph
================================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens       Owns    Host ID                               Rack
UN  10.200.177.92  265.04 KiB  1            ?       980cab6a-2e5d-44c6-b897-0733dde580ac  rack1
DN  10.200.177.94  426.21 KiB  1            ?       7ecbbc0c-627d-403e-b8cc-a2daa93d9ad3  rack1
Restriction: Stop DataStax Enterprise before you run this command.
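
The status check can be scripted by reading the first two columns of the nodetool status output shown above. The sketch below assumes that layout (status/state code in the first column, address in the second); the helper name node_state is hypothetical.

```shell
# Hypothetical helper: read `nodetool status` output on stdin and print the
# two-letter status/state code (for example UN or DN) for one address.
# Prints "unknown" if the address does not appear in the output.
node_state() {
    awk -v addr="$1" '
        $2 == addr { print $1; found = 1 }
        END { if (!found) print "unknown" }
    '
}

# Usage sketch (run where nodetool can reach a live node in the cluster):
# nodetool status | node_state 10.200.177.94
```

For the sample output above, the address 10.200.177.94 yields DN, which indicates that the node is down.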

Split SSTables to 10 MB

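sstablesplit -s 10 /var/lib/cassandra/data/cycling/cyclist_category-e1f76e21ce4311e8949e33016bf887c0/aa-1-bti-Statistics.db

Specify the size with -s. If the size is passed as a bare argument, the tool treats it as a file path and uses the default split size of 50 MB:

sstablesplit /var/lib/cassandra/data/cycling/cyclist_category-e1f76e21ce4311e8949e33016bf887c0/aa-1-bti-Statistics.db 10
Skipping inexisting file 10
        Skipping /var/lib/cassandra/data/cycling/cyclist_category-e1f76e21ce4311e8949e33016bf887c0/aa-1-bti-Data.db: it's size (0.000 MB) is less than the split size (50 MB)
        No sstables needed splitting.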
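
The output shows that the tool skips any SSTable smaller than the split size. A similar check can be made beforehand to see which data files are candidates. This is a minimal sketch with a hypothetical helper name, needs_split, and it assumes that 1 MB means 1024 * 1024 bytes; the tool's own size accounting may differ.

```shell
# Hypothetical helper: succeed only if the file is at least as large as the
# split size, mirroring the "less than the split size" skip message.
# $1 = path to a Data.db file
# $2 = split size in MB (defaults to 50)
needs_split() {
    # Arithmetic expansion strips any padding that wc adds to its output.
    size_bytes=$(( $(wc -c < "$1") ))
    max_mb=${2:-50}
    [ "$size_bytes" -ge $(( max_mb * 1024 * 1024 )) ]
}

# Usage sketch (placeholder file name; substitute a real SSTable data file):
# needs_split aa-1-bti-Data.db 10 && echo "candidate for splitting"
```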