sstablepartitions

Identifies large partitions of SSTables.

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:
Package installations: /etc/dse/cassandra/cassandra.yaml
Tarball installations: installation_location/resources/cassandra/conf/cassandra.yaml

Identifies large partitions of SSTables and outputs the partition size in bytes, row count, cell count, and tombstone count.

Synopsis

sstablepartitions 
[-b] [-c cell_threshold] 
[-k partition_key] 
[-m] [-o tombstone_threshold]
[-r] [-t partition_threshold]
[-u] [-x partition_key | -y]
sstable_filepath | sstable_directory
Table 1. Legend
Syntax conventions Description
UPPERCASE Literal keyword.
Lowercase Not literal.
Italics Variable value. Replace with a valid option or user-defined value.
[ ] Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.
( ) Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.
| Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.
... Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.
'Literal string' Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case.
{ key:value } Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value.
<datatype1,datatype2> Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.
cql_statement; End CQL statement. A semicolon ( ; ) terminates all CQL statements.
[ -- ] Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.
' <schema> ... </schema> ' Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.
@xml_entity='xml_entity_type' Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files.

The short form and long form parameters are comma-separated.

Command arguments

-b, --backups
Include backups found in the data directories. Applies to recursive scans.
-c, --min-cells cell_threshold
Partition cell count threshold. Only partitions with a cell count equal to or greater than this value are output.
-k, --key partition_key
Partition keys to include.
-m, --csv
Produce machine-readable CSV output instead of the default human-readable output.
-o, --min-tombstones tombstone_threshold
Partition tombstone count threshold. Only partitions with a tombstone count equal to or greater than this value are output.
-r, --recursive
Scan the specified directory recursively for SSTables.
sstable_directory
The absolute path to the SSTable data directory. The data_file_directories property in cassandra.yaml defines the default directory.
sstable_filepath
The absolute or relative file path to the SSTable data file, ending in Data.db.
-t, --min-size partition_threshold
Partition size threshold in bytes. Only partitions with a size equal to or greater than this value are output.
-u, --current-timestamp
Include timestamp in output. The timestamp is the number of seconds since the epoch (Unix time) and is used for the TTL expiration calculation.
-x, --exclude-key partition_key
Partition key to exclude. Ignored if -y option is given.
-y, --partitions-only
Output only brief partition information. Excludes detailed per-partition row, cell, and tombstone information from both processing and output.
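
The options above can be combined in a single invocation. The following is a minimal sketch of assembling such a command line from a script, assuming Python 3 and that sstablepartitions is on the PATH. The helper name build_command and its parameter names are hypothetical; only long-form flags listed above are used.

```python
from typing import List, Optional


def build_command(path: str,
                  min_cells: Optional[int] = None,
                  min_tombstones: Optional[int] = None,
                  min_size: Optional[int] = None,
                  recursive: bool = False,
                  csv_output: bool = False) -> List[str]:
    """Assemble an sstablepartitions argument list (hypothetical helper)."""
    cmd = ["sstablepartitions"]
    if recursive:
        cmd.append("--recursive")
    if min_cells is not None:
        cmd += ["--min-cells", str(min_cells)]
    if min_tombstones is not None:
        cmd += ["--min-tombstones", str(min_tombstones)]
    if min_size is not None:
        cmd += ["--min-size", str(min_size)]
    if csv_output:
        cmd.append("--csv")
    # The SSTable file path or data directory is the final argument.
    cmd.append(path)
    return cmd


if __name__ == "__main__":
    # Illustrative path only; substitute a real SSTable file or directory.
    print(" ".join(build_command("/var/lib/cassandra/data", min_cells=10,
                                 recursive=True, csv_output=True)))
```

The resulting list can be handed to subprocess.run(cmd, capture_output=True, text=True) to run the tool and capture its output.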

Examples

Analyze partition statistics for all SSTables of a single table

sstablepartitions -r /var/lib/cassandra/data/stresscql/blogposts-7dd6dfc289b511e8a4a329556a9391cc/
Processing stresscql.blogposts-7dd6dfc289b511e8a4a329556a9391cc #3 (bti-aa) (6445137 bytes uncompressed, 5416338 bytes on disk)
Partition size            Row count           Cell count      Tombstone count
p50                    124                    1                    1                    1
p75                    149                    1                    1                    1
p90                    149                    2                    2                    1
p95                    179                    2                    2                    1
p99                    215                    3                    3                    1
p999                   258                    4                    4                    1
min                     51                    0                    0                    0
max                   8239                  179                  179                    1
count                56696
time                137676
        
Processing stresscql.blogposts-7dd6dfc289b511e8a4a329556a9391cc #4 (bti-aa) (230134 bytes uncompressed, 192999 bytes on disk)
Partition size            Row count           Cell count      Tombstone count
p50                    124                    1                    1                    1
p75                    124                    1                    1                    1
p90                    149                    1                    1                    1
p95                    149                    1                    1                    1
p99                    149                    1                    1                    1
p999                   179                    2                    2                    1
min                     51                    0                    0                    0
max                    446                   10                   10                    1
count                 2169
time                  3626
Note: The unit of measure for the partition size column is bytes.
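
If the human-readable summary needs to be consumed by a script, the percentile table can be read line by line. This is a minimal sketch that assumes the layout shown in the sample output above (a label followed by four numeric columns, plus single-value count and time lines); the function name parse_summary and the column keys are hypothetical, and the layout may differ between versions.

```python
from typing import Dict

# Column order assumed from the header line of the sample output.
COLUMNS = ["partition_size", "row_count", "cell_count", "tombstone_count"]

# Lines copied from the first summary block in the sample output.
SAMPLE = """\
p50                    124                    1                    1                    1
max                   8239                  179                  179                    1
count                56696
time                137676
"""


def parse_summary(text: str) -> Dict[str, object]:
    """Parse one summary block into a dict keyed by row label."""
    stats: Dict[str, object] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        label, values = fields[0], fields[1:]
        if label in ("count", "time") and len(values) == 1:
            stats[label] = int(values[0])
        elif len(values) == len(COLUMNS) and all(v.isdigit() for v in values):
            stats[label] = dict(zip(COLUMNS, map(int, values)))
        # Any other line (header, "Processing ...", notes) is ignored.
    return stats


if __name__ == "__main__":
    summary = parse_summary(SAMPLE)
    print(summary["max"]["partition_size"], summary["count"])
```

For anything beyond a quick check, the CSV output produced with -m is likely the more robust input for scripts.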

Output only partitions with a cell count equal to or greater than 10

sstablepartitions -c 10 /var/lib/cassandra/data/stresscql/blogposts-7dd6dfc289b511e8a4a329556a9391cc/aa-4-bti-Data.db
Processing stresscql.blogposts-7dd6dfc289b511e8a4a329556a9391cc #4 (bti-aa) (230134 bytes uncompressed, 192999 bytes on disk)
Partition: 'Fwl
Cc	xD06iw_]Q|[t[KzCI&	$' (46776c0b4363097815114430361169775f7f5d511b3b08177c5b745b4b1306007a434926091a24) live, position: 208502, size: 434, rows: 10, cells: 10, tombstones: 0 (row:0, range:0, complex:0, cell:0, row-TTLd:0, cell-TTLd:0)
Summary of stresscql.blogposts-7dd6dfc289b511e8a4a329556a9391cc #4 (bti-aa):
File: /home/dimitarndimitrov/.ccm/c13529-master/node1/data0/stresscql/blogposts-7dd6dfc289b511e8a4a329556a9391cc/aa-4-bti-Data.db
1 partitions match
Keys: Fwl
Cc	xD06iw_]Q|[t[KzCI&	$
Partition size            Row count           Cell count      Tombstone count
p50                    124                    1                    1                    1
p75                    124                    1                    1                    1
p90                    149                    1                    1                    1
p95                    149                    1                    1                    1
p99                    149                    1                    1                    1
p999                   179                    2                    2                    1
min                     51                    0                    0                    0
max                    446                   10                   10                    1
count                 2169
time                  4875
Note: The unit of measure for the partition size column is bytes.
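
The partition key in this example contains non-printable characters, which is why it wraps across lines. The hexadecimal string in parentheses after the key appears to be the raw key bytes, so it can be decoded to inspect the key unambiguously. A minimal sketch, assuming that interpretation; the helper name describe_key is hypothetical.

```python
# Hex string copied from the "Partition:" line in the sample output.
KEY_HEX = ("46776c0b4363097815114430361169775f7f5d511b3b08177c5b745b4b"
           "1306007a434926091a24")


def describe_key(key_hex: str) -> str:
    """Render key bytes with non-printable bytes shown as \\xNN escapes."""
    raw = bytes.fromhex(key_hex)
    # Printable ASCII is kept as-is; every other byte becomes \xNN.
    return "".join(chr(b) if 32 <= b < 127 else f"\\x{b:02x}" for b in raw)


if __name__ == "__main__":
    print(describe_key(KEY_HEX))
```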

Produce machine-readable CSV output

sstablepartitions -c 10 -m /var/lib/cassandra/data/stresscql/blogposts-7dd6dfc289b511e8a4a329556a9391cc/aa-4-bti-Data.db
key,keyBinary,live,offset,size,rowCount,cellCount,tombstoneCount,rowTombstoneCount,rangeTombstoneCount,complexTombstoneCount,cellTombstoneCount,rowTtlExpired,cellTtlExpired,directory,keyspace,table,index,snapshot,backup,generation,format,version
"Fwl
Cc	xD06iw_]Q|[t[KzCI&	$",46776c0b4363097815114430361169775f7f5d511b3b08177c5b745b4b1306007a434926091a24,true,208502,434,10,10,0,0,0,0,0,0,0,/home/dimitarndimitrov/.ccm/c13529-master/node1/data0/stresscql/blogposts-7dd6dfc289b511e8a4a329556a9391cc/aa-4-bti-Data.db,stresscql,blogposts,,,,4,bti,aa
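
Because a quoted key can contain line breaks, as in the row above, the CSV output should be read with a real CSV parser rather than split on newlines. The following is a minimal sketch using Python's standard csv module. The sample text is an abbreviated, made-up row that uses only the first eight column names from the header above; the function names read_partitions and partitions_over are hypothetical.

```python
import csv
import io
from typing import Dict, List

# Abbreviated illustrative sample: first eight columns of the real header,
# with a made-up key that contains an embedded line break.
SAMPLE = (
    "key,keyBinary,live,offset,size,rowCount,cellCount,tombstoneCount\n"
    '"a\nb",610a62,true,208502,434,10,10,0\n'
)


def read_partitions(csv_text: str) -> List[Dict[str, str]]:
    """Return one dict per partition row, keyed by the CSV header names."""
    # newline="" leaves line endings untouched so the csv module can
    # handle line breaks inside quoted fields itself.
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    return list(reader)


def partitions_over(rows: List[Dict[str, str]], min_size: int) -> List[Dict[str, str]]:
    """Keep rows whose size column is at least min_size bytes."""
    return [row for row in rows if int(row["size"]) >= min_size]


if __name__ == "__main__":
    rows = read_partitions(SAMPLE)
    for row in partitions_over(rows, 400):
        print(row["keyBinary"], row["size"], row["cellCount"])
```

When reading from a file instead of a string, open it with open(path, newline="") for the same reason.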