sstablepartitions

Identifies large partitions of SSTables and outputs the partition size in bytes, row count, cell count, and tombstone count.

The default location of this SSTable tool depends on the type of installation:

  • Package installations: /usr/bin/

  • Tarball installations: <installation_location>/resources/cassandra/tools/bin

Synopsis

sstablepartitions
[-b] [-c <cell_threshold>]
[-k <partition_key>]
[-m] [-o <tombstone_threshold>]
[-r] [-t <partition_threshold>]
[-u] [-x <partition_key> | -y]
<sstable_filepath> | <sstable_directory>

Syntax conventions

UPPERCASE

Literal keyword.

Lowercase

Not literal.

<Italics>

Variable value. Replace with a valid option or user-defined value.

[ ]

Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.

( )

Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.

|

Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.

...

Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.

'<Literal string>'

Single quotation marks ( ' ) must surround literal strings in CQL statements. Use single quotation marks to preserve uppercase.

{ <key>:<value> }

Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value.

<<datatype1>,<datatype2>>

Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.

cql_statement;

End CQL statement. A semicolon ( ; ) terminates all CQL statements.

[ -- ]

Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.

' <<schema> ... </schema> >'

Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.

@<xml_entity>='<xml_entity_type>'

Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files.

The short form and long form parameters are comma-separated.

Command arguments

-b, --backups

Include backups found in the data directories. Applies to recursive scans.

-c, --min-cells cell_threshold

Partition cell count threshold. Only partitions with a cell count equal to or greater than this value are reported.

-k, --key partition_key

Partition key to include.

-m, --csv

Produce machine-readable CSV output instead of the default human-readable output.

-o, --min-tombstones tombstone_threshold

Partition tombstone count threshold. Only partitions with a tombstone count equal to or greater than this value are reported.

-r, --recursive

Scan the specified directory recursively for SSTables.

sstable_directory

The filepath to the SSTable data directory. The data_file_directories property in cassandra.yaml defines the default directory.

sstable_filepath

The absolute or relative filepath to the SSTable data file, which ends in Data.db.

-t, --min-size partition_threshold

Partition size threshold in bytes. Only partitions with a size equal to or greater than this value are reported.

-u, --current-timestamp

Include the timestamp in the output. The timestamp is the number of seconds since the Unix epoch and is used to calculate TTL expiration.

-x, --exclude-key partition_key

Partition key to exclude. Ignored if -y option is given.

-y, --partitions-only

Output brief partition information only. Excludes detailed per-partition row, cell, and tombstone information from processing and output.
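When the tool is driven from a script, the options above can be assembled programmatically. The following is a minimal sketch, not part of the tool itself: the function name `build_sstablepartitions_command` and its parameter names are hypothetical, and only the long-form options listed in this section are used.

```python
from typing import List, Optional


def build_sstablepartitions_command(
    path: str,
    min_size: Optional[int] = None,
    min_cells: Optional[int] = None,
    min_tombstones: Optional[int] = None,
    recursive: bool = False,
    csv_output: bool = False,
    partitions_only: bool = False,
) -> List[str]:
    """Build an argument list for sstablepartitions (hypothetical helper).

    Only options documented in the Command arguments section are emitted.
    The path is an SSTable data file or an SSTable directory.
    """
    cmd = ["sstablepartitions"]
    if min_size is not None:
        cmd += ["--min-size", str(min_size)]  # threshold in bytes
    if min_cells is not None:
        cmd += ["--min-cells", str(min_cells)]
    if min_tombstones is not None:
        cmd += ["--min-tombstones", str(min_tombstones)]
    if recursive:
        cmd.append("--recursive")
    if csv_output:
        cmd.append("--csv")
    if partitions_only:
        cmd.append("--partitions-only")
    cmd.append(path)  # the file or directory argument comes last
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`, assuming the tool is on the `PATH` for your installation type.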

Examples

Analyze partition statistics for all SSTables of a single table

sstablepartitions -r /var/lib/cassandra/data/stresscql/blogposts-7dd6dfc289b511e8a4a329556a9391cc/
Processing stresscql.blogposts-7dd6dfc289b511e8a4a329556a9391cc #3 (bti-aa) (6445137 bytes uncompressed, 5416338 bytes on disk)
            Partition size            Row count           Cell count      Tombstone count
p50                    124                    1                    1                    1
p75                    149                    1                    1                    1
p90                    149                    2                    2                    1
p95                    179                    2                    2                    1
p99                    215                    3                    3                    1
p999                   258                    4                    4                    1
min                     51                    0                    0                    0
max                   8239                  179                  179                    1
count                56696
time                137676

Processing stresscql.blogposts-7dd6dfc289b511e8a4a329556a9391cc #4 (bti-aa) (230134 bytes uncompressed, 192999 bytes on disk)
            Partition size            Row count           Cell count      Tombstone count
p50                    124                    1                    1                    1
p75                    124                    1                    1                    1
p90                    149                    1                    1                    1
p95                    149                    1                    1                    1
p99                    149                    1                    1                    1
p999                   179                    2                    2                    1
min                     51                    0                    0                    0
max                    446                   10                   10                    1
count                 2169
time                  3626

The unit of measure for the partition size column is bytes.
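The human-readable summary shown above can be turned into a dictionary for further processing. This is a rough sketch that assumes the layout matches the sample output exactly (a label followed by either four integers or one integer); the function name `parse_summary` and the column keys are hypothetical, and the layout may differ between versions.

```python
from typing import Dict, Union

# Column names are assumptions derived from the header line of the sample output.
COLUMNS = ("partition_size", "row_count", "cell_count", "tombstone_count")
LABELS = {"p50", "p75", "p90", "p95", "p99", "p999", "min", "max", "count", "time"}


def parse_summary(text: str) -> Dict[str, Union[int, Dict[str, int]]]:
    """Parse one summary block of sstablepartitions text output."""
    stats: Dict[str, Union[int, Dict[str, int]]] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] not in LABELS:
            continue  # skip "Processing ..." lines, headers, and blanks
        try:
            values = [int(v) for v in fields[1:]]
        except ValueError:
            continue  # not a statistics row after all
        if len(values) == len(COLUMNS):
            # Percentile, min, and max rows carry one value per column.
            stats[fields[0]] = dict(zip(COLUMNS, values))
        elif len(values) == 1:
            # The count and time rows carry a single value.
            stats[fields[0]] = values[0]
    return stats
```

For the first SSTable in the example, `parse_summary(output)["max"]["partition_size"]` would give the size in bytes of the largest partition.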

Output only partitions with a cell count equal to or greater than 10

sstablepartitions -c 10 /var/lib/cassandra/data/stresscql/blogposts-7dd6dfc289b511e8a4a329556a9391cc/aa-4-bti-Data.db
Processing stresscql.blogposts-7dd6dfc289b511e8a4a329556a9391cc #4 (bti-aa) (230134 bytes uncompressed, 192999 bytes on disk)
Partition: 'Fwl
Cc	xD06iw_]Q|[t[KzCI&	$' (46776c0b4363097815114430361169775f7f5d511b3b08177c5b745b4b1306007a434926091a24) live, position: 208502, size: 434, rows: 10, cells: 10, tombstones: 0 (row:0, range:0, complex:0, cell:0, row-TTLd:0, cell-TTLd:0)
Summary of stresscql.blogposts-7dd6dfc289b511e8a4a329556a9391cc #4 (bti-aa):
File: /home/dimitarndimitrov/.ccm/c13529-master/node1/data0/stresscql/blogposts-7dd6dfc289b511e8a4a329556a9391cc/aa-4-bti-Data.db
1 partitions match
Keys: Fwl
Cc	xD06iw_]Q|[t[KzCI&	$
            Partition size            Row count           Cell count      Tombstone count
p50                    124                    1                    1                    1
p75                    124                    1                    1                    1
p90                    149                    1                    1                    1
p95                    149                    1                    1                    1
p99                    149                    1                    1                    1
p999                   179                    2                    2                    1
min                     51                    0                    0                    0
max                    446                   10                   10                    1
count                 2169
time                  4875

The unit of measure for the partition size column is bytes.

Produce machine-readable CSV output

sstablepartitions -c 10 -m /var/lib/cassandra/data/stresscql/blogposts-7dd6dfc289b511e8a4a329556a9391cc/aa-4-bti-Data.db
key,keyBinary,live,offset,size,rowCount,cellCount,tombstoneCount,rowTombstoneCount,rangeTombstoneCount,complexTombstoneCount,cellTombstoneCount,rowTtlExpired,cellTtlExpired,directory,keyspace,table,index,snapshot,backup,generation,format,version
"Fwl
Cc	xD06iw_]Q|[t[KzCI&	$",46776c0b4363097815114430361169775f7f5d511b3b08177c5b745b4b1306007a434926091a24,true,208502,434,10,10,0,0,0,0,0,0,0,/home/dimitarndimitrov/.ccm/c13529-master/node1/data0/stresscql/blogposts-7dd6dfc289b511e8a4a329556a9391cc/aa-4-bti-Data.db,stresscql,blogposts,,,,4,bti,aa
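Because partition keys can contain line breaks and other control characters, as in the sample above, the CSV output is safer to read with a real CSV parser than by splitting on lines or commas. The sketch below uses Python's standard `csv` module and the column names from the header row shown in the example; the function name `read_partitions` and the keys of the returned dictionaries are hypothetical.

```python
import csv
import io
from typing import Dict, List, Union


def read_partitions(csv_text: str) -> List[Dict[str, Union[str, int, bool]]]:
    """Read sstablepartitions --csv output into a list of dictionaries.

    Quoted keys that span several lines are handled by the csv module.
    """
    # newline="" leaves embedded line breaks inside quoted fields untouched.
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    partitions = []
    for row in reader:
        partitions.append(
            {
                "key": row["key"],
                "key_hex": row["keyBinary"],
                "live": row["live"] == "true",
                "size": int(row["size"]),  # bytes
                "rows": int(row["rowCount"]),
                "cells": int(row["cellCount"]),
                "tombstones": int(row["tombstoneCount"]),
                "table": f'{row["keyspace"]}.{row["table"]}',
            }
        )
    return partitions
```

Sorting the result by `size` in descending order gives a quick list of the largest partitions that matched the thresholds.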

© 2024 DataStax