nodetool compact

Where is the cassandra-env.sh file?

The location of the cassandra-env.sh file depends on the type of installation:

Package and Installer-Services installations:

/etc/dse/cassandra/cassandra-env.sh

Tarball and Installer-No Services installations:

<installation_location>/resources/cassandra/conf/cassandra-env.sh

Forces a major compaction on one or more tables.

Synopsis

nodetool [options] compact [(-et <end_token> | --end-token <end_token>)]
[(-s | --split-output)] [(-st <start_token> | --start-token <start_token>)] [--] [<keyspace> [<tables>...]]
[--user-defined] <relative_path_to_SSTable file>...

Tarball and Installer No-Services path:

<installation_location>/resources/cassandra/bin

Common options

These options apply to all nodetool commands.

Short   Long              Description
-h      --host            Hostname or IP address.
-p      --port            Remote JMX agent port number.
-pw     --password        Password.
-pwf    --password-file   Password file path.
-u      --username        Remote JMX agent user name.
--                        Separates an option from an argument that could be mistaken for an option.

  • For tarball installations, execute the command from the <installation_location>/bin directory.

  • If a username and password for RMI authentication are set explicitly in the cassandra-env.sh file for the host, then you must specify credentials.
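As a minimal sketch of how the common options combine, the helper below runs compact against a remote node and reads the JMX password from a file. The function name compact_remote and every value in the usage comment (host, port, user name, file path, keyspace, table) are hypothetical placeholders; only the option flags come from the table above.

```shell
# Hypothetical wrapper: run a major compaction on a remote node.
# Arguments: host, JMX port, JMX user, password file, then optional keyspace/tables.
compact_remote() {
  if [ "$#" -lt 4 ]; then
    echo "usage: compact_remote <host> <port> <user> <password_file> [keyspace [tables...]]" >&2
    return 2
  fi
  host=$1 port=$2 user=$3 pwfile=$4
  shift 4
  # -h/-p select the node; -u/-pwf supply the JMX credentials.
  # "--" keeps keyspace or table names from being parsed as options.
  nodetool -h "$host" -p "$port" -u "$user" -pwf "$pwfile" compact -- "$@"
}

# Example (all values are placeholders):
#   compact_remote 10.0.0.5 7199 jmx_user /path/to/jmx.password my_keyspace my_table
```

Passing the password through -pwf rather than -pw keeps it out of the shell history and the process list, which is why the sketch takes a file path.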

Compact options

The following options are specific to the compact command:

Short      Long                   Description
-et token  --end-token token      Specify a token at which the compaction range ends. Requires start token (-st).
-st token  --start-token token    Specify a token at which the compaction range starts. Requires end token (-et).
-s         --split-output         Split output when using STCS into files that are 50%, 25%, 12.5%, and so on, of the total size.
keyspace [tables]                 Run compaction on an entire keyspace or specified tables; use a space to separate table names.
--user-defined sstable filenames  Run compaction on one or more SSTables. Specify the relative paths and file names.

Notes on -s (--split-output):

  • For STCS, omitting the -s option creates a single large SSTable.

  • For DTCS, using -s has no effect; a single file is still created.

  • nodetool compact -s accepts only a keyspace and table as arguments. It operates on the entire set of files in the table directory and cannot be run on individual SSTable files.
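The two helpers below are a sketch of how the options above might be wrapped so that their constraints are checked before nodetool is called: -st and -et must be given together, and -s takes a keyspace and table rather than SSTable files. The function names, and the keyspace, table, and token values in the usage comments, are hypothetical.

```shell
# Hypothetical helper: compact only the given token range of one table.
# -st and -et must be supplied together, so both are required here.
compact_token_range() {
  if [ "$#" -ne 4 ] || [ -z "$3" ] || [ -z "$4" ]; then
    echo "usage: compact_token_range <keyspace> <table> <start_token> <end_token>" >&2
    return 2
  fi
  nodetool compact -st "$3" -et "$4" "$1" "$2"
}

# Hypothetical helper: major compaction of a whole table with split output.
# -s works on entire tables only, so it takes a keyspace and a table name.
compact_split() {
  if [ "$#" -ne 2 ]; then
    echo "usage: compact_split <keyspace> <table>" >&2
    return 2
  fi
  nodetool compact -s "$1" "$2"
}

# Examples (placeholder names and tokens):
#   compact_token_range my_keyspace my_table 0 1000000
#   compact_split my_keyspace my_table
```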

Description

This command starts the compaction process on tables using SizeTieredCompactionStrategy (STCS), TimeWindowCompactionStrategy (TWCS), or LeveledCompactionStrategy (LCS):

  • If you do not specify a keyspace or table, a major compaction is run on all keyspaces and tables.

  • If you specify only a keyspace, a major compaction is run on all tables in that keyspace.

  • If you specify one or more tables, a major compaction is run on those tables.
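The three cases above map directly onto the number of arguments passed to the command. The wrapper below is a small sketch that reports the selected scope before running the compaction; the function name major_compact and the keyspace and table names in the usage comment are hypothetical.

```shell
# Hypothetical wrapper: report which scope the arguments select,
# then run the major compaction with the same arguments.
major_compact() {
  case "$#" in
    0) echo "scope: all keyspaces and tables" ;;
    1) echo "scope: all tables in keyspace $1" ;;
    *) ks=$1
       shift
       echo "scope: keyspace $ks, tables: $*"
       set -- "$ks" "$@"   # restore the original argument list
       ;;
  esac
  nodetool compact "$@"
}

# Examples (placeholder names):
#   major_compact                           # every keyspace and table
#   major_compact my_keyspace               # every table in my_keyspace
#   major_compact my_keyspace users events  # only the two named tables
```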

Major compactions may behave differently depending on which compaction strategy is used for the affected tables:

  • SizeTieredCompactionStrategy (STCS): The default compaction strategy. This strategy triggers a minor compaction when the number of similar-sized SSTables on disk reaches the value set by the min_threshold table subproperty. A minor compaction does not involve all the tables in a keyspace. Also see STCS compaction subproperties.

  • DateTieredCompactionStrategy (DTCS) (deprecated)

  • TimeWindowCompactionStrategy (TWCS): This strategy is an alternative for time series data. TWCS compacts SSTables using a series of time windows. Within a time window, TWCS compacts all SSTables flushed from memory into larger SSTables using STCS. At the end of the time window, all of these SSTables are compacted into a single SSTable. Then the next time window starts and the process repeats. The duration of the time window is the only setting required. See TWCS compaction subproperties. For more information about TWCS, see How is data maintained?.

  • LeveledCompactionStrategy (LCS): The leveled compaction strategy creates SSTables of a fixed, relatively small size (160 MB by default) that are grouped into levels. Within each level, SSTables are guaranteed to be non-overlapping. Each level (L0, L1, L2 and so on) is 10 times as large as the previous. Disk I/O is more uniform and predictable on higher than on lower levels as SSTables are continuously being compacted into progressively larger levels. At each level, row keys are merged into non-overlapping SSTables in the next level. This process can improve performance for reads, because the database can determine which SSTables in each level to check for the existence of row key data. This compaction strategy is modeled after Google’s LevelDB implementation. Also see LCS compaction subproperties.
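To make the tenfold growth between LCS levels concrete, the sketch below prints approximate level capacities from the SSTable size. It assumes that L1 holds about ten SSTables (10 × 160 MB with the default size), which the text above does not state; the function name lcs_level_sizes and its parameters are hypothetical.

```shell
# Print the approximate capacity of LCS levels L1..Ln.
# $1 = SSTable size in MB (default 160), $2 = number of levels to print (default 4).
lcs_level_sizes() {
  sstable_mb=${1:-160}
  levels=${2:-4}
  size_mb=$((sstable_mb * 10))   # assumption: L1 holds about ten SSTables
  level=1
  while [ "$level" -le "$levels" ]; do
    echo "L$level: $size_mb MB"
    size_mb=$((size_mb * 10))    # each level is 10 times as large as the previous
    level=$((level + 1))
  done
}

# lcs_level_sizes 160 3 prints:
#   L1: 1600 MB
#   L2: 16000 MB
#   L3: 160000 MB
```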

For more details, see How is data maintained? and Configuring compaction.

A major compaction incurs considerably more disk I/O than minor compactions.


© 2024 DataStax