Configure compaction

As discussed in How is data maintained?, the compaction process merges keys, combines columns, evicts tombstones, consolidates SSTables, and creates a new index in the merged SSTable.

In the cassandra.yaml file, you configure global compaction parameters, including compaction_throughput_mb_per_sec.

The compaction_throughput_mb_per_sec parameter is designed for use with large partitions. The database throttles compaction to this rate across the entire node.

Hyper-Converged Database (HCD) provides a start-up option for testing compaction strategies without affecting the production workload.

To configure the compaction strategy property and CQL compaction subproperties, such as the maximum number of SSTables to compact and minimum SSTable size, use the CQL commands CREATE TABLE or ALTER TABLE.
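As a minimal sketch of setting the strategy and subproperties at table creation time, the statement below uses a hypothetical table named users_archive with hypothetical columns. The max_threshold and min_sstable_size subproperties are the Apache Cassandra names for the maximum number of SSTables to compact and the minimum SSTable size. The values shown, and the assumption that min_sstable_size is given in bytes, are illustrative and should be checked against the HCD reference before use.

    -- Hypothetical table; assumes a keyspace was already selected with USE.
    CREATE TABLE users_archive (
      id uuid PRIMARY KEY,
      name text
    ) WITH compaction = {
      'class' : 'SizeTieredCompactionStrategy',
      'max_threshold' : 32,            -- assumed value: most SSTables merged in one compaction
      'min_sstable_size' : 52428800    -- assumed value and unit: 50 MiB expressed in bytes
    };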

Procedure

  1. Update a table to set the compaction strategy using the ALTER TABLE statement.

    ALTER TABLE users WITH
      compaction = { 'class' : 'LeveledCompactionStrategy' };

  2. Change the compaction strategy property to SizeTieredCompactionStrategy and use the CQL min_threshold attribute to specify the minimum number of SSTables needed to trigger a compaction.

    ALTER TABLE users
      WITH compaction =
      { 'class' : 'SizeTieredCompactionStrategy', 'min_threshold' : 6 };
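One way to confirm that the change took effect is to read the table's compaction settings back from the schema tables. This sketch assumes HCD exposes the same system_schema.tables table as Apache Cassandra 3.0 and later, and my_keyspace is a hypothetical keyspace name standing in for the keyspace that contains users.

    -- Assumption: system_schema.tables is available, as in Apache Cassandra 3.0+.
    -- 'my_keyspace' is a placeholder for the keyspace that holds the users table.
    SELECT compaction
    FROM system_schema.tables
    WHERE keyspace_name = 'my_keyspace'
      AND table_name = 'users';

The returned map should include the class and any subproperties that were set, such as min_threshold.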

Results

To monitor the results of your configuration, use compaction metrics. For details, see Compaction metrics.

What’s next

HCD supports extended logging for compaction. You must configure this utility as part of the table configuration. The extended compaction logs are stored in a separate file. For details, see Enable extended compaction logging.
