Configure compaction
The compaction process merges keys, combines columns, evicts tombstones, consolidates SSTables, and creates a new index in the merged SSTable.
In the cassandra.yaml file, you can configure the following global compaction parameters:
- compaction_throughput_mb_per_sec
  The compaction_throughput_mb_per_sec parameter is designed for use with large partitions. The database throttles compaction to this rate across the entire node.
Hyper-Converged Database (HCD) provides a start-up option to test compaction strategies without affecting your production workloads.
Use CREATE TABLE and ALTER TABLE to configure compaction
To configure the compaction strategy property and CQL compaction subproperties, such as the maximum number of SSTables to compact and minimum SSTable size, use the CQL commands CREATE TABLE or ALTER TABLE.
For example:
- Change a table’s compaction strategy using ALTER TABLE:
    ALTER TABLE users WITH compaction = { 'class' : 'LeveledCompactionStrategy' };
- Change the compaction strategy property to SizeTieredCompactionStrategy, and specify the minimum number of SSTables to trigger a compaction using the CQL min_threshold attribute:
    ALTER TABLE users WITH compaction = { 'class' : 'SizeTieredCompactionStrategy', 'min_threshold' : 6 };
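The same compaction map can be supplied when a table is created. The following is a minimal sketch rather than a recommended configuration: the keyspace demo_ks, the table name, and its columns are hypothetical, and the values shown for max_threshold and min_sstable_size are illustrative assumptions that should be tuned for your workload.

```cql
-- Hypothetical keyspace, table, and columns, used only for illustration.
-- min_threshold:    minimum number of SSTables that triggers a compaction
-- max_threshold:    maximum number of SSTables to compact in one pass
-- min_sstable_size: minimum SSTable size, in bytes (52428800 = 50 MB)
CREATE TABLE IF NOT EXISTS demo_ks.users_by_id (
  user_id uuid PRIMARY KEY,
  name text,
  email text
) WITH compaction = {
  'class' : 'SizeTieredCompactionStrategy',
  'min_threshold' : 6,
  'max_threshold' : 32,
  'min_sstable_size' : 52428800
};
```

Setting the subproperties at creation time avoids a later ALTER TABLE, which replaces the whole compaction map rather than merging individual keys into it.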
Compaction metrics
You can monitor the results of your configuration using compaction metrics.
The following attributes are exposed through CompactionManagerMBean:
| Attribute | Description |
|---|---|
| | Total number of bytes compacted since server start or restart. |
| | Number of completed compactions since server start or restart. |
| | Estimated number of compactions remaining to perform. |
| | Total number of compactions since server start or restart. |
For more information, see Monitor Hyper-Converged Database (HCD) clusters.
Extended logging
HCD supports extended logging for compaction. This utility must be configured as part of the table configuration. The extended compaction logs are stored in a separate file. For details, see Enable extended compaction logging.
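As a rough sketch of what a table-level setting might look like: in Apache Cassandra, extended compaction logging is switched on with the log_all compaction subproperty. The statement below assumes that HCD accepts the same subproperty and reuses the users table from the earlier examples. Confirm the exact option name in Enable extended compaction logging before relying on it.

```cql
-- Assumption: HCD honors the Apache Cassandra 'log_all' compaction subproperty.
-- The compaction map is replaced as a whole, so restate the strategy class.
ALTER TABLE users WITH compaction = {
  'class' : 'SizeTieredCompactionStrategy',
  'log_all' : 'true'
};
```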