Compaction sub-properties
Configure compaction by specifying the compaction strategy class,
followed by its sub-properties, in simple JSON format.
Use only compaction implementations bundled with CQL.
See How is data maintained? for more details.
compaction = {
'class' : '<compaction_strategy_name>',
'<property_name>' : <value> [, ...] }
where the <compaction_strategy_name> is SizeTieredCompactionStrategy, TimeWindowCompactionStrategy, or LeveledCompactionStrategy.
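As a minimal sketch of this syntax, the statement below applies a compaction strategy to an existing table. The keyspace and table name ks.events are hypothetical, and the option value is illustrative only. Every value is written as a quoted string here, which is assumed to be the most portable form for the option map.

```cql
-- Hypothetical table ks.events; the class is one of the strategies named above.
-- The min_threshold value is illustrative, not a recommendation.
ALTER TABLE ks.events
WITH compaction = {
  'class' : 'SizeTieredCompactionStrategy',
  'min_threshold' : '4'
};
```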
Common properties
The following properties apply to all compaction strategies.
compaction = {
'class' : '<compaction_strategy_name>',
'enabled' : (true | false),
'log_all' : (true | false),
'only_purge_repaired_tombstone' : (true | false),
'tombstone_threshold' : <ratio>,
'tombstone_compaction_interval' : <sec>,
'unchecked_tombstone_compaction' : (true | false),
'min_threshold' : <num_sstables>,
'max_threshold' : <num_sstables> }
Property | Description | Default |
---|---|---|
enabled | Enable background compaction. |  |
log_all | Activates advanced logging for the entire cluster. |  |
only_purge_repaired_tombstone | Enabling this property prevents data from resurrecting when repair is not run within gc_grace_seconds. |  |
tombstone_threshold | The ratio of garbage-collectable tombstones to all contained columns. If the ratio exceeds this limit, compaction starts on that SSTable alone to purge the tombstones. |  |
tombstone_compaction_interval | Number of seconds after an SSTable is created before compaction can run on it. An SSTable is eligible for compaction when it exceeds the tombstone_threshold. |  |
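unchecked_tombstone_compaction | Setting to true allows tombstone compaction to run without pre-checking which SSTables are eligible for the operation. |  |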
min_threshold | The minimum number of SSTables to trigger a minor compaction. Restriction: Not used in LeveledCompactionStrategy. |  |
max_threshold | The maximum number of SSTables to include in a single minor compaction. Restriction: Not used in LeveledCompactionStrategy. |  |
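The common properties can be combined in one statement, sketched below. The table ks.events is hypothetical and every value is illustrative rather than a recommendation. Because the compaction map is set as a whole, options omitted from the statement are expected to fall back to their defaults.

```cql
-- Hypothetical table ks.events; all values are illustrative only.
-- tombstone_threshold is a ratio; tombstone_compaction_interval is in seconds.
ALTER TABLE ks.events
WITH compaction = {
  'class' : 'SizeTieredCompactionStrategy',
  'enabled' : 'true',
  'tombstone_threshold' : '0.3',
  'tombstone_compaction_interval' : '43200',
  'unchecked_tombstone_compaction' : 'false',
  'min_threshold' : '4',
  'max_threshold' : '32'
};
```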
SizeTieredCompactionStrategy
The compaction class SizeTieredCompactionStrategy (STCS) is the default compaction strategy.
STCS triggers a minor compaction when a table accumulates min_threshold SSTables of similar size.
Minor compactions do not involve all the tables in a keyspace.
See SizeTieredCompactionStrategy (STCS).
The following properties only apply to SizeTieredCompactionStrategy:
compaction = {
'class' : 'SizeTieredCompactionStrategy',
'bucket_high' : <factor>,
'bucket_low' : <factor>,
'min_sstable_size' : <int> }
Property | Description | Default |
---|---|---|
bucket_high | Size-tiered compaction merges sets of SSTables that are approximately the same size. The database compares each SSTable size to the average of all SSTable sizes for this table on the node. It merges SSTables whose sizes in KB fall within [average-size * bucket_low] and [average-size * bucket_high]. This property sets the upper bound. |  |
bucket_low | Size-tiered compaction merges sets of SSTables that are approximately the same size. The database compares each SSTable size to the average of all SSTable sizes for this table on the node. It merges SSTables whose sizes in KB fall within [average-size * bucket_low] and [average-size * bucket_high]. This property sets the lower bound. |  |
min_sstable_size | STCS groups SSTables into buckets. The bucketing process groups SSTables that differ in size by less than 50%. This bucketing process is too fine-grained for small SSTables. If your SSTables are small, use this option to define a size threshold in MB below which all SSTables belong to one unique bucket. |  |
The |
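To make the bucket range concrete, here is a sketch under assumed numbers: with an average SSTable size of 100 MB, a bucket_low of 0.5, and a bucket_high of 1.5, SSTables between 50 MB and 150 MB land in the same bucket. The table ks.events and all values are illustrative assumptions.

```cql
-- Hypothetical table ks.events; values are illustrative only.
-- Bucket range = [average-size * bucket_low, average-size * bucket_high]
--              = [100 MB * 0.5, 100 MB * 1.5] = [50 MB, 150 MB]
-- SSTables smaller than min_sstable_size (in MB) all share one bucket.
ALTER TABLE ks.events
WITH compaction = {
  'class' : 'SizeTieredCompactionStrategy',
  'bucket_low' : '0.5',
  'bucket_high' : '1.5',
  'min_sstable_size' : '50'
};
```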
TimeWindowCompactionStrategy
The compaction class TimeWindowCompactionStrategy (TWCS) compacts SSTables using a series of time windows or buckets.
TWCS creates a new time window within each successive time period.
During the active time window, TWCS compacts all SSTables flushed from memory into larger SSTables using STCS.
At the end of the time period, all of these SSTables are compacted into a single SSTable.
Then the next time window starts and the process repeats.
See TimeWindowCompactionStrategy (TWCS).
All of the properties for STCS are also valid for TWCS.
The following properties apply only to TimeWindowCompactionStrategy:
compaction = {
'class' : 'TimeWindowCompactionStrategy',
'compaction_window_unit' : <days>,
'compaction_window_size' : <int>,
'split_during_flush' : (true | false) }
Property | Description | Default |
---|---|---|
compaction_window_unit | Time unit used to define the bucket size. The value is based on the Java TimeUnit. |  |
compaction_window_size | Units per bucket. |  |
split_during_flush | Prevents mixing older data from repairs and hints with newer data from the current time window. During a flush operation, determines whether data partitions are split based on the configured time window. |  |
During the flush operation, the data is split into a maximum of 12 windows. Each window holds the data in a separate SSTable. If the current time is <t0> and each window has a time duration of <w>, the data is split across the SSTables as follows:
- SSTable 0 contains data for the time period < <t0> - 10 * <w>
- SSTables 1 to 10 contain data for the 10 equal time periods from (<t0> - 10 * <w>) through to (<t0> - 1 * <w>)
- SSTable 11, the 12th table, contains data for the time period > <t0>
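A sketch of a TWCS configuration with one-day windows follows. The table ks.events is hypothetical and the values are illustrative. With <w> equal to one day, the split described above would place data older than <t0> minus 10 days in SSTable 0 and each of the following 10 one-day periods in SSTables 1 to 10.

```cql
-- Hypothetical table ks.events; values are illustrative only.
-- Window duration <w> = compaction_window_size * compaction_window_unit = 1 day.
ALTER TABLE ks.events
WITH compaction = {
  'class' : 'TimeWindowCompactionStrategy',
  'compaction_window_unit' : 'DAYS',
  'compaction_window_size' : '1',
  'split_during_flush' : 'true'
};
```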
LeveledCompactionStrategy
The compaction class LeveledCompactionStrategy (LCS) creates SSTables of a fixed, relatively small size (160 MB by default) that are grouped into levels.
Within each level, SSTables are guaranteed to be non-overlapping.
Each level (L0, L1, L2 and so on) is 10 times as large as the previous.
Disk I/O is more uniform and predictable on higher than on lower levels as SSTables are continuously being compacted into progressively larger levels.
At each level, row keys are merged into non-overlapping SSTables in the next level.
See LeveledCompactionStrategy (LCS).
For more guidance, see When to Use Leveled Compaction and Leveled Compaction blog.
The following properties only apply to LeveledCompactionStrategy:
compaction = {
'class' : 'LeveledCompactionStrategy',
'sstable_size_in_mb' : <int> }
Property | Description | Default |
---|---|---|
sstable_size_in_mb | The target size for SSTables that use the LeveledCompactionStrategy. Although SSTable sizes should be less than or equal to sstable_size_in_mb, compaction can produce a larger SSTable when the data for a given partition key is exceptionally large, because the DSE database does not split that data into two SSTables. | 160 |
The default value, 160 MB, may be inefficient and negatively impact database indexing and the queries that rely on indexes. For example, consider using higher values for sstable_size_in_mb in tables that use Storage-Attached Indexing (SAI) indexes. For related information, see Compaction strategies.
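As a sketch of raising the target size above the 160 MB default, the statement below uses a hypothetical table ks.events and an illustrative value of 320 MB. The right value depends on the workload and is not suggested by this page.

```cql
-- Hypothetical table ks.events; 320 is an illustrative value, not a recommendation.
ALTER TABLE ks.events
WITH compaction = {
  'class' : 'LeveledCompactionStrategy',
  'sstable_size_in_mb' : '320'
};
```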
DateTieredCompactionStrategy (deprecated)
Use TimeWindowCompactionStrategy instead.
Stores data written within a certain period of time in the same SSTable.
Property | Description | Default |
---|---|---|
base_time_seconds | The size of the first time window. |  |
max_sstable_age_days | DSE does not compact an SSTable if its most recent data is older than this property. Fractional days can be set. |  |
max_window_size_seconds | The maximum window size in seconds. |  |
timestamp_resolution | Units, <MICROSECONDS> or <MILLISECONDS>, to match the timestamp of inserted data. |  |