Table attributes
The following attributes can be declared per table.
Option | Default value |
---|---|
bloom_filter_fp_chance | 0.01 or 0.1 (Value depends on the compaction strategy.) |
bucket_high | 1.5 |
bucket_low | 0.5 |
caching | keys_only |
column_metadata | N/A (container attribute) |
column_type | Standard |
comment | N/A |
compaction_strategy | SizeTieredCompactionStrategy |
compaction_strategy_options | N/A (container attribute) |
comparator | BytesType |
compare_subcolumns_with | BytesType* |
compression_options | sstable_compression='SnappyCompressor' |
default_validation_class | N/A |
dclocal_read_repair_chance | 0.0 |
gc_grace | 864000 seconds [10 days] |
key_validation_class | N/A |
max_compaction_threshold | 32 |
min_compaction_threshold | 4 |
memtable_flush_after_mins | N/A* |
memtable_operations_in_millions | N/A* |
memtable_throughput_in_mb | N/A* |
min_sstable_size | 50MB |
name | N/A |
populate_io_cache_on_flush | false |
read_repair_chance | 0.1 or 1 (See description below.) |
replicate_on_write | true |
sstable_size_in_mb | 160MB |
tombstone_compaction_interval | 86400 seconds [1 day] |
tombstone_threshold | 0.2 |
* Deprecated as of Cassandra 1.0, but can still be declared for backward compatibility.
- compaction_strategy_options
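- (Default: N/A - container attribute) Sets attributes related to the chosen compaction strategy, such as bucket_high, bucket_low, min_sstable_size, sstable_size_in_mb, tombstone_compaction_interval, and tombstone_threshold. Each is described in this section.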
- bloom_filter_fp_chance
- (Default: 0.01 for SizeTieredCompactionStrategy, 0.1 for LeveledCompactionStrategy) Desired false-positive probability for SSTable Bloom filters. When data is requested, the Bloom filter checks whether the requested row exists before any disk I/O is done. Valid values are 0 to 1.0. A setting of 0 enables the unmodified (effectively the largest possible) Bloom filter. A setting of 1.0 disables the Bloom filter. The higher the setting, the less memory Cassandra uses. The maximum recommended setting is 0.1, because anything above this value yields diminishing returns. For detailed information, see Tuning Bloom filters.
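To illustrate why a higher false-positive chance uses less memory, the sketch below evaluates the textbook sizing formula for an optimal Bloom filter at the two default values. It assumes the standard formula applies; Cassandra's actual sizing may differ, and the function name `bloom_bits_per_key` is hypothetical.

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Textbook optimal Bloom filter size in bits per key: -ln(p) / (ln 2)^2.

    Assumption: standard formula only, not Cassandra's exact sizing logic.
    """
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


# Compare the two defaults named in the description above.
for p in (0.01, 0.1):
    print(f"fp_chance={p}: about {bloom_bits_per_key(p):.1f} bits per key")
```

Under this formula, moving from 0.01 to 0.1 roughly halves the bits needed per key.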
- bucket_high
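- (Default: 1.5) Size-tiered compaction considers an SSTable to be in a bucket if its size falls within the range [average-size × bucket_low, average-size × bucket_high], where average-size is the average size of the SSTables in that bucket. With the default bucket_low and bucket_high values, an SSTable belongs to a bucket if its size diverges by 50% or less from the average.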
- bucket_low
- (Default: 0.5) See bucket_high for a description.
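The bucket range described under bucket_high can be sketched as a simple interval test. This is an illustration only: the function name is hypothetical, and treating both bounds as inclusive follows the bracket notation in the description rather than any confirmed implementation detail.

```python
def in_same_bucket(sstable_size: float, bucket_avg_size: float,
                   bucket_low: float = 0.5, bucket_high: float = 1.5) -> bool:
    """Return True if sstable_size lies in
    [bucket_avg_size * bucket_low, bucket_avg_size * bucket_high].

    Assumption: inclusive bounds, as the bracket notation suggests.
    """
    low = bucket_avg_size * bucket_low
    high = bucket_avg_size * bucket_high
    return low <= sstable_size <= high


# With the defaults, a 120 MB SSTable fits a bucket averaging 100 MB,
# but a 160 MB SSTable does not.
print(in_same_bucket(120, 100))
print(in_same_bucket(160, 100))
```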
- caching
- (Default: keys_only) Optimizes the use of cache memory without manual tuning. Set caching to one of the following values:
- all
- keys_only
- rows_only
- none
Cassandra weights the cached data by size and access frequency. Use this parameter to specify a key or row cache instead of a table cache, as in earlier versions.
- chunk_length_kb
- (Default: 64KB) On-disk SSTables are compressed by block to allow random reads. This subproperty of compression defines the size (in KB) of the block. Values larger than the default might improve the compression rate, but they increase the minimum amount of data that must be read from disk when a read occurs. The default value (64) is a good middle ground for compressing tables. Adjust the chunk size to account for read/write access patterns (how much data is typically requested at once) and the average size of rows in the table.
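The trade-off above can be made concrete with a small calculation of how much uncompressed data the chunks touched by a read cover. This is a rough sketch assuming chunks are aligned to multiples of chunk_length_kb in the uncompressed data; the function name is hypothetical.

```python
def bytes_covered_by_read(offset: int, length: int, chunk_length_kb: int = 64) -> int:
    """Uncompressed bytes covered by every chunk that a read of `length`
    bytes starting at `offset` touches.

    Assumption: chunks are aligned at multiples of the chunk length.
    """
    if length <= 0:
        return 0
    chunk = chunk_length_kb * 1024
    first_chunk = offset // chunk
    last_chunk = (offset + length - 1) // chunk
    return (last_chunk - first_chunk + 1) * chunk


# A 100-byte read still requires a whole chunk to be decompressed.
print(bytes_covered_by_read(0, 100))                      # 64 KB chunk
print(bytes_covered_by_read(0, 100, chunk_length_kb=4))   # 4 KB chunk
```

Smaller chunks reduce the data read for small rows, at the likely cost of a lower compression rate.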
- column_metadata
- (Default: N/A - container attribute) Column metadata defines these attributes of a column:
- name: Binds a validation_class and (optionally) an index to a column.
- validation_class: Type used to check the column value.
- index_name: Name of the index.
- index_type: Type of index. Currently the only supported value is KEYS.
Setting a value for the name option is required. The validation_class is set to the default_validation_class of the table if you do not set the validation_class option explicitly. The value of index_type must be set to create an index for a column. The value of index_name is not valid unless index_type is also set.
Setting and updating column metadata with the Cassandra CLI requires a slightly different command syntax than other attributes; note the brackets and curly braces in this example:
[default@demo] UPDATE COLUMN FAMILY users WITH comparator = UTF8Type AND column_metadata = [{column_name: full_name, validation_class: UTF8Type, index_type: KEYS}];
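The rules for column metadata stated above (a name is required, validation_class falls back to default_validation_class, and index_name is valid only with index_type) can be sketched as a small checker. This is not part of any Cassandra client; the function name and the dictionary layout are assumptions, with keys borrowed from the CLI example.

```python
def check_column_metadata(entry: dict, default_validation_class: str):
    """Return (errors, effective_validation_class) for one column_metadata entry.

    Hypothetical helper: keys mirror the CLI example (column_name,
    validation_class, index_type, index_name).
    """
    errors = []
    if not entry.get("column_name"):
        errors.append("column_name is required")
    index_type = entry.get("index_type")
    if index_type is not None and index_type != "KEYS":
        errors.append("index_type must be KEYS")
    if entry.get("index_name") and index_type is None:
        errors.append("index_name requires index_type")
    # Fall back to the table-level default when no validation_class is given.
    validation_class = entry.get("validation_class", default_validation_class)
    return errors, validation_class


entry = {"column_name": "full_name", "validation_class": "UTF8Type", "index_type": "KEYS"}
print(check_column_metadata(entry, "BytesType"))
```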
- column_type
- (Default: Standard) The standard type of table contains regular columns.
- comment
- (Default: N/A) A human-readable comment describing the table.
- compaction_strategy
- (Default: SizeTieredCompactionStrategy) Sets the compaction strategy for the table. The available strategies are:
- SizeTieredCompactionStrategy: The default compaction strategy and the only compaction strategy available in releases earlier than Cassandra 1.0. This strategy triggers a minor compaction whenever there are a number of similar-sized SSTables on disk (as configured by min_compaction_threshold). Using this strategy causes bursts in I/O activity while a compaction is in progress, followed by longer and longer lulls in compaction activity as SSTable files grow larger. These I/O bursts can negatively affect read-heavy workloads, but typically do not impact write performance. Watching disk capacity is also important when using this strategy, because a compaction can temporarily double the disk space used by a table's SSTables while it is in progress.
- LeveledCompactionStrategy: The leveled compaction strategy creates SSTables of a fixed, relatively small size (set by sstable_size_in_mb) that are grouped into levels. Within each level, SSTables are guaranteed to be non-overlapping. Each level (L0, L1, L2 and so on) is 10 times as large as the previous. Disk I/O is more uniform and predictable because SSTables are continuously being compacted into progressively larger levels. At each level, row keys are merged into non-overlapping SSTables. This can improve performance for reads, because Cassandra can determine which SSTables in each level to check for the existence of row key data. This compaction strategy is modeled after Google's LevelDB implementation. For more information, see the articles When to Use Leveled Compaction and Leveled Compaction in Apache Cassandra.
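As a rough illustration of the tenfold growth between levels in the leveled strategy, the sketch below lists level capacities starting from an assumed size for the first level. The starting size and the function name are hypothetical; only the factor of 10 comes from the description above.

```python
def level_capacities_mb(first_level_mb: float, levels: int, growth: int = 10) -> list:
    """Capacity of each successive level when every level is `growth`
    times as large as the previous one.

    Assumption: first_level_mb is an illustrative starting size.
    """
    return [first_level_mb * growth ** i for i in range(levels)]


# Starting from a hypothetical 100 MB first level, three levels deep.
print(level_capacities_mb(100, 3))
```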
- comparator
- (Default: BytesType) Defines the data types used to validate and sort column names. There are several built-in column comparators available. The comparator cannot be changed after you create a table.
- compare_subcolumns_with
- (Default: BytesType) Required when the column_type attribute is set to Super. Same as comparator but for the sub-columns of a super column. Deprecated as of Cassandra 1.0, but can still be declared for backward compatibility.
- compression_options
- (Default: N/A - container attribute) Sets the compression algorithm and subproperties for the table. The subproperties are:
- sstable_compression
- chunk_length_kb
- crc_check_chance
- crc_check_chance
- (Default: 1.0) When compression is enabled, each compressed block includes a checksum of that block for the purpose of detecting disk bitrot and avoiding the propagation of corruption to other replicas. This option defines the probability with which those checksums are checked during a read. By default they are always checked. Set to 0 to disable checksum checking, or to 0.5, for instance, to check them on about half of all reads.
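The probabilistic check described for crc_check_chance can be sketched as a single comparison against a random draw. This is a minimal illustration, not Cassandra's code; the function name and the injectable `rng` parameter are assumptions made so the behavior is easy to test.

```python
import random


def should_verify_checksum(crc_check_chance: float = 1.0, rng=random.random) -> bool:
    """Decide whether to verify a block checksum on this read.

    rng() returns a float in [0.0, 1.0), so a chance of 1.0 always
    verifies and a chance of 0 never does.
    """
    return rng() < crc_check_chance


# With 0.5, roughly half of the reads verify the checksum.
checked = sum(should_verify_checksum(0.5) for _ in range(10_000))
print(f"verified {checked} of 10000 reads")
```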
- default_validation_class
- (Default: N/A) Defines the data type used to validate column values. There are several built-in column validators available.
- dclocal_read_repair_chance
- (Default: 0.0) Specifies the probability of read repairs being invoked over all replicas in the current data center. Contrast with read_repair_chance.
- gc_grace
- (Default: 864000 seconds [10 days]) Specifies the time to wait before garbage collecting tombstones (deletion markers). The default value allows a great deal of time for consistency to be achieved prior to deletion. In many deployments this interval can be reduced, and in a single-node cluster it can be safely set to zero.
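The gc_grace wait can be pictured as a simple age check on a tombstone. This is a sketch only: the function name is hypothetical, timestamps are plain seconds, and the strict comparison at the boundary is an assumption.

```python
def tombstone_is_collectable(deleted_at: int, now: int, gc_grace: int = 864000) -> bool:
    """Return True once more than gc_grace seconds have passed since deletion.

    Assumption: timestamps are in seconds and the boundary is exclusive.
    """
    return now - deleted_at > gc_grace


day = 86400
print(tombstone_is_collectable(deleted_at=0, now=5 * day))    # still within the grace period
print(tombstone_is_collectable(deleted_at=0, now=11 * day))   # past the 10-day default
```

With gc_grace set to 0, as suggested for a single-node cluster, any tombstone older than the current instant qualifies immediately.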
- key_validation_class
- (Default: N/A) Defines the data type used to validate row key values. There are several built-in key validators available; however, CounterColumnType (distributed counters) cannot be used as a row key validator.
- max_compaction_threshold
- (Default: 32) In SizeTieredCompactionStrategy, sets the maximum number of SSTables processed by a minor compaction.
- min_compaction_threshold
- (Default: 4) In SizeTieredCompactionStrategy, sets the minimum number of SSTables to trigger a minor compaction.
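Taken together, the two thresholds bound how many SSTables a minor compaction handles. The sketch below shows one way to read them; the function name is hypothetical, and taking the first max_compaction_threshold entries of a bucket is an assumption, not a statement about which SSTables Cassandra actually prefers.

```python
def pick_for_minor_compaction(bucket: list, min_compaction_threshold: int = 4,
                              max_compaction_threshold: int = 32) -> list:
    """Return the SSTables to compact from one bucket, or [] if the bucket
    is too small to trigger a minor compaction.

    Assumption: when the bucket is larger than the maximum, the first
    max_compaction_threshold entries are taken.
    """
    if len(bucket) < min_compaction_threshold:
        return []
    return bucket[:max_compaction_threshold]


print(len(pick_for_minor_compaction(["sstable"] * 3)))    # below the minimum
print(len(pick_for_minor_compaction(["sstable"] * 40)))   # capped at the maximum
```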
- memtable_flush_after_mins
- Deprecated as of Cassandra 1.0, but can still be declared for backward compatibility. Use commitlog_total_space_in_mb.
- memtable_operations_in_millions
- Deprecated as of Cassandra 1.0, but can still be declared for backward compatibility. Use commitlog_total_space_in_mb.
- memtable_throughput_in_mb
- Deprecated as of Cassandra 1.0, but can still be declared for backward compatibility. Use commitlog_total_space_in_mb.
- min_sstable_size
- (Default: 50MB) The SizeTieredCompactionStrategy groups SSTables for compaction into buckets. The bucketing process groups SSTables that differ in size by less than 50%. For small SSTables, this bucketing is too fine-grained. If your SSTables are small, use min_sstable_size to define a size threshold (in bytes) below which all SSTables belong to one unique bucket.
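The effect of min_sstable_size on bucketing can be sketched as follows. This is a simplified model of size-tiered grouping written for illustration: the function name is hypothetical, and details such as bound inclusiveness and processing SSTables in ascending size order are assumptions.

```python
MB = 1024 * 1024


def bucket_sstables(sizes, bucket_low=0.5, bucket_high=1.5, min_sstable_size=50 * MB):
    """Group SSTable sizes (in bytes) into buckets.

    An SSTable joins a bucket if its size is within
    [avg * bucket_low, avg * bucket_high] of the bucket average, or if both
    the SSTable and the bucket average are below min_sstable_size.
    """
    buckets = []  # each bucket: {"avg": float, "sizes": [int, ...]}
    for size in sorted(sizes):
        for bucket in buckets:
            avg = bucket["avg"]
            similar = avg * bucket_low <= size <= avg * bucket_high
            both_small = size < min_sstable_size and avg < min_sstable_size
            if similar or both_small:
                bucket["sizes"].append(size)
                bucket["avg"] = sum(bucket["sizes"]) / len(bucket["sizes"])
                break
        else:
            # No existing bucket fits, so start a new one.
            buckets.append({"avg": float(size), "sizes": [size]})
    return [bucket["sizes"] for bucket in buckets]


sizes = [1 * MB, 10 * MB, 40 * MB, 100 * MB, 120 * MB, 400 * MB]
for group in bucket_sstables(sizes):
    print([s // MB for s in group])
```

In this model, the three SSTables under 50 MB land in a single bucket; with min_sstable_size set to 0, each of them would get a bucket of its own.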
- populate_io_cache_on_flush
- (Default: false) Adds newly flushed or compacted SSTables to the operating system page cache, potentially evicting other cached data to make room. Enable when all data in the table is expected to fit in memory. See also the global option, compaction_preheat_key_cache.
- name
- (Default: N/A) Required. The user-defined name of the table.
- read_repair_chance
- (Default: 0.1 or 1) Specifies the probability with which read repairs should be invoked on non-quorum reads. The value must be between 0 and 1. For tables created in versions of Cassandra before 1.0, it defaults to 1. For tables created in Cassandra 1.0 and higher, it defaults to 0.1. However, for Cassandra 1.0, the default is 1.0 if you use the CLI or any Thrift client, such as Hector or pycassa, and is 0.1 if you use CQL.
- replicate_on_write
- (Default: true) Applies only to counter tables. When set to true, replicates writes to all affected replicas regardless of the consistency level specified by the client for a write request. For counter tables, this should always be set to true.
- sstable_size_in_mb
- (Default: 160MB) The target size for SSTables that use the leveled compaction strategy. Although SSTable sizes should be less than or equal to sstable_size_in_mb, it is possible to have a larger SSTable during compaction. This occurs when data for a given partition key is exceptionally large. The data is not split into two SSTables.
- sstable_compression
- (Default: SnappyCompressor) The compression algorithm to use. Valid values are LZ4Compressor (available in Cassandra 1.2.2 and later), SnappyCompressor, and DeflateCompressor. Use an empty string ('') to disable compression. Choosing the right compressor depends on your requirements for space savings over read performance. LZ4 is fastest to decompress, followed by Snappy, then by Deflate. Compression effectiveness is inversely correlated with decompression speed. The extra compression from Deflate or Snappy is not enough to make up for the decreased performance for general-purpose workloads, but for archival data they may be worth considering. Developers can also implement custom compression classes using the org.apache.cassandra.io.compress.ICompressor interface. Specify the full class name as a "string constant".
- tombstone_compaction_interval
- (Default: 86400 seconds [1 day]) The minimum time to wait after an SSTable's creation time before considering the SSTable for tombstone compaction. Tombstone compaction is the compaction triggered if the SSTable's ratio of garbage-collectable tombstones exceeds tombstone_threshold. Note: Cassandra performs extra compactions when the ratio of tombstones in a data file exceeds tombstone_threshold. The data file is compacted by itself, and tombstones that are no longer needed are discarded. However, if data for the tombstone's partition exists in other data files, the tombstone cannot be discarded because it may be needed to indicate that data is deleted. The tombstone_compaction_interval represents how soon Cassandra allows retrying a tombstone compaction for a given data file. Therefore, low values may result in repeated ineffective compaction attempts until the tombstone's partition is merged with the other data files by a normal compaction event.
- tombstone_threshold
- (Default: 0.2) A ratio of garbage-collectable tombstones to all contained columns, which if exceeded by the SSTable triggers compaction (with no other SSTables) for the purpose of purging the tombstones.
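The interplay of tombstone_compaction_interval and tombstone_threshold can be sketched as a two-part eligibility test. This is an illustration under stated assumptions: the function name is hypothetical, times are plain seconds, and both comparisons are taken as strict.

```python
def eligible_for_tombstone_compaction(created_at: int, now: int, droppable_ratio: float,
                                      tombstone_threshold: float = 0.2,
                                      tombstone_compaction_interval: int = 86400) -> bool:
    """Return True if an SSTable is old enough and holds a large enough ratio
    of garbage-collectable tombstones to be compacted on its own.

    Assumption: droppable_ratio is the ratio of garbage-collectable
    tombstones to all contained columns, supplied by the caller.
    """
    old_enough = now - created_at > tombstone_compaction_interval
    return old_enough and droppable_ratio > tombstone_threshold


two_days = 2 * 86400
print(eligible_for_tombstone_compaction(0, two_days, droppable_ratio=0.3))
print(eligible_for_tombstone_compaction(0, 3600, droppable_ratio=0.3))   # too recent
```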