Compression

Configure data compression on a per-table basis to optimize performance of read-dominated tasks.

The default compression for all data in DataStax Enterprise 4.0.4 is LZ4.

The default compression in Cassandra 2.0.5 changed from Snappy to LZ4. DataStax Enterprise 4.0 through 4.0.3 use LZ4 to compress data on real-time Cassandra and Solr nodes, while the Snappy compressor remains the default for data stored in CassandraFS. For example, if you put a text file on the CassandraFS using the Hadoop shell command, the file is compressed with Snappy. If you create a CQL table on an analytics/Hadoop node, the table is compressed with LZ4.

LZ4 is the fastest type of compression available in Cassandra, followed by Snappy, then by Deflate. Search nodes typically engage in read-dominated tasks, so maximizing the storage capacity of nodes, reducing the volume of data on disk, and limiting disk I/O can improve performance.

You can change compression options using ALTER TABLE. You can also tune the compression chunk size to suit the table's read/write access patterns and the average size of its rows.
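
As a rough illustration of changing these options, the following CQL sketch sets the compressor and chunk size on a table. The keyspace and table names (mykeyspace.mytable) and the 64 KB chunk size are placeholders chosen for the example, not recommended values. The option names follow the Cassandra 2.0 compression syntax.

```cql
-- Hypothetical keyspace and table names; substitute your own.
-- Switch the table to the Deflate compressor and set a 64 KB chunk size.
ALTER TABLE mykeyspace.mytable
  WITH compression = {
    'sstable_compression' : 'DeflateCompressor',
    'chunk_length_kb' : 64
  };

-- Return the same table to LZ4 compression.
ALTER TABLE mykeyspace.mytable
  WITH compression = { 'sstable_compression' : 'LZ4Compressor' };
```

A new setting generally applies to SSTables written after the change. Existing SSTables keep their previous compression until they are rewritten, for example by compaction.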