cassandra.yaml configuration file
The cassandra.yaml file is the main configuration file for Hyper-Converged Database (HCD).
After changing properties in the cassandra.yaml file, restart the node for the changes to take effect.
Syntax
For the properties in each section, the parent setting has zero spaces. Each child entry requires at least two spaces. Adhere to the YAML syntax and retain the spacing.
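For example, here is a minimal sketch of the expected indentation, using a property that appears later in this file:
# parent setting at column zero
commitlog_compression:
  # child entries indented by at least two spaces
  - class_name: LZ4Compressor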
- Default values that are not defined are shown as Default: none.
- Internally defined default values are described.
Default values can be defined internally, commented out, or have implementation dependencies on other properties in the cassandra.yaml file. Additionally, some commented-out values may not match the actual default values; the commented-out values are recommended alternatives to the default values.
Organization
The configuration properties are grouped into the following sections:
- Quick start properties: The minimal properties needed for configuring a cluster.
- Default directories: If you have changed any of the default directories during installation, set these properties to the new locations. Make sure you have root access.
- Data directory configuration: Properties for configuring the location of a single or multiple (JBOD) data directories.
- Commonly used properties: Properties most frequently used when configuring HCD.
- Performance tuning: Tuning performance and system resource utilization, including commit log, compaction, memory, disk I/O, CPU, reads, and writes.
- Advanced properties: Properties for advanced users or properties that are less commonly used.
- Configure authentication, authorization, and role management.
- User-defined functions (UDF) properties: Configure how UDF code is executed inside Cassandra daemons.
- Configure memory, threads, and duration when pushing pages continuously to the client.
- Memory leak detection settings: Configure memory leak detection.
- Enable emulation mode for testing applications meant to run on Astra DB.
- Configure Cassandra system limits which ensure high availability and optimal performance of the database.
Quick start properties
The minimal properties needed for configuring a cluster.
cluster_name
-
The name of the cluster. This setting prevents nodes in one logical cluster from joining another so it is important to set it to a unique name other than the default. All nodes in a cluster must have the same value.
Default:
'Test Cluster'
rpc_address
-
The address that client applications connect to. This is typically set to a node’s public IP that is routable from the clients.
If not changed from the default localhost, only applications deployed on the server will be able to connect to the node.
Default: localhost
listen_address
-
The IP address or hostname that the database binds to, exclusively for private communication between nodes in the cluster. This is typically set to a node’s private IP that is routable from other nodes.
If not changed from the default localhost, the node will not be able to communicate with other nodes in the cluster.
Default: localhost
listen_interface
-
The interface that the database binds to for connecting to other nodes. Interfaces must correspond to a single address. IP aliasing is not supported.
Never set listen_address to 0.0.0.0. Set listen_address or listen_interface, not both.
listen_interface_prefer_ipv6
-
Use IPv4 or IPv6 when the interface is specified by name.
- false - Use the first IPv4 address.
- true - Use the first IPv6 address.
When only a single address is used, that address is selected without regard to this setting.
Default: false
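Putting the quick start properties together, a minimal cassandra.yaml fragment might look like the following; the cluster name and addresses are placeholders, not recommendations:
cluster_name: 'ProdCluster01'
listen_address: 10.0.0.1      # private IP, routable from the other nodes
rpc_address: 203.0.113.10     # public IP, routable from client applications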
Default directories
If you have changed any of the default directories during installation, set these properties to the new locations. Make sure you have root access.
data_file_directories
-
The directory where table data is stored on disk. The database distributes data evenly across the location, subject to the granularity of the configured compaction strategy.
For production, DataStax recommends RAID 0 and SSDs.
Default: - /var/lib/cassandra/data
commitlog_directory
-
The directory where the commit log is stored.
For optimal write performance, place the commit log on a separate disk partition, or ideally on a separate physical device, from the data directories. Because the commit log is append only, a hard disk drive (HDD) is acceptable as long as it is fast enough to keep up with the writes.
Default:
$CASSANDRA_HOME/data/commitlog
The commitlog_directory and the cdc_raw_directory must reside on the same partition. Keep these directories in separate sub-folders that are not nested.
cdc_raw_directory
-
The directory where the change data capture (CDC) commit log segments are stored on flush. DataStax recommends a physical device that is separate from the data directories. See Change Data Capture (CDC) logging.
Default:
$CASSANDRA_HOME/data/cdc_raw
The cdc_raw_directory and the commitlog_directory must reside on the same partition. Keep these directories in separate sub-folders that are not nested.
hints_directory
-
The directory where hints (missed writes) are stored.
Default:
$CASSANDRA_HOME/data/hints
metadata_directory
-
The directory that holds cluster metadata including information about the local node and its peers.
Default:
$CASSANDRA_HOME/data/metadata
saved_caches_directory
-
The directory location where table key and row caches are stored.
Default:
$CASSANDRA_HOME/data/saved_caches
Data directory configuration
Distributing data across multiple disks, also known as "JBOD configuration" (just a bunch of disks), maximizes throughput and ensures efficient disk I/O. Cassandra allows you to specify multiple directories for storing your data to achieve this.
To configure a single data directory in the cassandra.yaml file:
data_file_directories:
- /var/lib/cassandra/data
For multiple data directories:
data_file_directories:
- /disk1/datadir
- /disk2/datadir
- /disk3/datadir
Commonly used properties
Properties most frequently used when configuring HCD.
Before starting a node for the first time, DataStax recommends that you carefully evaluate your requirements.
Common initialization properties
Be sure to set the properties in the Quick start section as well.
commit_failure_policy
-
Determines how Cassandra handles commit log disk failures.
- die - Shut down the node and kill the JVM, so the node can be replaced.
- stop - Shut down the node, leaving it effectively dead but available for inspection using JMX.
- stop_commit - Shut down the commit log, letting writes collect but continuing to service reads.
- ignore - Ignore fatal errors and let the batches fail.
Default: stop (recommended)
disk_optimization_strategy
-
The strategy for optimizing disk reads. Reading from spinning disks is slow, so reads are buffered with an extra 4 KB page just in case. This buffering is unnecessary for SSDs, so only what is required is buffered.
- ssd - Data directory backed by solid state disks.
- spinning - Data directory backed by spinning disks.
Default: ssd
disk_failure_policy
-
Determines how Cassandra handles disk failures.
- die - Shut down gossip and client transports, and kill the JVM for any file system errors or single SSTable errors, so the node can be replaced.
- stop_paranoid - Shut down the node, even for single SSTable errors.
- stop - Shut down the node, leaving it effectively dead, but the JVM is still available for inspection using JMX.
- best_effort - Stop using the failed disk and respond to requests based on the remaining available SSTables. This setting allows obsolete data to be returned at consistency level ONE.
- ignore - Ignore fatal errors and let the requests fail; all file system errors are logged but otherwise ignored.
Recommended policies are stop and best_effort.
Default: stop
endpoint_snitch
-
Configure this property to set the snitch. The most common snitches are:
- SimpleSnitch (default) - Uses replication strategy order for proximity. This snitch does not recognize racks or data centers and considers all nodes as belonging to one ring (single DC), making it incompatible with multi-DC deployments and unsuitable for production environments. This snitch is appropriate for development environments only.
- GossipingPropertyFileSnitch (GPFS) - Uses rack and data center information for the local node defined in the cassandra-rackdc.properties file and propagates this information to other nodes via gossip. This snitch is recommended for production environments and is almost always the correct choice.
- PropertyFileSnitch (PFS) - Determines node proximity using the rack and data center location defined in the cassandra-topology.properties file. This snitch has been superseded by GPFS and is provided only for backwards compatibility.
For other snitches such as Ec2Snitch and GoogleCloudSnitch, see About snitches.
All nodes in a cluster must use the same snitch.
Replica placement (which defines where copies of data are stored) is determined using the information provided by the snitch. Changing the snitch has implications for where data is located, so it requires additional steps and should only be performed by experienced operators.
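For a production deployment using GPFS, set the snitch in cassandra.yaml and define the node's location in cassandra-rackdc.properties; the data center and rack names below are examples only:
endpoint_snitch: GossipingPropertyFileSnitch
# cassandra-rackdc.properties (set per node)
# dc=dc1
# rack=rack1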
seed_provider
-
The gossip seed provider and corresponding addresses of nodes that are designated as contact points in the cluster. A joining node contacts the nodes in the seeds list and establishes a connection to the first available node to discover the members of the cluster and topology.
- class_name - The class that handles the seed logic. The default is used in almost all clusters, although a custom seed provider can be substituted in limited edge cases.
Default: org.apache.cassandra.locator.SimpleSeedProvider
- seeds - A comma-delimited list of addresses and their corresponding storage_port. A new node joining the cluster uses the list to bootstrap the gossip process. If the cluster has multiple nodes, the default value must be changed to the IP address (and gossip port) of one of the nodes.
Default: "127.0.0.1:7000"
Making every node a seed node is not recommended because of increased maintenance and reduced gossip performance. Gossip optimization is not critical, but it is recommended to use a small seed list (approximately three nodes per data center).
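For example, a cluster with three designated seed nodes (placeholder addresses) would configure the provider as follows:
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.0.0.1:7000,10.0.0.2:7000,10.0.0.3:7000"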
Advanced initialization properties
allocate_tokens_for_keyspace
-
Triggers the algorithm that allocates num_tokens tokens so that token ranges are spread evenly across nodes, meaning data is distributed more evenly than with the legacy random allocation. Only supported on clusters using Murmur3Partitioner.
The replication strategy of the specified keyspace is used by the algorithm to optimize token allocation when new nodes join a cluster.
The property allocate_tokens_for_local_replication_factor is preferred over allocate_tokens_for_keyspace, particularly when adding nodes in a new data center where a keyspace is not yet replicated. If neither property is set, the node defaults to the legacy behaviour where tokens are allocated randomly.
allocate_tokens_for_local_replication_factor
-
Triggers the algorithm that allocates num_tokens tokens so that token ranges are spread evenly across nodes, meaning data is distributed more evenly than with the legacy random allocation. Only supported on clusters using Murmur3Partitioner.
Specify the replication factor in the local data center (3, for example) that the algorithm will use to optimize token allocation when new nodes join the cluster.
This property is preferred over allocate_tokens_for_keyspace because it does not require the replication of a keyspace to be defined, particularly when adding nodes in a new data center. If neither property is set, the node defaults to the legacy behaviour where tokens are allocated randomly.
auto_bootstrap
-
When joining a cluster for the first time, this property determines whether the node requests replicas to stream data to it (the default behaviour). If the node is defined as a seed, it immediately joins the cluster without data.
Non-seed nodes bootstrap automatically by default. Set to false when adding nodes in a new data center, where bootstrap is manually triggered by an operator with the nodetool rebuild command.
Default: true
broadcast_address
-
Set to the node’s public IP address in environments where nodes can communicate across networks only by using their public IP addresses, such as multi-region Amazon EC2 deployments. Otherwise, the node broadcasts on the same address as listen_address.
Set a separate listen_address and broadcast_address on a node with multiple network interfaces or where nodes are not able to communicate over private IP addresses. Not required in environments that support automatic switching between private and public communication.
Default: the value of listen_address
initial_token
-
Manually assigns the token(s) for the range(s) to be owned by the node.
Specify one token value for legacy single-token clusters. For clusters with virtual nodes enabled, specify multiple tokens as a comma-separated list.
When setting initial_token, the corresponding num_tokens must also be set.
Default: not set, in preference for num_tokens
listen_on_broadcast_address
-
Set to true on nodes with multiple interfaces to enable communication on both listen_address and broadcast_address.
Default: false
num_tokens
-
Defines the number of tokens to assign to the node.
Early versions of Cassandra used a default value of 256 tokens for clusters with virtual nodes enabled, so data is shared with more peers and there is less variance in data size among nodes in the same data center, but this leads to decreased availability in the event of node outages.
Lower token counts such as 4 or 8 provide higher availability but also higher variance in data size. A value of 16 tokens achieves a good distribution of data without compromising too much on availability.
Default: 1 token when not set
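For instance, a node in a new data center could combine a low token count with the allocation algorithm described above; the replication factor of 3 is an assumption about the local keyspaces:
num_tokens: 16
allocate_tokens_for_local_replication_factor: 3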
partitioner
-
The partitioner determines how data is distributed across the nodes in the cluster.
The default Murmur3Partitioner is the correct and only choice for new clusters. The legacy partitioners are provided for backward compatibility with existing clusters upgraded from older versions of Cassandra, because the partitioner can never be changed on a running cluster.
Default: org.apache.cassandra.dht.Murmur3Partitioner
Common compaction settings
compaction_throughput_mb_per_sec
-
The rate (in megabytes per second) at which SSTable candidates are compacted. The faster the database inserts data, the faster the system must compact in order to keep the number of SSTables down.
Set to 16 to 32 times the write throughput in MB/second, or set to 0 to disable compaction throttling. A high setting means that more of the disk I/O is used for compaction, leaving less I/O bandwidth for reads.
Default: 64
See Configure compaction.
Memtable settings
When a node receives a write request, the data is stored in a memory structure called a memtable and is also appended to the commit log on disk for durability (see How data is written). The memtable segments can be allocated either on- or off-heap.
memtable_allocation_type
-
Determines how HCD allocates memory to the memtable.
- heap_buffers - Memtables are allocated on the JVM heap. Suitable for general workloads where heap memory is sufficient.
- offheap_buffers - Uses Java NIO direct buffers to store cell names and values off-heap. This allocation type reduces heap utilization significantly, leading to reduced GC pressure.
- offheap_objects - Allocates memtables completely off-heap, directly in native memory. This allocation type is recommended, particularly for clusters that handle large datasets. Writes are around 5% faster, mostly because memtables flush less often.
- unslabbed_heap_buffers - Allocates memtables on the JVM heap without using a slab allocator. This can lead to increased heap fragmentation, so it is not recommended for any environment.
Default: offheap_objects
memtable_heap_space_in_mb
-
The maximum amount of memory to allocate for memtables on the JVM heap. When the threshold is reached, writes are blocked until a flush completes. A flush of the largest memtable is triggered based on memtable_cleanup_threshold.
Default: ¼ of heap
memtable_offheap_space_in_mb
-
The maximum amount of memory to allocate for memtables from native memory. When the threshold is reached, writes are blocked until a flush completes. A flush of the largest memtable is triggered based on memtable_cleanup_threshold.
Default: ¼ of heap
memtable_cleanup_threshold
-
The threshold that triggers a flush, based on the ratio of memtable size to the maximum memory size permitted for memtables.
Setting a value is deprecated because the default calculation is the only reasonable choice.
Default: 1 / (memtable_flush_writers + 1)
memtable_flush_writers
-
The total number of memtables that can be flushed concurrently, as well as the number of flush writer threads per disk.
A single thread is generally capable of keeping up with write ingest on a node with a single fast disk, unless it becomes IO-bound temporarily, so two flush writers are usually sufficient. If flushing is falling behind (the MemtablePool.BlockedOnAllocation metric is greater than 0), increase the number of flush writers.
Note that more writers can lead to more frequent flushes and smaller SSTables, which puts pressure on compaction.
Default: 2 for nodes with a single data directory, otherwise 1 per data directory
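As a sketch, a node handling large datasets might pin memtables off-heap and cap the native memory they may use; the 2048 MB figure is illustrative, not a recommendation:
memtable_allocation_type: offheap_objects
memtable_offheap_space_in_mb: 2048
memtable_flush_writers: 2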
Common automatic backup settings
Backups and snapshots are not automatically cleared, so disk usage can grow unbounded. By default, HCD shuts down when the disk gets full and it can no longer write files. DataStax recommends setting up a process to clear incremental backups each time a new snapshot is created.
auto_snapshot
-
When enabled (set to true), a snapshot is taken before DROP KEYSPACE, DROP TABLE, or TRUNCATE TABLE is executed.
DataStax strongly recommends keeping this enabled as a precaution in case the DROP or TRUNCATE is executed accidentally against the wrong keyspace or table.
Default: true
incremental_backups
-
When enabled (set to true), hard links are created to each SSTable that has been flushed or streamed, in the backups/ subdirectory of the keyspace data.
Default: false
snapshot_before_compaction
-
When enabled (set to true), a snapshot is taken before each compaction task. The snapshot can be used as a rollback position in an upgrade. Usage is limited because the general recommendation is to take backups before performing an upgrade.
Use with extreme caution, as disk usage can grow exponentially.
Default: false
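For example, to keep the recommended snapshot protection while enabling incremental backups for an external backup process:
auto_snapshot: true
incremental_backups: true
snapshot_before_compaction: false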
Performance tuning
Tuning performance and system resource utilization, including commit log, compaction, memory, disk I/O, CPU, reads, and writes.
Performance tuning properties include:
Commit log settings
commitlog_sync
-
Defines the mode by which the commit log is synchronized to disk; in other words, when the data is considered fully persisted to storage (will survive a system crash or power outage). The sync mode also determines when a successful write acknowledgement is sent to the coordinator.
- batch - Each write request triggers a call to sync immediately. The acknowledgement is blocked until after the commit log has been flushed to disk. Prioritizes durability over performance.
- group - Similar to batch mode, but waits up to commitlog_sync_group_window_in_ms between flushes so more writes are persisted together. The acknowledgement is also blocked until after the commit log has been flushed to disk. Recommended over batch mode.
- periodic - The commit log is synced every commitlog_sync_period_in_ms, but the write is acknowledged immediately. Prioritizes performance over durability.
Default: periodic
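For example, to trade a little write latency for stronger durability with the recommended group mode (the window shown mirrors the default described below):
commitlog_sync: group
commitlog_sync_group_window_in_ms: 1000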
commitlog_sync_period_in_ms
-
The time interval between commit log syncs to disk. Only set with periodic sync mode, otherwise an exception is logged.
Default: 10000 (10 seconds)
commitlog_sync_group_window_in_ms
-
The minimum interval between disk syncs. Only set with group sync mode, otherwise an exception is logged.
Default: 1000 (1 second)
commitlog_sync_batch_window_in_ms
-
Deprecated. The maximum delay between disk syncs. No longer used.
commitlog_segment_size_in_mb
-
The size of individual commit log file segments. A small size means more frequent flushes, leading to small SSTables that put pressure on compaction.
If using commit log archives for point-in-time recovery, reducing the size to 16 or 8 MB for finer granularity is reasonable, but be aware that the maximum mutation size depends on the segment size (see below).
Default: 32
max_mutation_size_in_kb
-
The maximum allowed size of a mutation (the payload size of a write request), which defaults to half of the commit log segment size (commitlog_segment_size_in_mb). If explicitly set, you must set commitlog_segment_size_in_mb to at least twice the value of max_mutation_size_in_kb.
Before increasing the commit log segment size, investigate why the mutations are larger than expected. Look for underlying issues with access patterns and the data model, because increasing the commit log segment size is a limited fix.
Default: ½ of commitlog_segment_size_in_mb
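As an illustration of the constraint described above, explicitly setting the mutation limit requires a segment at least twice as large:
commitlog_segment_size_in_mb: 64
max_mutation_size_in_kb: 32768   # no more than half the segment size (64 MB = 65536 KB)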
commitlog_total_space_in_mb
-
The maximum disk space for commit logs on disk.
If the limit is reached, the oldest commit log segments are flushed to reclaim disk space. A small size means more frequent flushes on less-active tables, leading to small SSTables that put pressure on compaction.
Default: the smaller of 8192 or ¼ of the commitlog_directory disk
commitlog_compression
-
By default, the commit log is not compressed. To enable compression, specify the compression library to use. The supported libraries are:
- DeflateCompressor - Legacy option that is the slowest compared to the newer algorithms, so it is not recommended.
- LZ4Compressor - The fastest algorithm, but offers a lower compression ratio. Choose when speed is preferred over space savings.
- SnappyCompressor - Not as fast as LZ4, but provides better compression.
- ZstdCompressor - Provides the best compression ratio, but is slower than the other algorithms.
For example:
commitlog_compression:
  - class_name: LZ4Compressor
Change Data Capture (CDC) settings
See also cdc_raw_directory.
cdc_enabled
-
Enables CDC functionality on a per-node basis when set to true.
Default: false
cdc_total_space_in_mb
-
The maximum disk space to use for CDC logs. If the limit is reached, HCD throws WriteTimeoutException on mutations that include CDC-enabled tables. A CDCCompactor (a consumer) is responsible for parsing the raw CDC logs and deleting them when parsing is completed.
Default: the smaller of 4096 MB or ⅛ of the cdc_raw_directory disk
cdc_free_space_check_interval_ms
-
The interval between disk space checks when the cdc_total_space_in_mb limit is reached.
Default: 250
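A node with CDC enabled might look like the following; the directory path assumes a package installation layout:
cdc_enabled: true
cdc_raw_directory: /var/lib/cassandra/cdc_raw
cdc_total_space_in_mb: 4096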
Compaction settings
See also Common compaction settings.
concurrent_compactors
-
The number of compaction threads allowed to run simultaneously. Simultaneous compactions help preserve read performance in a mixed read-write workload by limiting the number of small SSTables that accumulate during a single long-running compaction.
Generally, the calculated default value is appropriate and does not need adjusting. DataStax recommends contacting the DataStax Services team before changing this value. If your data directories are backed by SSDs, increase this value to the number of cores.
If compaction is running too slowly or too fast, adjust compaction_throughput_mb_per_sec first.
Increasing concurrent compactors leads to more use of available disk space for compaction, because concurrent compactions happen in parallel, especially for STCS. Ensure that adequate disk space is available before increasing this configuration.
Default: the smaller of the number of data disks or CPU cores, with a minimum of 2 and a maximum of 8
concurrent_validations
-
The number of repair validation threads allowed to run simultaneously.
Defaults to the value of concurrent_compactors if not configured or if set to <= 0. Setting validation threads to a value higher than concurrent compactors requires the system property -Dcassandra.allow_unlimited_concurrent_validations=true.
Default: the value of concurrent_compactors
concurrent_materialized_view_builders
-
The number of view builder tasks allowed to run simultaneously if materialized views are enabled (experimental).
When a view is created, the node ranges are split into [num_processors x 4] builder tasks. Set this property to 2 or higher to build views faster.
Default: 1
sstable_preemptive_open_interval_in_mb
-
The size of the SSTable candidates to trigger preemptive opening of compaction output.
The compaction process opens SSTables before they are completely written and uses them in place of the prior SSTables for any range previously written. Preemptive opening of SSTables helps to smoothly transfer reads between the SSTables by reducing cache churn and keeps hot rows hot.
A low value has a negative performance impact and will eventually cause heap pressure and GC activity. The optimal value depends on hardware and workload.
Default:
50
Cache and index settings
column_index_size_in_kb
-
Granularity of the index of rows within a partition. For huge rows, decrease this setting to improve seek time.
Default:
64
file_cache_size_in_mb
-
Maximum memory to use for caching SSTable chunks and buffer pools. Allocated from native memory in addition to heap.
Default: the smaller of 2048 or ¼ of heap
Streaming settings
These settings apply to operations that perform file streaming, including repairs, bootstraps, and decommissions. These operations are mostly sequential I/O, which can saturate a node’s network bandwidth and degrade client (application) performance, so it is important to throttle streaming throughput.
inter_dc_stream_throughput_outbound_megabits_per_sec
-
The maximum network bandwidth for file transfers (streaming) between data centers. Set to a value less than or equal to stream_throughput_outbound_megabits_per_sec.
Default: 200 Mbps (25 MB/s)
stream_entire_sstables
-
Enables the Zero Copy Streaming feature where eligible SSTables are streamed in their entirety between nodes instead of individual partitions, transferring data at a significantly faster rate.
This feature is bound to the streaming throughput limits and disabled when internode encryption is enabled.
Default:
true
stream_throughput_outbound_megabits_per_sec
-
Maximum network bandwidth permitted for all outbound file transfers (streaming) on a node.
Default: 200 Mbps (25 MB/s)
streaming_keep_alive_period_in_secs
-
The interval for sending keep-alive messages to prevent reset connections during streaming. The streaming session fails when a keep-alive message is not received for two keep-alive cycles, equivalent to 10 minutes by default (2 x 300 seconds).
Default: 300
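For example, to throttle cross-data-center streaming below the overall node limit, as advised above (values illustrative):
stream_throughput_outbound_megabits_per_sec: 200
inter_dc_stream_throughput_outbound_megabits_per_sec: 100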
Advanced properties
Less commonly-used settings normally reserved for experienced operators.
max_value_size_in_mb
-
The maximum size of any value in SSTables, up to a maximum of 2048 MB. If any value exceeds this threshold, the SSTables are marked as corrupted.
The default size is the same as the default native protocol frame limit, native_transport_max_frame_size_in_mb.
Default: 256
trickle_fsync
-
Enables flushing portions of SSTables written using sequential writers when trickle_fsync_interval_in_kb is reached. Minimizes sudden flushing of dirty buffers, which can impact read latencies.
Recommended for use with SSDs, which can handle more frequent calls to fsync(), but may be detrimental to slow HDDs.
Default: true
trickle_fsync_interval_in_kb
-
The threshold that triggers a flush when trickle_fsync is enabled.
Default: 10240 (10 MB)