Configuration properties

Where is the cassandra.yaml file?

The location of the cassandra.yaml file depends on the type of installation:

Installation Type	Location
Package installations + Installer-Services installations	`/etc/dse/cassandra/cassandra.yaml`
Tarball installations + Installer-No Services installations	`<installation_location>/resources/cassandra/conf/cassandra.yaml`

Where is the dse.yaml file?

The location of the dse.yaml file depends on the type of installation:

Installation Type	Location
Package installations + Installer-Services installations	`/etc/dse/dse.yaml`
Tarball installations + Installer-No Services installations	`<installation_location>/resources/dse/conf/dse.yaml`

Reference information for DSE Search configuration properties.

Data location in cassandra.yaml
Scheduler settings in dse.yaml
Indexing resources in dse.yaml
Indexing settings in dse.yaml
Safety thresholds in cassandra.yaml
Inter-node communication in dse.yaml
Query options in dse.yaml
Client connections in dse.yaml
Performance in cassandra.yaml
Performance in dse.yaml

Data location in cassandra.yaml

See Set the location of search indexes.

data_file_directories: The directory location where table data is stored (in SSTables). The database distributes data evenly across the configured locations, subject to the granularity of the configured compaction strategy. Default location: /var/lib/cassandra/data.

For production, DataStax recommends RAID 0 and SSDs.
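
As a rough sketch, the data_file_directories setting described above could appear in cassandra.yaml as follows. The path is the default stated above; adjust it to your own storage layout.

```yaml
# cassandra.yaml (sketch; the path is the documented default)
data_file_directories:
    - /var/lib/cassandra/data
```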

Scheduler settings in `dse.yaml`

Configuration options to control the scheduling and execution of indexing checks.

ttl_index_rebuild_options: To ensure that records with TTLs are purged from search indexes when they expire, the search indexes are periodically checked for expired documents. The ttl_index_rebuild_options settings control the schedulers that query for and remove expired records, and the execution of those checks.
fixed_rate_period: How often, in seconds, to check for expired data. Default: 300
initial_delay: How long, in seconds, to delay the first TTL check in order to speed up startup. Default: 20
max_docs_per_batch: The maximum number of documents that the TTL rebuild thread checks and deletes per batch. Default: 4096
thread_pool_size: The maximum number of search cores that can execute TTL cleanup concurrently. Limiting this value manages system resource consumption and prevents many search cores from executing simultaneous TTL deletes. Default: 1
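
The scheduler options above nest under ttl_index_rebuild_options. A minimal sketch of the block in dse.yaml, using only the defaults listed above, might look like this:

```yaml
# dse.yaml (sketch; all values are the documented defaults)
ttl_index_rebuild_options:
    fixed_rate_period: 300     # seconds between checks for expired data
    initial_delay: 20          # seconds to wait before the first TTL check
    max_docs_per_batch: 4096   # documents checked and deleted per batch
    thread_pool_size: 1        # cores that can run TTL cleanup concurrently
```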

Indexing resources in `dse.yaml`

solr_resource_upload_limit_mb: Sets the maximum size, in megabytes (MB), of a resource file that can be uploaded to DSE Search. Set to 0 to disable resource uploading. Default: 10

Indexing settings in `dse.yaml`

max_solr_concurrency_per_core

Configures the maximum number of concurrent asynchronous indexing threads per DSE Search index. Default: number_of_available_CPU_cores.

If set to 1, DSE Search reverts to using synchronous indexing behavior, where data is synchronously written to the database in a single thread and indexed for DSE Search.

To achieve optimal performance, set this value to the number of available CPU cores divided by the number of search cores. For example, with 16 CPU cores and 4 search cores, the suggested value is 4. Also see Tuning search for maximum indexing throughput.

To prevent writes from overwhelming reads, reduce this value and adjust parallelDeleteTasks in the search index config.

Dynamically switching the search concurrency level to 1 is not allowed.

enable_back_pressure_adaptive_nrt_commit

Allows the back pressure system to adapt the maximum auto soft commit time (defined in each search index config) to the actual load. This setting is respected only for NRT (near real time) cores. When DSE Search cores use real-time (RT) live indexing, adaptive commits are disabled regardless of this property value. See live indexing with RT.

Default: true

back_pressure_threshold_per_core

The total number of queued asynchronous indexing requests per search core. When this number is exceeded, back pressure prevents excessive resource consumption by throttling new incoming requests. DataStax recommends using a back_pressure_threshold_per_core value of 1000 * max_solr_concurrency_per_core.

Default: 2000
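
To make the two sizing recommendations above concrete, here is a sketch for the example node with 16 CPU cores and 4 search cores. The node size is only the illustration used in the text, not a recommendation for any particular cluster.

```yaml
# dse.yaml (sketch for a node with 16 CPU cores and 4 search cores)
max_solr_concurrency_per_core: 4         # 16 CPU cores / 4 search cores
back_pressure_threshold_per_core: 4000   # 1000 * max_solr_concurrency_per_core
```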

flush_max_time_per_core

The maximum time, in minutes, to wait for the flushing of asynchronous index updates, which occurs at DSE Search commit time or at flush time. Expert-level knowledge is required to change this value. Always set the value reasonably high so that flushing completes successfully and DSE Search indexes are fully synchronized with the database data. If the configured value is exceeded, index updates are only partially committed and, to ensure data durability, the commit log is not truncated.

When a timeout occurs, it usually means this node is being overloaded and cannot flush in a timely manner. Live indexing increases the time to flush asynchronous index updates.

Default: 5

load_max_time_per_core

The maximum time, in minutes, to wait for each DSE Search index to load on startup or during create/reload operations. Change this advanced option only if exceptions occur during core loading.

Default: 5 (if not specified)

enable_index_disk_failure_policy

DSE Search activates the configured disk failure policy if IOExceptions occur during index update operations.

Default: false

solr_data_dir

The directory to store index data. For example:

solr_data_dir: /var/lib/cassandra/solr.data

See Managing the location of DSE Search data. By default, each DSE Search index is saved in solr_data_dir/keyspace_name.table_name, or as specified by the dse.solr.data.dir system property.

Default: commented out

solr_field_cache_enabled

The Apache Lucene® field cache is deprecated. Instead, for fields that are sorted, faceted, or grouped by, set docValues="true" on the field in the schema.xml file. Then reload the core and reindex. The default value is false. To override false, set useFieldCache=true in the request.

async_bootstrap_reindex

For DSE Search, configure whether to asynchronously reindex bootstrapped data. Default: false

If enabled, the node joins the ring immediately after bootstrap and reindexing occurs asynchronously. The node does not wait for post-bootstrap reindexing, so it is not marked down. Use the dsetool ring command to check the status of the reindexing.
If disabled, the node joins the ring after reindexing the bootstrapped data.

Safety thresholds

Configure safety thresholds and fault tolerance for DSE Search with options in dse.yaml and cassandra.yaml.

Safety thresholds in cassandra.yaml

read_request_timeout_in_ms

The number of milliseconds that the coordinator waits for a read operation to complete before timing it out.

Default: 5000

Security in dse.yaml

Security options for DSE Search. See DSE Search security checklist.

solr_encryption_options

Settings to tune encryption of search indexes.

decryption_cache_offheap_allocation: Whether to allocate the shared DSE Search decryption cache off the JVM heap. Default: true
decryption_cache_size_in_mb: The maximum size of the shared DSE Search decryption cache, in megabytes (MB). Default: 256
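
The two decryption cache settings nest under solr_encryption_options. A minimal sketch in dse.yaml, using only the defaults listed above:

```yaml
# dse.yaml (sketch; both values are the documented defaults)
solr_encryption_options:
    decryption_cache_offheap_allocation: true   # keep the cache off the JVM heap
    decryption_cache_size_in_mb: 256            # maximum shared cache size in MB
```
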
http_principal: The principal that the Tomcat application container uses to run DSE Search. The Tomcat web server uses the GSS-API mechanism (SPNEGO) to negotiate the GSSAPI security mechanism (Kerberos). Set REALM to the name of your Kerberos realm. In the Kerberos principal, REALM must be uppercase.

Inter-node communication in dse.yaml

shard_transport_options

Options for inter-node communication between DSE Search nodes.

netty_client_request_timeout: The maximum cumulative time, in milliseconds, that a distributed search request waits idly for shard responses. Defines timeout behavior during distributed queries. Default: 60000
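
A minimal sketch of the shard transport block in dse.yaml, using the default timeout listed above:

```yaml
# dse.yaml (sketch; the value is the documented default)
shard_transport_options:
    netty_client_request_timeout: 60000   # milliseconds to wait idly for shard responses
```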

Query options in dse.yaml

Options for CQL Solr queries.

cql_solr_query_paging

Options to specify the paging behavior.

off - Default. Paging is off. Driver paging settings are ignored for CQL Solr queries and normal Solr paging is used unless:
- The current workload is an analytics workload, including SearchAnalytics. SearchAnalytics nodes always use driver paging settings.
- The cqlsh query parameter paging is set to driver.

Even when cql_solr_query_paging: off, paging is dynamically enabled with the "paging":"driver" parameter in JSON queries.
driver - Respects driver paging settings. Solr pagination (cursors) is used only when the driver uses pagination. Enabled automatically for DSE SearchAnalytics workloads.

cql_solr_query_row_timeout

The maximum time in milliseconds to wait for each row to be read from the database during CQL Solr queries. Default: 10000 (10 seconds).
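
As an illustration of the two query options above, the following sketch switches paging to the non-default driver mode and keeps the default row timeout. Whether driver paging suits a given workload is a judgment call that this fragment does not settle.

```yaml
# dse.yaml (sketch; driver is the non-default paging mode described above)
cql_solr_query_paging: driver
cql_solr_query_row_timeout: 10000   # milliseconds per row (documented default)
```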

Client connections in dse.yaml

The default IP address that the HTTP and Solr Admin interface uses to access DSE Search. See Changing Tomcat web server settings.

rpc_address

The listen address for client connections (Thrift RPC service and native transport). Default: localhost

Valid values:

unset: Resolves the address using the hostname configuration of the node. If left unset, the hostname resolves to the IP address of this node using /etc/hostname, /etc/hosts, or DNS.
0.0.0.0: Listens on all configured interfaces. You must set the broadcast_rpc_address to a value other than 0.0.0.0.
IP address
hostname

Related information: Network
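
A minimal sketch of the 0.0.0.0 case described above. The broadcast address 10.0.0.12 is a hypothetical placeholder, not a value from this page; substitute an address that clients can actually reach.

```yaml
# Sketch only: listen on all interfaces and advertise one reachable address
rpc_address: 0.0.0.0
broadcast_rpc_address: 10.0.0.12   # hypothetical; must not be 0.0.0.0
```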

Performance in cassandra.yaml

Decreasing the memtable space to make room for Solr caches might improve performance. See Changing the stack size and memtable space.

concurrent_writes

Writes in DSE are rarely I/O bound, so the ideal number of concurrent writes depends on the number of CPU cores on the node. The recommended value is 8 x number_of_cpu_cores.

Default: 32

memtable_heap_space_in_mb

The amount of on-heap memory allocated for memtables. The database uses the total of this amount and the value of memtable_offheap_space_in_mb to set a threshold for automatic memtable flush. For details, see memtable_cleanup_threshold.

Default: 1/4 of heap size

Related information: Tuning the Java heap
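
To illustrate the sizing guidance above, here is a sketch for a hypothetical node with 8 CPU cores and an 8192 MB heap. Both figures are assumptions chosen only to make the arithmetic visible.

```yaml
# cassandra.yaml (sketch for a hypothetical 8-core node with an 8192 MB heap)
concurrent_writes: 64             # 8 x 8 CPU cores
memtable_heap_space_in_mb: 2048   # equals the default of 1/4 of the heap; lower it to leave room for Solr caches
```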

Performance in dse.yaml

Node routing options.

node_health_options

Node health options are always enabled for all nodes. Node health is a score-based representation of how fit a node is to handle search queries.

refresh_rate_ms: Default: 60000
uptime_ramp_up_period_seconds: The amount of continuous uptime required for the node's uptime score to advance the node health score from 0 to 1 (full health), assuming there are no recent dropped mutations. The health score is a composite score based on dropped mutations and uptime. If a node is repairing after a period of downtime, you might want to increase the uptime period to the expected repair time. Default: 10800 (3 hours)
dropped_mutation_window_minutes: The historic time window over which the rate of dropped mutations affects the node health score. Default: 30
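
The node health settings nest under node_health_options. A minimal sketch in dse.yaml, using only the defaults listed above:

```yaml
# dse.yaml (sketch; all values are the documented defaults)
node_health_options:
    refresh_rate_ms: 60000
    uptime_ramp_up_period_seconds: 10800   # 3 hours of uptime to reach full health
    dropped_mutation_window_minutes: 30    # window for the dropped mutation rate
```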
