Configuration properties
Reference information for DSE Search configuration properties.
Data location in cassandra.yaml
- data_file_directories
-
The directory where table data is stored on disk. The database distributes data evenly across the location, subject to the granularity of the configured compaction strategy.
For production, DataStax recommends RAID 0 and SSDs.
Default: /var/lib/cassandra/data
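As a minimal sketch, the data location described above might appear in cassandra.yaml as follows. The value is the documented default; the list form reflects that this property accepts one or more directories.

```yaml
# cassandra.yaml (illustrative fragment, documented default shown)
data_file_directories:
    - /var/lib/cassandra/data
```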
Scheduler settings in dse.yaml
Configuration options to control the scheduling and execution of indexing checks.
- ttl_index_rebuild_options
-
Configures the schedulers in charge of querying for expired records, removing expired records, and the execution of the checks.
- fixed_rate_period
-
Time interval, in seconds, between checks for expired data.
Default: 300
- initial_delay
-
The number of seconds to delay the first TTL check to speed up start-up time.
Default: 20
- max_docs_per_batch
-
The maximum number of documents to check and delete per batch by the TTL rebuild thread. All expired documents are deleted from the index during each check. To avoid memory pressure, their unique keys are retrieved and then deletes are issued in batches.
Default: 4096
- thread_pool_size
-
The maximum number of search indexes (cores) that can execute TTL cleanup concurrently. Manages system resource consumption and prevents many search cores from executing simultaneous TTL deletes.
Default: 1
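The scheduler settings above could be written in dse.yaml roughly as shown here. This is a sketch using the documented defaults; the nesting under ttl_index_rebuild_options follows the grouping in this section.

```yaml
# dse.yaml (illustrative fragment, documented defaults shown)
ttl_index_rebuild_options:
    fixed_rate_period: 300      # seconds between checks for expired data
    initial_delay: 20           # seconds to delay the first TTL check
    max_docs_per_batch: 4096    # documents checked and deleted per batch
    thread_pool_size: 1         # search cores running TTL cleanup concurrently
```

Raising thread_pool_size lets more search cores run TTL deletes at once, at the cost of higher resource consumption.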
Indexing settings in dse.yaml
- solr_resource_upload_limit_mb
-
Configures the maximum file size of the search index config or schema. Resource files can be uploaded, but the search index config and schema are stored internally in the database after upload.
-
0 - Disable resource uploading.
-
upload size - The maximum upload size limit in megabytes (MB) for a DSE Search resource file (search index config or schema).
Default: 10
-
- flush_max_time_per_core
-
Only modify this setting if you are an experienced user and you fully understand the implications of changing it.
The maximum time, in minutes, to wait for the flushing of asynchronous index updates that occurs at DSE Search commit time or at flush time.
Always set the wait time high enough to ensure flushing completes successfully to fully sync DSE Search indexes with the database data. If the wait time is exceeded, index updates are only partially committed, and the commit log isn’t truncated. This can undermine data durability.
When a timeout occurs, this node is typically overloaded and cannot flush in a timely manner. Live indexing increases the time to flush asynchronous index updates.
Default: 5
- load_max_time_per_core
-
The maximum time, in minutes, to wait for each DSE Search index to load on startup or during create or reload operations. Change this advanced option only if exceptions occur during search index loading.
Default: 5
- enable_index_disk_failure_policy
-
Whether to apply the configured disk failure policy if IOExceptions occur during index update operations:
-
true: Apply the configured Apache Cassandra® disk failure policy to index write failures.
-
false (default): Do not apply the disk failure policy.
-
- solr_data_dir
-
The directory to store index data. See Set the location of search indexes. By default, each DSE Search index is saved in <solr_data_dir>/<keyspace_name>.<table_name> or as specified by the dse.solr.data.dir system property.
Default: A solr.data directory in the cassandra data directory, like /var/lib/cassandra/solr.data
- solr_field_cache_enabled
-
The Apache Lucene® field cache is deprecated. Instead, for fields that are sorted, faceted, or grouped by, set docValues="true" on the field in the search index schema. Then reload the search index and reindex.
Default: false
- async_bootstrap_reindex
-
For DSE Search, configure whether to asynchronously reindex bootstrapped data:
-
true: The node joins the ring immediately after bootstrap, and reindexing occurs asynchronously. The node does not wait for post-bootstrap reindexing, so it is not marked down. Use the dsetool ring command to check the status of the reindexing.
-
false (default): The node joins the ring after reindexing the bootstrapped data.
-
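For orientation, the indexing settings above might look like this in dse.yaml. This is a sketch at the documented defaults, not a recommended configuration. The solr_data_dir line is commented out so the default location applies; the path shown is the example default from the text.

```yaml
# dse.yaml (illustrative fragment, documented defaults shown)
solr_resource_upload_limit_mb: 10        # 0 disables resource uploading
flush_max_time_per_core: 5               # minutes; change only with care
load_max_time_per_core: 5                # minutes
enable_index_disk_failure_policy: false
# solr_data_dir: /var/lib/cassandra/solr.data
solr_field_cache_enabled: false          # deprecated; prefer docValues in the schema
async_bootstrap_reindex: false
```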
Safety thresholds
Configure safety thresholds and fault tolerance for DSE Search with options in dse.yaml and cassandra.yaml.
Safety thresholds in cassandra.yaml
Configuration options include:
- read_request_timeout_in_ms
-
How long, in milliseconds, the coordinator waits for a read operation to complete before timing it out.
Default: 5000 (5 seconds)
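A sketch of the read timeout as it might be set in cassandra.yaml, using the documented default:

```yaml
# cassandra.yaml (illustrative fragment, documented default shown)
read_request_timeout_in_ms: 5000    # 5 seconds
```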
Security in dse.yaml
Security options for DSE Search.
- solr_encryption_options
-
Tunes encryption of search indexes.
- decryption_cache_offheap_allocation
-
Allocates shared DSE Search decryption cache off JVM heap:
-
true (default): Allocate shared DSE Search decryption cache off JVM heap.
-
false: Do not allocate shared DSE Search decryption cache off JVM heap.
-
- decryption_cache_size_in_mb
-
The maximum size of the shared DSE Search decryption cache in megabytes (MB).
Default: 256
- http_principal
-
Used by the Tomcat application container to run DSE Search. The Tomcat web server uses the GSSAPI mechanism (SPNEGO) to negotiate the GSSAPI security mechanism (Kerberos). The default is HTTP/_HOST@REALM, where REALM is the name of your Kerberos realm. In the Kerberos principal, REALM must be uppercase.
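The solr_encryption_options settings described in this section could be expressed in dse.yaml along these lines. This is a sketch at the documented defaults; the nesting is inferred from the option grouping above.

```yaml
# dse.yaml (illustrative fragment, documented defaults shown)
solr_encryption_options:
    decryption_cache_offheap_allocation: true   # keep the shared cache off the JVM heap
    decryption_cache_size_in_mb: 256            # maximum shared cache size in MB
```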
Internode communication in dse.yaml
Internode communication between DSE Search nodes.
- shard_transport_options
-
Fault tolerance option for internode communication between DSE Search nodes.
- netty_client_request_timeout
-
Controls timeout behavior during distributed queries. This internal timeout applies to all search queries and prevents long-running queries. The client request timeout is the maximum cumulative time, in milliseconds, that a distributed search request waits idly for shard responses.
Default: 60000 (1 minute)
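As a sketch, the shard transport timeout might be set in dse.yaml as follows, shown at the documented default:

```yaml
# dse.yaml (illustrative fragment, documented default shown)
shard_transport_options:
    netty_client_request_timeout: 60000    # milliseconds (1 minute)
```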
Query options in dse.yaml
Options for CQL Solr queries.
- cql_solr_query_paging
-
-
driver: Respects driver paging settings. Uses Solr pagination (cursors) only when the driver uses pagination. This is enabled automatically for DSE SearchAnalytics workloads. Otherwise, the default is off.
-
off: Ignore driver paging settings for CQL queries and use normal Solr paging unless overridden. Either of the following conditions overrides cql_solr_query_paging: off:
-
The current workload is an analytics workload, including SearchAnalytics. SearchAnalytics nodes always use driver paging settings.
-
The cqlsh query parameter paging is set to driver. Even when cql_solr_query_paging: off, paging can be dynamically enabled with the "paging":"driver" parameter in JSON queries.
-
-
- cql_solr_query_row_timeout
-
The maximum time in milliseconds to wait for all rows to be read from the database during CQL Solr queries.
Default: 10000 (10 seconds)
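The query options above might appear in dse.yaml roughly as shown. This sketch uses the default values given in this section; switching cql_solr_query_paging to driver would make queries respect driver paging settings.

```yaml
# dse.yaml (illustrative fragment, documented defaults shown)
cql_solr_query_paging: off           # or: driver
cql_solr_query_row_timeout: 10000    # milliseconds (10 seconds)
```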
Client connections in dse.yaml
The default IP address that the HTTP and Solr Admin interfaces use to access DSE Search. See Changing Tomcat web server settings.
- native_transport_address
-
When empty or unset, uses the configured hostname of the node.
Unlike listen_address, the native_transport_address can be set to 0.0.0.0, but only if you set native_transport_broadcast_address to a value other than 0.0.0.0.
Set native_transport_address or native_transport_interface, but not both.
Default: localhost
Performance in cassandra.yaml
Decreasing the memtable space to make room for Solr caches might improve performance. See Changing the stack size and memtable space.
Performance in dse.yaml
Node routing options.
- node_health_options
-
Node health options are always enabled. Node health is a score-based representation of how healthy a node is to handle search queries. See Collecting node health and indexing status scores.
- refresh_rate_ms
-
How frequently, in milliseconds, statistics update.
Default: 60000
- uptime_ramp_up_period_seconds
-
The amount of continuous uptime required for the node’s uptime score to advance the node health score from 0 to 1 (full health), assuming there are no recent dropped mutations. The health score is a composite score based on dropped mutations and uptime.
If a node is repairing after a period of downtime, increase the uptime period to the expected repair time.
Default: 10800 (3 hours)
- dropped_mutation_window_minutes
-
The historic time window, in minutes, over which the rate of dropped mutations affects the node health score.
Default: 30
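To illustrate, the node health options above could be written in dse.yaml as follows. This is a sketch at the documented defaults; the nesting follows the grouping in this section.

```yaml
# dse.yaml (illustrative fragment, documented defaults shown)
node_health_options:
    refresh_rate_ms: 60000                  # statistics refresh interval
    uptime_ramp_up_period_seconds: 10800    # 3 hours of uptime to reach full health
    dropped_mutation_window_minutes: 30     # window for dropped-mutation scoring
```

For a node that is repairing after downtime, uptime_ramp_up_period_seconds would be raised to the expected repair time, as noted above.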