Configuration properties
Reference information for DSE Search configuration properties.
dse.yaml
The location of the dse.yaml file depends on the type of installation:
Package installations | /etc/dse/dse.yaml
Tarball installations | installation_location/resources/dse/conf/dse.yaml
cassandra.yaml
The location of the cassandra.yaml file depends on the type of installation:
Package installations | /etc/dse/cassandra/cassandra.yaml
Tarball installations | installation_location/resources/cassandra/conf/cassandra.yaml
Data location in cassandra.yaml
- data_file_directories
- The directory where table data is stored on disk. The
database distributes data evenly across the
location, subject to the granularity of the
configured compaction strategy. If not set, the
directory is $DSE_HOME/data/data.
Tip: For production, DataStax recommends RAID 0 and SSDs.
Default: /var/lib/cassandra/data
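As an illustration, a cassandra.yaml fragment that sets the data directory explicitly might look like the following. This is a minimal sketch that uses the default path listed above; adjust the path for your own storage layout.

```yaml
# cassandra.yaml (sketch): store table data in the default package location
data_file_directories:
    - /var/lib/cassandra/data
```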
Scheduler settings in dse.yaml
- ttl_index_rebuild_options
- Section of options to control the schedulers in charge of querying for and removing expired records, and the execution of the checks.
- fix_rate_period
- Time interval, in seconds, between checks for expired data.
Default: 300
- initial_delay
- The number of seconds to delay the first TTL check to speed up start-up time.
Default: 20
- max_docs_per_batch
- The maximum number of documents to check and delete per batch by the TTL rebuild
thread. All documents determined to be expired are deleted from the index during each
check. To avoid memory pressure, their unique keys are retrieved and the deletes are
issued in batches.
Default: 4096
- thread_pool_size
- The maximum number of cores that can execute TTL cleanup concurrently. Set the
thread_pool_size to manage system resource consumption and prevent many search cores
from executing simultaneous TTL deletes.
Default: 1
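Taken together, the scheduler options above could be written in dse.yaml as follows. This is a sketch that uses the default values listed in this section; the values are a starting point, not a tuning recommendation.

```yaml
# dse.yaml (sketch): TTL index rebuild scheduler with the documented defaults
ttl_index_rebuild_options:
    fix_rate_period: 300        # seconds between checks for expired data
    initial_delay: 20           # seconds to wait before the first TTL check
    max_docs_per_batch: 4096    # documents checked and deleted per batch
    thread_pool_size: 1         # search cores that can run TTL cleanup concurrently
```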
Indexing settings in dse.yaml
- solr_resource_upload_limit_mb
- Option to disable or configure the maximum file size of the search index config or
schema. Resource files can be uploaded, but the search index config and schema are
stored internally in the database after upload.
- 0 - disable resource uploading
- upload size - The maximum upload size limit in megabytes (MB) for a DSE Search resource file (search index config or schema).
Default: 10
- flush_max_time_per_core
- The maximum time, in minutes, to wait for the flushing of asynchronous index updates
that occurs at DSE Search commit time or at flush time. Expert level knowledge is
required to change this value. Always set the value reasonably high to ensure flushing
completes successfully to fully sync DSE Search indexes with the database data. If the
configured value is exceeded, index updates are only partially committed and the
commit log is not truncated, which can undermine data durability.
Note: When a timeout occurs, it usually means this node is overloaded and cannot flush in a timely manner. Live indexing increases the time to flush asynchronous index updates.
Default: commented out (5)
- load_max_time_per_core
- The maximum time, in minutes, to wait for each DSE Search index to load on startup
or during create/reload operations. Change this advanced option only if exceptions
occur during search index loading. When not set, the default is 5 minutes.
Default: commented out (5)
- enable_index_disk_failure_policy
- Whether to apply the configured disk failure policy if IOExceptions occur during
index update operations.
- true - apply the configured Cassandra disk failure policy to index write failures
- false - do not apply the disk failure policy
Default: commented out (false)
- solr_data_dir
- The directory to store index data. For example:
solr_data_dir: /var/lib/cassandra/solr.data
By default, each DSE Search index is saved in solr_data_dir/keyspace_name.table_name, or as specified by the dse.solr.data.dir system property. See Managing the location of DSE Search data.
Default: commented out
- solr_field_cache_enabled
- The Apache Lucene® field cache is deprecated. Instead, for fields that are sorted,
faceted, or grouped by, set docValues="true" on the field in the search index schema.
Then reload the search index and reindex. When not set, the default is false.
Default: commented out (false)
- async_bootstrap_reindex
- For DSE Search, configure whether to asynchronously reindex bootstrapped data.
Default: false
- If enabled, the node joins the ring immediately after bootstrap and reindexing occurs asynchronously. The node does not wait for post-bootstrap reindexing, so it is not marked down while reindexing runs. Use the dsetool ring command to check the status of the reindexing.
- If disabled, the node joins the ring after reindexing the bootstrapped data.
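The indexing options above might appear in dse.yaml roughly as shown here. This is a sketch that assumes each option is a top-level key and uses the defaults listed in this section; options described as commented out are shown commented out. Verify the exact layout against the dse.yaml shipped with your installation.

```yaml
# dse.yaml (sketch): indexing settings with the documented defaults
solr_resource_upload_limit_mb: 10             # 0 disables resource uploading
# flush_max_time_per_core: 5                  # minutes; change only with expert knowledge
# load_max_time_per_core: 5                   # minutes
# enable_index_disk_failure_policy: false
# solr_data_dir: /var/lib/cassandra/solr.data
# solr_field_cache_enabled: false             # deprecated; prefer docValues="true" in the schema
async_bootstrap_reindex: false                # true: join the ring first, reindex asynchronously
```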
Safety thresholds
Configure safety thresholds and fault tolerance for DSE Search with options in dse.yaml and cassandra.yaml.
- Safety thresholds in cassandra.yaml
- Configuration options include:
- read_request_timeout_in_ms
- Default: 5000. How long, in milliseconds, the coordinator waits for read operations to complete before timing them out.
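For reference, the read timeout above corresponds to a single line in cassandra.yaml. The sketch below uses the documented default.

```yaml
# cassandra.yaml (sketch): coordinator read timeout, in milliseconds
read_request_timeout_in_ms: 5000
```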
- Security in dse.yaml
- Security options for DSE Search.
- solr_encryption_options
- Settings to tune encryption of search indexes.
- decryption_cache_offheap_allocation
- Whether to allocate shared DSE Search decryption cache off JVM heap.
- true - allocate shared DSE Search decryption cache off JVM heap
- false - do not allocate shared DSE Search decryption cache off JVM heap
Default: commented out (true)
- decryption_cache_size_in_mb
- The maximum size of shared DSE Search decryption cache in megabytes (MB).
Default: commented out (256)
- http_principal
- The http_principal is used by the Tomcat application container to run DSE Search. The Tomcat web server uses the GSSAPI mechanism (SPNEGO) to negotiate the GSSAPI security mechanism (Kerberos). Set REALM to the name of your Kerberos realm. In the Kerberos principal, REALM must be uppercase.
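The encryption cache options above could be set in dse.yaml as in the following sketch, which uses the defaults listed in this section. The http_principal setting is left out because the section in which it is nested is not clear from the text above.

```yaml
# dse.yaml (sketch): search index decryption cache with the documented defaults
solr_encryption_options:
    decryption_cache_offheap_allocation: true   # allocate the shared cache off the JVM heap
    decryption_cache_size_in_mb: 256            # maximum size of the shared cache, in MB
```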
- Inter-node communication in dse.yaml
- Inter-node communication between DSE Search nodes.
- shard_transport_options
- Fault tolerance option for inter-node communication between DSE Search nodes.
- netty_client_request_timeout
- Timeout behavior during distributed queries. The internal timeout for all search
queries to prevent long running queries. The client request timeout is the maximum
cumulative time (in milliseconds) that a distributed search request will wait idly
for shard responses.
Default: 60000 (1 minute)
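A sketch of the corresponding dse.yaml entry, using the documented default of one minute:

```yaml
# dse.yaml (sketch): inter-node timeout for distributed search queries
shard_transport_options:
    netty_client_request_timeout: 60000   # milliseconds a request waits idly for shard responses
```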
- Query options in dse.yaml
- Options for CQL Solr queries.
- cql_solr_query_paging
- driver - Respects driver paging settings. Uses Solr pagination (cursors) only when the driver uses pagination. Enabled automatically for DSE SearchAnalytics workloads.
- off - Paging is off. Ignores driver paging settings for CQL queries and uses normal Solr paging unless:
- The current workload is an analytics workload, including SearchAnalytics. SearchAnalytics nodes always use driver paging settings.
- The cqlsh query parameter paging is set to driver.
Even when cql_solr_query_paging: off, paging is dynamically enabled with the "paging":"driver" parameter in JSON queries.
Default: commented out (off)
- cql_solr_query_row_timeout
- The maximum time in milliseconds to wait for each row to be read from the database
during CQL Solr queries.
Default: commented out (10000, 10 seconds)
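The two query options above might be set in dse.yaml as follows. This is a sketch that assumes both are top-level keys and uses the defaults listed in this section; both are commented out by default, so uncomment them only when you need to override the built-in behavior.

```yaml
# dse.yaml (sketch): CQL Solr query options with the documented defaults
cql_solr_query_paging: off          # or "driver" to respect driver paging settings
cql_solr_query_row_timeout: 10000   # milliseconds to wait for each row read
```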
- Client connections in dse.yaml
- The default IP address that the HTTP and Solr Admin interface uses to access DSE Search.
- native_transport_address
- When left blank, uses the configured hostname of the
node. Unlike the listen_address, this value can be set to 0.0.0.0, but you must set the
native_transport_broadcast_address to a value other than 0.0.0.0.
Note: Set native_transport_address OR native_transport_interface, not both.
Default: localhost
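To illustrate the 0.0.0.0 case described above, a cassandra.yaml fragment might look like the following sketch. The broadcast address 10.0.0.12 is a hypothetical placeholder; substitute an address that clients can actually reach.

```yaml
# cassandra.yaml (sketch): listen for clients on all interfaces
native_transport_address: 0.0.0.0
# Required when native_transport_address is 0.0.0.0; must not itself be 0.0.0.0.
# 10.0.0.12 is a hypothetical example address.
native_transport_broadcast_address: 10.0.0.12
```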
- Performance in cassandra.yaml
- Decreasing the memtable space to make room for Solr caches might improve performance.
- memtable_heap_space_in_mb
- The amount of on-heap memory allocated for memtables.
The database uses the total of this amount and the
value of memtable_offheap_space_in_mb to set a
threshold for automatic memtable flush.
Default: calculated 1/4 of heap size (2048)
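As a sketch of the tuning idea above, the on-heap memtable allowance could be lowered in cassandra.yaml to leave more heap for Solr caches. The value 1024 is a hypothetical example, not a recommendation; an appropriate value depends on your heap size and workload.

```yaml
# cassandra.yaml (sketch): reduce on-heap memtable space
# 1024 is a hypothetical value chosen only to be lower than the 2048 shown above.
memtable_heap_space_in_mb: 1024
```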
- Performance in dse.yaml
- Node routing options.
- node_health_options
- Node health options are always enabled.
- refresh_rate_ms
- Default: 60000
- uptime_ramp_up_period_seconds
- The amount of continuous uptime required for the node's uptime score to advance the
node health score from 0 to 1 (full health),
assuming there are no recent dropped mutations. The health score is a composite score
based on dropped mutations and uptime.
Tip: If a node is repairing after a period of downtime, you might want to increase the uptime period to the expected repair time.
Default: commented out (10800, 3 hours)
- dropped_mutation_window_minutes
- The historic time window, in minutes, over which the rate of dropped mutations affects
the node health score.
Default: 30
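The node health options above could be written in dse.yaml as in this sketch, which uses the defaults listed in this section. The uptime ramp-up period is shown commented out, matching its documented default state.

```yaml
# dse.yaml (sketch): node health scoring with the documented defaults
node_health_options:
    refresh_rate_ms: 60000
    # uptime_ramp_up_period_seconds: 10800   # 3 hours of continuous uptime for full health
    dropped_mutation_window_minutes: 30      # window over which dropped mutations count
```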