DataStax Enterprise configuration file (dse.yaml)
The dse.yaml file is the configuration file for Kerberos authentication, purging of expired data from the Solr indexes, Solr inter-node communication, disk health check intervals, and the Performance Service.
- Installer-Services and Package installations: /etc/dse/dse.yaml
- Installer-No Services and Tarball installations: install_location/resources/dse/conf/dse.yaml
For cassandra.yaml configuration, see Node and cluster configuration (cassandra.yaml).
Snitch settings
The delegated_snitch property specifies the snitch to which DataStax Enterprise delegates. By default, it is set to the DseSimpleSnitch.
- delegated_snitch
Default: com.datastax.bdp.snitch.DseSimpleSnitch - Sets which snitch is used.
- DseSimpleSnitch
The DseSimpleSnitch places Cassandra, Hadoop, and Solr nodes into separate data centers. See DseSimpleSnitch.
For more information, see Snitches in the Cassandra documentation.
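As a minimal sketch, the corresponding entry in dse.yaml would look like the following, using the default value listed above:

```yaml
# Delegate to the default snitch (value taken from the default listed above)
delegated_snitch: com.datastax.bdp.snitch.DseSimpleSnitch
```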
Kerberos support
The kerberos_options set the QOP (Quality of Protection) and encryption options.
kerberos_options:
    keytab: path_to_keytab/dse.keytab
    service_principal: dse_user/_HOST@REALM
    http_principal: HTTP/_HOST@REALM
    qop: auth
- keytab: resources/dse/conf/dse.keytab
The keytab file must contain the credentials for both of the fully resolved principal names, which replace _HOST with the FQDN of the host in the service_principal and http_principal settings. The UNIX user running DSE must also have read permissions on the keytab.
- service_principal: dse_user/_HOST@REALM
The service_principal that the Cassandra and Hadoop processes run under must use the form dse_user/_HOST@REALM, where dse_user is:
- Installer-Services and Package installations: cassandra
- Installer-No Services and Tarball installations: the name of the UNIX user that starts the service
- http_principal: HTTP/_HOST@REALM
The http_principal is used by the Tomcat application container to run DSE Search/Solr. The web server uses the GSS-API mechanism (SPNEGO) to negotiate the GSSAPI security mechanism (Kerberos). Set REALM to the name of your Kerberos realm. In the Kerberos principal, REALM must be uppercase.
- qop: auth
A comma-delimited list of Quality of Protection values that clients and servers can use for each connection. The valid values are:
- auth - Default: Authentication only.
- auth-int - Authentication plus integrity protection for all transmitted data.
- auth-conf - Authentication plus integrity protection and encryption of all transmitted data.
Encryption using auth-conf is separate and completely independent of whether encryption is done using SSL. If both auth-conf and SSL are enabled, the transmitted data is encrypted twice. DataStax recommends choosing only one method and using it for both encryption and authentication.
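To make the _HOST substitution concrete, here is an illustrative sketch. The host name node1.example.com, the realm EXAMPLE.COM, and the keytab path are hypothetical values chosen for the example, and the cassandra user assumes an Installer-Services or Package installation.

```yaml
# Hypothetical values: realm EXAMPLE.COM, keytab under /etc/dse
kerberos_options:
    keytab: /etc/dse/dse.keytab                     # assumed path; must be readable by the DSE user
    service_principal: cassandra/_HOST@EXAMPLE.COM  # dse_user is cassandra for package installs
    http_principal: HTTP/_HOST@EXAMPLE.COM
    qop: auth-conf                                  # authentication, integrity, and encryption

# On a node whose FQDN is node1.example.com (hypothetical), _HOST is
# replaced by that name, so the keytab would need credentials for:
#   cassandra/node1.example.com@EXAMPLE.COM
#   HTTP/node1.example.com@EXAMPLE.COM
```

If SSL already encrypts the connection, keeping qop at auth avoids the double encryption described above.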
Scheduler settings for Solr indexes
These settings control the schedulers in charge of querying for and removing expired data.
- ttl_index_rebuild_options
- fix_rate_period - Default: 300 seconds. Schedules how often to check for expired data.
- initial_delay - Default: 20 seconds. Speeds up start-up by delaying the first TTL checks.
- max_docs_per_batch - Default: 200. The maximum number of documents deleted per batch by the TTL rebuild thread.
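Written out as a dse.yaml fragment, the defaults listed above would look roughly like this. The nesting under ttl_index_rebuild_options is assumed from the list layout.

```yaml
ttl_index_rebuild_options:
    fix_rate_period: 300      # seconds between checks for expired data
    initial_delay: 20         # seconds to wait before the first TTL check
    max_docs_per_batch: 200   # documents deleted per batch
```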
Solr shard transport options
These settings configure inter-node communication between Solr nodes.
- shard_transport_options
- type - Default: netty. Starting in 4.5.0, netty is used for TCP-based communication. It provides lower latency, improved throughput, and reduced resource consumption compared with the http transport, which uses a standard HTTP-based interface for communication.
- netty_server_port - Default: 8984. The TCP listen port. This setting is mandatory if you either want to use the netty transport now or later migrate to it. To use http transport, either comment out this setting or change it to -1.
- netty_server_acceptor_threads - Default: number of available processors. The number of server acceptor threads.
- netty_server_worker_threads - Default: number of available processors * 8. The number of server worker threads.
- netty_client_worker_thread - Default: number of available processors * 8. The number of client worker threads.
- netty_client_max_connections - Default: 100. The maximum number of client connections.
- netty_client_request_timeout - Default: 60000. The client request timeout, in milliseconds.
- HTTP transport settings
The defaults for these settings are the same as in Solr, that is, 0, meaning no timeout at all. To avoid blocking operations, DataStax strongly recommends changing these settings to a finite value. These settings are valid across Solr cores.
- http_shard_client_conn_timeout - Default: 0. The HTTP shard client connection timeout, in milliseconds.
- http_shard_client_socket_timeout - Default: 0. The HTTP shard client socket timeout, in milliseconds.
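The following sketch shows how the transport settings might appear in dse.yaml. It assumes that all of the options nest under shard_transport_options, as the list above suggests, and the finite HTTP timeout values are illustrative choices rather than recommended numbers.

```yaml
shard_transport_options:
    type: netty
    netty_server_port: 8984                  # TCP listen port for the netty transport
    netty_client_max_connections: 100
    netty_client_request_timeout: 60000      # milliseconds

    # To use the http transport instead, set type to http and either
    # comment out netty_server_port or set it to -1:
    # type: http
    # netty_server_port: -1

    # Finite timeouts (illustrative values) in place of the default 0
    http_shard_client_conn_timeout: 30000    # milliseconds
    http_shard_client_socket_timeout: 60000  # milliseconds
```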
Solr indexing
DSE Search provides a multi-threaded indexing implementation to improve performance on multi-core machines. All index updates are internally dispatched to a per-core indexing thread pool and executed asynchronously, which allows for greater concurrency and parallelism. However, index requests can return a response before the indexing operation is executed.
- max_solr_concurrency_per_core - Default: number of available Solr cores * 2. Configures the maximum number of concurrent asynchronous indexing threads per Solr core. If set to 1, DSE Search returns to the synchronous indexing behavior.
- back_pressure_threshold_per_core - Default: 500. The total number of queued asynchronous indexing requests per Solr core, computed at Solr commit time. When exceeded, back pressure prevents excessive resource consumption by throttling new incoming requests.
- flush_max_time_per_core - Default: 5 minutes. The maximum time to wait before flushing asynchronous index updates, which occurs either at Solr commit time or at Cassandra flush time. To fully synchronize Solr indexes with Cassandra data, ensure that flushing completes successfully by setting this value to a reasonably high value.
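As a hedged illustration, the indexing settings could be written as follows. The placement as top-level keys is an assumption, and the concurrency value is a made-up example rather than a computed default.

```yaml
# Assumed to be top-level keys in dse.yaml
max_solr_concurrency_per_core: 4        # example value; 1 restores synchronous indexing
back_pressure_threshold_per_core: 500   # queued asynchronous indexing requests per Solr core
flush_max_time_per_core: 5              # minutes to wait before flushing index updates
```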
DSE Performance Service options
These settings are used by the Performance Service to configure how it collects performance metrics on Cassandra nodes.
- CQL slow log settings
- cql_slow_log_threshold_ms - Default: 100
- cql_slow_log_ttl - Default: 86400
For detailed information, see Collecting slow queries.
- cql_system_info_options
- enabled - Default: false
- refresh_rate_ms - Default: 10000
For detailed information, see Collecting system level diagnostics.
- resource_level_latency_tracking_options
- enabled - Default: false
- refresh_rate_ms - Default: 10000
For detailed information, see Collecting system level diagnostics.
- db_summary_stats_options
- enabled - Default: false
- refresh_rate_ms - Default: 10000
For detailed information, see Collecting database summary diagnostics.
- cluster_summary_stats_options
- enabled - Default: false
- refresh_rate_ms - Default: 10000
For detailed information, see Collecting cluster summary diagnostics.
- histogram_data_options
- enabled - Default: false
- refresh_rate_ms - Default: 10000
- retention_count - Default: 3
For detailed information, see Collecting table histogram diagnostics.
- user_level_latency_tracking_options
- enabled - Default: false
- refresh_rate_ms - Default: 10000
- top_stats_limit - Default: 100
For detailed information, see Collecting user activity diagnostics.
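For illustration, a fragment that keeps the default slow-log thresholds and turns on two of the collectors might look like this. The key nesting is assumed from the list above; only the enabled values differ from the listed defaults.

```yaml
# CQL slow log settings (assumed to be top-level keys)
cql_slow_log_threshold_ms: 100
cql_slow_log_ttl: 86400

# Table histogram diagnostics, enabled for this example
histogram_data_options:
    enabled: true
    refresh_rate_ms: 10000
    retention_count: 3

# User activity diagnostics, enabled for this example
user_level_latency_tracking_options:
    enabled: true
    refresh_rate_ms: 10000
    top_stats_limit: 100
```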