DataStax Enterprise configuration file (dse.yaml)

The configuration file for Kerberos authentication, purging of expired data from the Solr indexes, and setting Solr inter-node communication.

The dse.yaml file configures the delegated snitch, Kerberos authentication, DSE Search indexing, and purging of expired data from the Solr indexes. Its location depends on the type of installation:

  • Packaged installs: /etc/dse
  • Tarball installs: install_location/resources/dse/conf

For cassandra.yaml configuration, see Node and cluster configuration (cassandra.yaml).

Snitch settings 

The delegated_snitch property names the snitch that DataStax Enterprise delegates to. By default, this is the DseSimpleSnitch.

  • delegated_snitch

    (Default: com.datastax.bdp.snitch.DseSimpleSnitch) - Sets which snitch is used.

  • DseSimpleSnitch

    The DseSimpleSnitch places Cassandra, Hadoop, and Solr nodes into separate data centers. See DseSimpleSnitch.

For more information, see Snitches in the Cassandra documentation.
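
As a minimal sketch, the snitch setting might appear in dse.yaml as follows. The property name and default value come from the description above; placing the property at the top level of the file is an assumption.

```yaml
# dse.yaml (fragment)
# Assumption: delegated_snitch is a top-level property.
# The value shown is the documented default.
delegated_snitch: com.datastax.bdp.snitch.DseSimpleSnitch
```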

Kerberos support 

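The kerberos_options settings specify the keytab file, the service and HTTP principals, and the QOP (Quality of Protection), which also determines whether transmitted data is encrypted.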

Options:

  • keytab: resources/dse/conf/dse.keytab
  • service_principal: dse/_HOST@REALM
  • http_principal: HTTP/_HOST@REALM
  • qop - (Default: auth) A comma-delimited list of Quality of Protection values that clients and servers can use for each connection. The valid values are:

    • auth - (Default) Authentication only.
    • auth-int - Authentication plus integrity protection for all transmitted data.
    • auth-conf - Authentication plus integrity protection and encryption of all transmitted data.

      Encryption using auth-conf is separate and completely independent of whether encryption is done using SSL. If both auth-conf and SSL are enabled, the transmitted data is encrypted twice. DataStax recommends choosing one and using it for both encryption and authentication.
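
The options above could be written in dse.yaml roughly as shown below. This is a sketch: nesting the four options under a kerberos_options key is an assumption based on the option list, and _HOST and REALM are placeholders carried over from that list, not literal values.

```yaml
# dse.yaml (fragment)
# Assumption: the options are nested under kerberos_options.
kerberos_options:
    keytab: resources/dse/conf/dse.keytab
    # _HOST and REALM are placeholders taken from the option list above
    service_principal: dse/_HOST@REALM
    http_principal: HTTP/_HOST@REALM
    # One of auth, auth-int, auth-conf, or a comma-delimited list of them
    qop: auth
```

Choosing auth-conf here adds encryption at the Kerberos layer, so enabling SSL as well would encrypt the same traffic twice.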

Multi-threaded indexing 

DSE Search provides a multi-threaded indexing implementation to improve performance on multi-core machines. All index updates are internally dispatched to a per-core indexing thread pool and executed asynchronously, which allows for greater concurrency and parallelism. As a consequence, an index request can return a response before its indexing operation has been executed.

max_solr_concurrency_per_core

(Default: number of available CPU cores times 2) Configures the maximum number of concurrent asynchronous indexing threads per Solr core. If set to 1, DSE Search reverts to synchronous indexing behavior.

back_pressure_threshold_per_core

(Default: 10000) The total number of queued asynchronous indexing requests per Solr core, computed at Solr commit time. When this threshold is exceeded, back pressure throttles new incoming requests to prevent excessive resource consumption.
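
A sketch of how the two indexing settings might look in dse.yaml. The property names come from this section; the top-level placement and the concurrency value of 2 are illustrative assumptions, since the default for that setting is computed rather than fixed.

```yaml
# dse.yaml (fragment)
# Assumption: both properties are top-level settings.

# Illustrative value only; the default is computed at startup.
# Setting this to 1 selects synchronous indexing.
max_solr_concurrency_per_core: 2

# Documented default: requests beyond this queue depth are throttled.
back_pressure_threshold_per_core: 10000
```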

Scheduler settings for Solr indexes 

These settings control the schedulers in charge of querying for and removing expired data.

ttl_index_rebuild_options

Options:

  • fix_rate_period - (Default: 300 seconds) Schedules how often to check for expired data.
  • initial_delay - (Default: 20 seconds) Speeds up startup by delaying the first TTL checks.
  • max_docs_per_batch - (Default: 200) The maximum number of documents deleted per batch by the TTL rebuild thread.
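
Written out with the documented defaults, the scheduler options might look like the fragment below. Nesting the options under ttl_index_rebuild_options, and giving the durations as plain numbers of seconds, are assumptions.

```yaml
# dse.yaml (fragment)
# Assumption: options are nested under ttl_index_rebuild_options,
# with durations given as plain numbers of seconds.
ttl_index_rebuild_options:
    # Check for expired data every 300 seconds
    fix_rate_period: 300
    # Wait 20 seconds after startup before the first check
    initial_delay: 20
    # Delete at most 200 documents per batch
    max_docs_per_batch: 200
```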

flush_max_time_per_core

(Default: 5 minutes) The maximum time to wait before flushing asynchronous index updates, which occurs either at Solr commit time or at Cassandra flush time. To fully synchronize Solr indexes with Cassandra data, set this to a reasonably high value so that flushing can complete successfully.
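
For illustration, the flush setting with its documented default might be written as follows. Expressing the value as a plain number of minutes and placing it at the top level are both assumptions.

```yaml
# dse.yaml (fragment)
# Assumption: the value is a plain number of minutes.
# Raise it if flushes do not complete within the default window.
flush_max_time_per_core: 5
```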