Configuring multi-threaded indexing threads

Multi-threaded asynchronous indexing improves performance on machines that have multiple CPU cores.

DSE Search provides multi-threaded asynchronous indexing with a back pressure mechanism that avoids saturating available memory and maintains stable performance. All index updates are dispatched internally to a per-CPU-core indexing thread pool and executed asynchronously.

This multi-threaded indexing implementation allows for greater concurrency and parallelism, but as a consequence, index requests return a response before the indexing operation is actually executed.
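
The trade-off described above can be illustrated with a toy model. The following Python sketch is not DSE Search code. It is a minimal analogy, assuming a shared buffer drained by worker threads, in which submit() returns as soon as an update is buffered and pauses the caller for a bounded time once the buffer reaches a threshold. The class name IndexPoolModel and all of its parameter names are hypothetical.

```python
import queue
import threading
import time


class IndexPoolModel:
    """Toy model of asynchronous indexing with back pressure (not DSE code)."""

    def __init__(self, workers, threshold, max_pause_s):
        self.tasks = queue.Queue()      # buffered index updates
        self.threshold = threshold      # buffer size at which callers are paused
        self.max_pause_s = max_pause_s  # upper bound on the pause per request
        self.indexed = []
        self.lock = threading.Lock()
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            doc = self.tasks.get()
            time.sleep(0.001)           # stand-in for the real indexing work
            with self.lock:
                self.indexed.append(doc)
            self.tasks.task_done()

    def submit(self, doc):
        # Back pressure: while the buffer is at or above the threshold,
        # pause the caller, but never longer than max_pause_s.
        deadline = time.monotonic() + self.max_pause_s
        while (self.tasks.qsize() >= self.threshold
               and time.monotonic() < deadline):
            time.sleep(0.0005)
        self.tasks.put(doc)
        return "accepted"               # returned before the update is indexed

    def drain(self):
        self.tasks.join()               # block until every update is indexed


pool = IndexPoolModel(workers=4, threshold=8, max_pause_s=0.05)
for i in range(100):
    pool.submit(i)
pool.drain()
print(len(pool.indexed))  # 100
```

In this model, a caller that needs to read its own write would have to wait for drain(), which mirrors the point that a response to an index request does not mean the update is already searchable.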

DSE Search also provides advanced JMX-based configurability and visibility through the IndexPool JMX MBean.

The location of the dse.yaml file depends on the type of installation:
Installer-Services: /etc/dse/dse.yaml
Package installations: /etc/dse/dse.yaml
Installer-No Services: install_location/resources/dse/conf/dse.yaml
Tarball installations: install_location/resources/dse/conf/dse.yaml

Procedure

  1. In the dse.yaml file, define the number of indexing threads per Solr core with the max_solr_concurrency_per_core option. For optimal performance, set this value to the number of available CPU cores divided by the number of Solr cores. For example, with 12 CPU cores and 3 Solr cores, the suggested value is 4.

    If set to 1, DSE Search uses the legacy synchronous indexing implementation.

  2. In the dse.yaml file, use the back_pressure_threshold_per_core option to define how many asynchronous index updates can be buffered per Solr core before back pressure is activated. The default value is 1000 times the number of available CPU cores.
    When the back pressure threshold is reached, each new incoming request is throttled with a pause of up to 80% of write_request_timeout_in_ms. The write_request_timeout_in_ms option in the cassandra.yaml file defines how long the coordinator waits for writes to complete.
  3. To monitor the indexing performance of each Solr core, use these JMX attributes of the com.datastax.bdp:type=search,index=solr_core,name=IndexPool MBean, where solr_core is the name of the Solr core:
    • TotalQueueSize: Total number of buffered asynchronous index updates.
    • Throughput: Current one-minute rate of executed index updates per second.
    • BackPressurePauseNanos: Current one-minute average of the back pressure pause applied per request, in nanoseconds.
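
The arithmetic in the procedure can be sketched as a small Python helper. This is only an illustration, not a DataStax tool: the function names, the choice to round down, the core name demo.solr, and the 2000 ms timeout are all assumptions made for the example. The option names and the MBean name pattern come from the steps above.

```python
def suggested_concurrency(cpu_cores: int, solr_cores: int) -> int:
    """Step 1: CPU cores divided by Solr cores (rounding down is an assumption)."""
    if cpu_cores < 1 or solr_cores < 1:
        raise ValueError("core counts must be positive")
    # A result of 1 selects the legacy synchronous indexing implementation.
    return max(1, cpu_cores // solr_cores)


def default_back_pressure_threshold(cpu_cores: int) -> int:
    """Step 2: the default is 1000 times the number of available CPU cores."""
    return 1000 * cpu_cores


def max_back_pressure_pause_ms(write_request_timeout_in_ms: int) -> int:
    """Step 2: the pause per request is capped at 80% of the write timeout."""
    return write_request_timeout_in_ms * 80 // 100


def index_pool_mbean(solr_core: str) -> str:
    """Step 3: JMX object name of the IndexPool MBean for one Solr core."""
    return f"com.datastax.bdp:type=search,index={solr_core},name=IndexPool"


def render_dse_yaml(cpu_cores: int, solr_cores: int) -> str:
    """Render the two dse.yaml options discussed in steps 1 and 2."""
    return (
        f"max_solr_concurrency_per_core: {suggested_concurrency(cpu_cores, solr_cores)}\n"
        f"back_pressure_threshold_per_core: {default_back_pressure_threshold(cpu_cores)}\n"
    )


print(render_dse_yaml(cpu_cores=12, solr_cores=3), end="")
# max_solr_concurrency_per_core: 4
# back_pressure_threshold_per_core: 12000
print(max_back_pressure_pause_ms(2000))  # 1600
print(index_pool_mbean("demo.solr"))     # hypothetical core name
# com.datastax.bdp:type=search,index=demo.solr,name=IndexPool
```

When reading BackPressurePauseNanos, dividing the value by 1,000,000 gives milliseconds, which makes it directly comparable with the cap returned by max_back_pressure_pause_ms.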