Configuring and tuning indexing performance

Configuring and tuning DSE Search for maximum indexing throughput

dse.yaml

The location of the dse.yaml file depends on the type of installation:
  • Package installations: /etc/dse/dse.yaml
  • Tarball installations: installation_location/resources/dse/conf/dse.yaml

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:
  • Package installations: /etc/dse/cassandra/cassandra.yaml
  • Tarball installations: installation_location/resources/cassandra/conf/cassandra.yaml

There are two indexing modes in DSE Search:
  • Near-real-time (NRT) indexing is the default indexing mode for Apache Solr™ and Apache Lucene®.
  • Live indexing, also called real-time (RT) indexing, supports searching directly against the Lucene RAM buffer and more frequent, cheaper soft commits, which make newly indexed data visible sooner. However, RT indexing requires a larger RAM buffer and uses more memory than an otherwise equivalent NRT setup.

How many CPUs do you have?

Before you tune anything, determine how many physical CPU cores you have. The JVM cannot tell whether the CPUs it sees are hyper-threaded, so the processor count it reports is the number of logical CPUs, not physical cores.
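
One way to get the physical core count on Linux is to count the distinct (physical id, core id) pairs in /proc/cpuinfo. The following is a minimal sketch under that assumption; the function name count_physical_cores and the SAMPLE text are illustrative only, and the fields it relies on may be missing on some virtual machines and non-x86 platforms, in which case it returns 0.

```python
import os


def count_physical_cores(cpuinfo_text: str) -> int:
    """Count distinct (physical id, core id) pairs in /proc/cpuinfo-style text."""
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor blocks
        key, value = key.strip(), value.strip()
        if key == "processor":
            physical_id = None  # a new logical CPU block starts here
        elif key == "physical id":
            physical_id = value
        elif key == "core id":
            cores.add((physical_id, value))
    return len(cores)


# Hypothetical excerpt: one socket, two cores, hyper-threading enabled.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0
processor : 1
physical id : 0
core id : 1
processor : 2
physical id : 0
core id : 0
processor : 3
physical id : 0
core id : 1
"""

if __name__ == "__main__":
    print("logical CPUs:", os.cpu_count())
    try:
        with open("/proc/cpuinfo") as f:
            print("physical cores:", count_physical_cores(f.read()))
    except OSError:
        # Not on Linux: fall back to the illustrative sample.
        print("physical cores (sample):", count_physical_cores(SAMPLE))
```

In the sample, four logical CPUs map to two physical cores. The physical count is the value to carry forward when you set tpc_cores.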

Tuning indexing performance

The default mergeScheduler settings are set automatically. Do not adjust these settings in DSE 6.0 and later. In earlier versions, the default settings were different and might have required tuning.

RAM buffer sizing

The default RAM buffer settings in dse.yaml are:
  • ram_buffer_heap_space_in_mb: 1024
  • ram_buffer_offheap_space_in_mb: 1024

    Because NRT does not use off-heap memory, this setting applies only to RT.

Adjust these settings to configure how much global memory all Solr cores use to accumulate updates before flushing segments. Setting this value too low can induce a state of constant flushing during periods of ongoing write activity. For NRT, these forced segment flushes will also de-schedule pending auto-soft commits to avoid potentially flushing too many small segments.

JMX MBean path: com.datastax.bdp.metrics.search.RamBufferSize

Tuning TPC cores

DSE Search workloads do not benefit from hyper-threading for writes (indexing). To optimize DSE Search for indexing throughput for both modes (NRT and RT), change tpc_cores in cassandra.yaml from the default to the number of physical CPUs. Change this setting only on search nodes, because this change might degrade throughput for workloads other than search.

Disabling AIO or increasing file cache size

If you are experiencing poor performance during search indexing, or during read or write queries of frequently used datasets, there are two tuning choices to consider:
  • Disabling asynchronous I/O (AIO)
  • Increasing file cache size

By default, DataStax Enterprise 6.0 and later use AIO and a custom chunk cache that replaces the OS page cache for SSTable data. This chunk cache is configured to use one-third of the total memory available on a machine, or one-half of an explicitly configured JVM MaxDirectMemorySize. All DataStax Enterprise search index updates first perform a read-before-write against the partition or row being indexed. This functionality means DataStax Enterprise uses the core database's internal read path, which in turn uses the AIO/chunk cache apparatus.

To evaluate potential improvements, work in your test environment and measure the performance of each option separately.

First try disabling AIO by passing -Ddse.io.aio.enabled=false to DataStax Enterprise at startup, and set file_cache_size_in_mb to 512. Once enforced, SSTables and Lucene segments, as well as other minor off-heap elements, will reside in the OS page cache and will be managed by the kernel. Disabling AIO will generate a WARN entry in system.log. For example:
WARN [main] 2019-02-25 21:37:16,563 StartupChecks.java:632 
        - Asynchronous I/O has been manually disabled (through the 'dse.io.aio.enabled' system property). 
        This may result in subpar performance.

A potentially negative impact of disabling AIO may be measurably higher read latency when DataStax Enterprise goes to disk, in cases where the dataset is larger than available memory.

Alternatively in your test environment, if the chunk cache is not large enough to hold a reasonable amount of frequently accessed SSTable data, you could leave AIO enabled by default and instead increase file_cache_size_in_mb beyond its default level. The default is calculated as:
(MACHINE_MEMORY) x 0.33
With this approach, you may notice an improvement for write-heavy workloads, but memory used by the chunk cache cannot be used by the OS page cache to hold Lucene segments.
It may take a few iterations of tuning and testing to find a size that balances the chunk cache against the memory left for Lucene segments. The benefit of this approach, if it can be tuned properly, is that your environment retains the advantages of AIO reads. That is, search indexing avoids the context switching around the Java NIO pool of I/O threads.
Note: This second option of increasing the allocation of file_cache_size_in_mb will likely not improve performance if your application is already exhausting memory usage between the heap and chunk cache on your machine.
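
As a starting point for sizing experiments, the default described above can be worked out by hand. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, the 64 GB machine and 32 GB MaxDirectMemorySize are made-up inputs, and the "memory left" figure is a rough budget that ignores other off-heap usage.

```python
from typing import Optional


def default_chunk_cache_mb(machine_memory_mb: int,
                           max_direct_memory_mb: Optional[int] = None) -> int:
    """Approximate the default chunk cache size described in this section.

    One-half of an explicitly configured MaxDirectMemorySize, otherwise
    machine memory x 0.33 (computed with integer arithmetic, rounded down).
    """
    if max_direct_memory_mb is not None:
        return max_direct_memory_mb // 2
    return machine_memory_mb * 33 // 100


def memory_left_mb(machine_memory_mb: int, heap_mb: int, chunk_cache_mb: int) -> int:
    """Rough estimate of memory remaining for the OS page cache (Lucene segments)."""
    return machine_memory_mb - heap_mb - chunk_cache_mb


if __name__ == "__main__":
    machine = 64 * 1024   # assumed 64 GB machine
    heap = 14 * 1024      # assumed 14 GB JVM heap
    cache = default_chunk_cache_mb(machine)
    print("default chunk cache (MB):", cache)                        # 21626
    print("with 32 GB MaxDirectMemorySize (MB):",
          default_chunk_cache_mb(machine, 32 * 1024))                # 16384
    print("left for page cache (MB):", memory_left_mb(machine, heap, cache))  # 29574
```

If the remaining figure is already small or negative, increasing file_cache_size_in_mb further is unlikely to help, which matches the note above.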

Back pressure

The back_pressure_threshold_per_core setting in dse.yaml affects only index rebuilding and reindexing. If you upgraded to DSE 6.0 from an earlier version, ensure that you use the new default value of 1024.

Tuning NRT reindexing

DSE Search provides multi-threaded asynchronous indexing with a back pressure mechanism to avoid saturating available memory and to maintain stable performance. Multi-threaded indexing improves performance on machines that have multiple CPU cores.

For reindexing only, the IndexPool MBean provides operational visibility and tuning through JMX.

For NRT only, to maximize NRT throughput during a manual re-index, adjust these settings in the search index config:
  • Increase the soft commit time, which is set to 10 seconds (10000 ms) by default. For example, increase the time to 60 seconds and then reload the search index:
    ALTER SEARCH INDEX CONFIG ON demo.health_data SET autoCommitTime = 60000;
    
    To make the pending changes active:
    RELOAD SEARCH INDEX ON demo.health_data;
A disadvantage of increasing the soft commit time is that newly updated rows take longer than the default 10 seconds (10000 ms) to appear in search results.

Tuning RT indexing

Live indexing uses more memory but reduces the time before documents become searchable. Enable live indexing on only one search index per cluster.
  1. To enable live indexing (also known as RT):
    ALTER SEARCH INDEX CONFIG ON demo.health_data SET realtime = true;
  2. To configure live indexing, set autoCommitTime to a value between 100 and 1000 ms:
    ALTER SEARCH INDEX CONFIG ON demo.health_data SET autoCommitTime = 1000;
    

    Test with tuning values of 100-1000 ms. An optimal setting in this range depends on your hardware and environment. For live indexing (RT), this refresh interval saturates at 1000 ms. A value higher than 1000 ms is not recognized.

  3. Ensure that search nodes have at least 14 GB of heap.
  4. If you change the heap, restart DataStax Enterprise to use live indexing with the changed heap size.
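
When testing several values in the 100-1000 ms range, it can help to generate the CQL for each trial rather than typing it by hand. The following is a minimal sketch: the helper name rt_autocommit_statements and the chosen trial values are hypothetical, and pairing each ALTER with a RELOAD assumes that pending RT config changes are activated the same way as in the NRT example earlier in this topic.

```python
from typing import List


def rt_autocommit_statements(index: str, commit_ms: int) -> List[str]:
    """Build the CQL statements for one live-indexing autoCommitTime trial.

    Rejects values outside 100-1000 ms, the range recommended for RT.
    """
    if not 100 <= commit_ms <= 1000:
        raise ValueError(
            f"autoCommitTime for live indexing should be 100-1000 ms, got {commit_ms}"
        )
    return [
        f"ALTER SEARCH INDEX CONFIG ON {index} SET autoCommitTime = {commit_ms};",
        # Assumption: a reload activates the pending config change.
        f"RELOAD SEARCH INDEX ON {index};",
    ]


if __name__ == "__main__":
    # Illustrative trial values; pick your own based on measurements.
    for ms in (100, 250, 500, 1000):
        for statement in rt_autocommit_statements("demo.health_data", ms):
            print(statement)
```

Run each pair in cqlsh against a test cluster and measure indexing throughput and search visibility before moving to the next value.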