DSE Search configuration file (solrconfig.xml)

solrconfig.xml is the primary DSE Search configuration file.

The solrconfig.xml resource file is the primary configuration file for configuring Solr for use with DSE Search.

You can use custom resources or automatically create resources, including the solrconfig.xml file. The solrconfig.xml resource is persisted in the solr_admin.solr_resources database table.

Reload a Solr core after you modify the solrconfig.xml file. Changes apply only to the node where you reload the core. Do not make schema changes on production systems.

Parameters

You might need to modify the following parameters to tune DSE Search. For full details, see the Apache Solr Reference Guide.

autoSoftCommit 
See Configuring and tuning indexing performance.
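As a minimal sketch of where this setting lives, autoSoftCommit is normally nested inside the updateHandler element, as in stock Solr configurations. The 10000 ms maxTime below is an illustrative value, not a tuning recommendation:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Soft commit at most every 10 seconds so that new documents become searchable.
       The interval is an assumed example value. -->
  <autoSoftCommit>
    <maxTime>10000</maxTime>
  </autoSoftCommit>
</updateHandler>
```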
directoryFactory 
The directory factory to use for search indexes.
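For illustration, a directoryFactory entry typically takes the following form. The class shown is the standard file-system factory from open source Solr and is an assumption here; your generated solrconfig.xml may name a different class:

```xml
<!-- Assumed example: the standard file-system directory factory that ships with Solr. -->
<directoryFactory name="DirectoryFactory" class="solr.StandardDirectoryFactory"/>
```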
dseAllowTokenizedUniqueKey 
By default, a tokenized unique key is not permitted. To disable tokenized key validation, add the dseAllowTokenizedUniqueKey entry and set it to true:
<dseAllowTokenizedUniqueKey>true</dseAllowTokenizedUniqueKey>
dseTypeMappingVersion
The Solr type mapping version defines how Solr types are mapped to Cassandra Thrift or Cassandra types. Changing a Solr type mapper is rarely necessary and is not recommended. In particular circumstances, such as converting the Solr LongField type to TrieLongField, you can force a column to change type by setting force="true" on the dseTypeMappingVersion entry. See Changing Solr Types and Configuring the Solr type mapping version. Use this option only if you are an expert and have confirmed that the Cassandra internal validation classes of the types involved in the conversion are compatible. For example:
<dseTypeMappingVersion force = "true">1</dseTypeMappingVersion>
After changing the type mapping, you must reload the Solr core with re-indexing.
dseUpdateRequestProcessorChain 
You can configure a custom URP to extend the Solr UpdateRequestProcessor. See Field input/output (FIT) transformer API.
enableLazyFieldLoading 
Do not change the default value of true:
<enableLazyFieldLoading>true</enableLazyFieldLoading>
A Solr bug SOLR-8858 in earlier versions of Solr restricted changing this field.
fieldInputTransformer 
The field input transformer API is an option to the input/output transformer support in Solr. See Field input/output (FIT) transformer API.
fieldOutputTransformer 
The field output transformer API is an option to the input/output transformer support in Solr. See Field input/output (FIT) transformer API and an Introduction to DSE Field Transformers.
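A hedged sketch of how the two transformer elements might be declared. The class names are hypothetical placeholders for your own implementations, and the name value is an assumption; check the FIT transformer API documentation for the exact attributes:

```xml
<!-- Hypothetical classes; replace them with your own transformer implementations. -->
<fieldInputTransformer name="dse" class="com.example.search.MyFieldInputTransformer"/>
<fieldOutputTransformer name="dse" class="com.example.search.MyFieldOutputTransformer"/>
```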
indexConfig 
Parameters for tuning index building and configuring re-indexing:
  • deleteApplicationStrategy - Controls how deleted documents are retrieved while deletes are being applied.
    • seekexact - The safest default setting. Uses bloom filters to avoid reading from most segments and works better when memory is limited and the unique key field data doesn't fit into memory.
    • seekceiling - More performant. Can be faster, especially when documents are deleted/inserted into the database with sequential keys. This strategy can stop reading from segments where it knows terms can no longer appear.
  • mergedSegmentWarmer - To use warmup segments in DSE Search:
    <mergedSegmentWarmer class="com.datastax.bdp.search.solr.core.TokenSegmentWarmer"/>
  • parallelDeleteTasks - Supported for RT and NRT indexing. Specify a positive integer; the default is the number of available processors. This parameter regulates how many tasks are created to apply deletes in parallel during soft and hard commits.

    Leave parallelDeleteTasks at the default value, except when issues occur with write load when running a mixed read/write workload. If writes occasionally spike in utilization and negatively impact your read performance, then set this value lower. To prevent writes from overwhelming reads, reduce this value and max_solr_concurrency_per_core in dse.yaml.

  • ramBufferSizeMB - See Configuring and tuning indexing performance.
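The indexConfig parameters above could be combined as in the following sketch. The mergedSegmentWarmer line is taken from the text above; the simple element form used for deleteApplicationStrategy and parallelDeleteTasks, and the values shown, are assumptions to verify against your DSE version:

```xml
<indexConfig>
  <!-- Assumed element form and example values; verify before use. -->
  <deleteApplicationStrategy>seekexact</deleteApplicationStrategy>
  <parallelDeleteTasks>4</parallelDeleteTasks>
  <!-- 512 MB matches the ramBufferSizeMB default stated on this page. -->
  <ramBufferSizeMB>512</ramBufferSizeMB>
  <mergedSegmentWarmer class="com.datastax.bdp.search.solr.core.TokenSegmentWarmer"/>
</indexConfig>
```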
lib 
The location for library files in DataStax Enterprise is not the same location as open source Solr. Contrary to the examples shown in the solrconfig.xml file that indicate support for relative paths, DataStax Enterprise does not support the relative path values that are set for the <lib> property. DSE Search fails to find files in directories that are defined by the <lib> property. The workaround is to place custom code or Solr contrib modules in the Solr library directories. See Configuring the Solr library path.
maxBooleanClauses 
Defines the maximum number of clauses in a boolean query. You must change this value on all cores and then restart the nodes to make the change effective. See Changing maxBooleanClauses.
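As a sketch, in stock Solr configurations this setting sits inside the query element. The value 1024 is the usual open source Solr default and is shown only as an example:

```xml
<query>
  <!-- Example value; raise it only if queries legitimately need more clauses. -->
  <maxBooleanClauses>1024</maxBooleanClauses>
</query>
```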
mergeScheduler 
The default mergeScheduler settings are not appropriate for production use of DSE Search near real time (NRT) indexing on a typical size server. DataStax recommends the following settings as a starting point; adjust them as appropriate for your environment:
  • maxThreadCount = the number of CPU cores divided by 2
  • maxMergeCount = maxThreadCount * 2
For example, for 24 CPU cores:
<indexConfig>
  ...
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxThreadCount">12</int>
    <int name="maxMergeCount">24</int>
  </mergeScheduler>
  ...
</indexConfig>
queryExecutorThreads 
You can set the query executor threads parameter in the solrconfig.xml file to enable multi-threading for filter queries, normal queries, and doc values facets:
<queryExecutorThreads>4</queryExecutorThreads>
See Configuring multi-threaded queries.
queryResponseWriter 
For performance, you can configure DSE Search to parallelize the retrieval of a large number of rows:
<queryResponseWriter name="javabin" class="solr.BinaryResponseWriter">
  <str name="resolverFactory">com.datastax.bdp.search.solr.response.ParallelRowResolver$Factory</str>
</queryResponseWriter>
See Parallelizing large Cassandra row reads.
ramBufferSizeMB 
The default value is 512 MB. See Configuring and tuning indexing performance.
requestHandler 
The correct search handler is required for CQL Solr queries in DSE Search.

When you automatically generate resources, the solrconfig.xml file already contains the request handler for running CQL Solr queries in DSE Search. If you do not automatically generate resources and want to run CQL Solr queries using custom resources, the CqlSearchHandler is automatically inserted:

<requestHandler class="com.datastax.bdp.search.solr.handler.component.CqlSearchHandler" name="solr_query" />

For recommendations for the basic configuration for the search handler, and an example that shows adding a search component, see Configuring search components.

To configure the Data Import Handler, you can add a request handler element that contains the location of data-config.xml and the data source connection information.
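A minimal sketch of such a Data Import Handler entry, following the conventional open source Solr form. The handler path, JDBC driver, URL, and credentials are placeholder assumptions, not values from this documentation:

```xml
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <!-- Location of the import definition file. -->
    <str name="config">data-config.xml</str>
    <!-- Placeholder data source connection settings; replace with your own. -->
    <lst name="datasource">
      <str name="driver">com.mysql.jdbc.Driver</str>
      <str name="url">jdbc:mysql://localhost/mydb</str>
      <str name="user">db_user</str>
      <str name="password">db_password</str>
    </lst>
  </lst>
</requestHandler>
```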

rt 

To enable live indexing (also known as RT), add <rt>true</rt> to the <indexConfig> element.

<indexConfig>
  <rt>true</rt>
  ...
</indexConfig>
See Configuring and tuning indexing performance.
SolrFilterCache 
The DSE Search configurable filter cache, SolrFilterCache, can reliably bound the filter cache memory usage for a Solr core. This implementation contrasts with the default Solr implementation, which defines bounds for filter cache usage per segment. See Configuring filter cache for searching.
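As an illustrative sketch, the filter cache is declared as a filterCache element that names the SolrFilterCache class. The attribute names and the water mark values below are assumptions based on typical DSE Search configurations; confirm them in Configuring filter cache for searching:

```xml
<!-- Assumed attributes and example sizes. Eviction is expected to start at the
     high water mark and continue until usage falls to the low water mark. -->
<filterCache class="solr.SolrFilterCache"
             highWaterMarkMB="2048"
             lowWaterMarkMB="1024"/>
```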
updateHandler 
You can configure per-document or per-field TTL. See Expiring a DSE Search column.