DSE Search configuration file (solrconfig.xml)
solrconfig.xml is the primary DSE search configuration file.
- For DataStax Enterprise configuration, see dse.yaml configuration file.
- For node and cluster configuration, see cassandra.yaml.
You can use custom resources or automatically create resources, including the solrconfig.xml file. The solrconfig.xml resource is persisted in the solr_admin.solr_resources database table.
Reload a Solr core after you modify the solrconfig.xml file. Changes apply only to the node where you reload the core.
Parameters
To tune DSE Search, you can modify the following parameters. For full details, see the Apache Solr Reference Guide.
- autoSoftCommit
- For live indexing, ensure that the autoSoftCommit time is 100 ms:
<autoSoftCommit> <maxTime>100</maxTime> </autoSoftCommit>
See Configuring and tuning indexing performance.
- directoryFactory
- The directory factory to use for search indexes. Encryption is enabled per core. To enable encryption for a core, change the class for directoryFactory to EncryptedFSDirectoryFactory.
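As a minimal sketch, the change might look like the following element. The solr. class prefix and the name attribute follow common Solr conventions and are assumptions here; check them against the solrconfig.xml generated for your core.

```xml
<!-- Assumed form: per-core index encryption selected through the directory factory class -->
<directoryFactory name="DirectoryFactory" class="solr.EncryptedFSDirectoryFactory"/>
```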
- dseAllowTokenizedUniqueKey
- By default, a tokenized unique key is not permitted. To disable tokenized key validation, add the dseAllowTokenizedUniqueKey entry and set it to true:
<dseAllowTokenizedUniqueKey>true</dseAllowTokenizedUniqueKey>
- dseTypeMappingVersion
- The Solr type mapping version defines how Solr types are mapped to Cassandra Thrift or Cassandra types. Changing a Solr type mapper is rarely required and is not recommended. In particular circumstances, such as converting the Solr LongField type to TrieLongField, you can configure dseTypeMappingVersion using the force option. See Changing Solr types and Configuring the Solr type mapping version. Use this option only if you are an expert and have confirmed that the Cassandra internal validation classes of the types involved in the conversion are compatible. To change the type, use force="true":
<dseTypeMappingVersion force="true">1</dseTypeMappingVersion>
After changing the type mapping, you must reload the Solr core with reindexing.
- dseUpdateRequestProcessorChain
- Configure a custom URP to extend the Solr UpdateRequestProcessor. See Field input/output (FIT) transformer API.
- enableLazyFieldLoading
- Do not change the default value of true:
<enableLazyFieldLoading>true</enableLazyFieldLoading>
A Solr bug, SOLR-8858, in earlier versions of Solr restricted changing this field.
- fieldInputTransformer
- The field input transformer API is an option to the input/output transformer support in Solr.
- fieldOutputTransformer
- The field output transformer API is an option to the input/output transformer support in Solr. See Field input/output (FIT) transformer API and the dev blog post An Introduction to DSE Field Transformers.
- filterCache
- The DSE Search configurable filter cache reliably bounds the filter cache memory usage for a Solr core. This implementation contrasts with the default Solr implementation, which defines bounds for filter cache usage per segment. SolrFilterCache bounding works by evicting cache entries after the configured per-core high watermark is reached, and stopping after the configured low watermark is reached.
SolrFilterCache defaults to off-heap. In general, the larger the index, the larger the filter cache should be. A good default is 1 to 2 GB. If the index holds 1 billion docs per node, set the cache to 4 to 5 GB. See Collecting cache statistics.
Set the class attribute of the filterCache element to solr.SolrFilterCache and define the low and high watermarks for cache eviction:
<filterCache class="solr.SolrFilterCache" lowWaterMarkMB="1024" highWaterMarkMB="2048" />
Note: SolrFilterCache does not support auto-warming.
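For the larger index case described above (about 1 billion docs per node), one possible reading of the 4 to 5 GB guidance is to place the low and high watermarks at those two sizes. The values below are illustrative assumptions, not tested recommendations.

```xml
<!-- Sketch: eviction starts at about 5 GB and stops at about 4 GB (values assumed) -->
<filterCache class="solr.SolrFilterCache" lowWaterMarkMB="4096" highWaterMarkMB="5120" />
```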
- indexConfig
- Parameters for tuning index building and configuring re-indexing:
- deleteApplicationStrategy - Controls how deleted documents are retrieved while deletes are being applied.
  - seekexact - The safest default setting. Uses bloom filters to avoid reading from most segments and works better when memory is limited and the unique key field data doesn't fit into memory.
  - seekceiling - Can be faster, especially when documents are deleted from or inserted into the database with sequential keys. This strategy stops reading from segments where it knows terms can no longer appear.
- mergedSegmentWarmer - To use warmup segments in DSE Search:
<mergedSegmentWarmer class="com.datastax.bdp.search.solr.core.TokenSegmentWarmer"/>
- parallelDeleteTasks - Regulates how many tasks are created to apply deletes in parallel during a soft or hard commit. Supported for RT and NRT indexing. Specify an integer greater than 0. The default value is the number of available processors.
Leave parallelDeleteTasks at the default value unless write load causes issues in a mixed read/write workload. If writes occasionally spike in utilization and degrade read performance, set this value lower. To prevent writes from overwhelming reads, reduce both this value and max_solr_concurrency_per_core in dse.yaml.
- ramBufferSizeMB - Sets the size of the RAM buffer. To tune indexing, change the RAM buffer size and increase the soft commit time. See Configuring and tuning indexing performance.
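The indexConfig parameters above could be combined as in the following sketch. It assumes that deleteApplicationStrategy and parallelDeleteTasks are written as child elements of indexConfig with text values, which this page does not show explicitly; the value 2 for parallelDeleteTasks is only an illustration of lowering the setting below the processor count.

```xml
<indexConfig>
  <!-- Assumed element form: strategy that relies on bloom filters when memory is limited -->
  <deleteApplicationStrategy>seekexact</deleteApplicationStrategy>
  <!-- Assumed element form and value: fewer delete tasks to protect read performance -->
  <parallelDeleteTasks>2</parallelDeleteTasks>
  <!-- Segment warmer class as given in this section -->
  <mergedSegmentWarmer class="com.datastax.bdp.search.solr.core.TokenSegmentWarmer"/>
</indexConfig>
```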
- lib
- The location for library files in DataStax Enterprise is not the same location as open source Solr. Contrary to the examples shown in the solrconfig.xml file that indicate support for relative paths, DataStax Enterprise does not support the relative path values that are set for the <lib> property. DSE Search fails to find files in directories that are defined by the <lib> property. The workaround is to place custom code or Solr contrib modules in the Solr library directories. See Configuring the Solr library path.
- mergeScheduler
- In releases earlier than DSE 5.0.8, the default mergeScheduler settings are not appropriate for DSE Search near real time (NRT) indexing production use on a typical size server. See Configuring and tuning indexing performance.
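For orientation, an explicit merge scheduler setting in open source Solr generally has the shape sketched here. The class and the maxThreadCount and maxMergeCount parameters are standard Solr and Lucene names; the numbers are placeholders rather than DSE recommendations, so take the actual values from Configuring and tuning indexing performance.

```xml
<indexConfig>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <!-- Placeholder values; maxMergeCount must not be lower than maxThreadCount -->
    <int name="maxThreadCount">4</int>
    <int name="maxMergeCount">8</int>
  </mergeScheduler>
</indexConfig>
```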
- queryExecutorThreads
- You can set the query executor threads parameter in the solrconfig.xml file to enable multi-threading for filter queries, normal queries, and doc values facets:
<queryExecutorThreads>4</queryExecutorThreads>
See Configuring multi-threaded queries.
- queryResponseWriter
- For performance, you can configure DSE Search to parallelize the retrieval of a large number of rows:
<queryResponseWriter name="javabin" class="solr.BinaryResponseWriter">
  <str name="resolverFactory">com.datastax.bdp.search.solr.response.ParallelRowResolver$Factory</str>
</queryResponseWriter>
See Parallelizing large Cassandra row reads.
- ramBufferSizeMB
- The default value is 512 MB.
- To tune for performance, see Configuring and tuning indexing performance.
- To configure live indexing, increase the RAM buffer size to 1000 MB:
<ramBufferSizeMB>1000</ramBufferSizeMB>
See Configuring and tuning indexing performance.
- requestHandler
- The correct search handler is required for CQL Solr queries in DSE Search.
When you automatically generate resources, the solrconfig.xml file already contains the request handler for running CQL Solr queries in DSE Search. When you use custom resources, the CqlSearchHandler handler is automatically injected so that CQL Solr queries can run:
<requestHandler class="com.datastax.bdp.search.solr.handler.component.CqlSearchHandler" name="solr_query" />
For recommendations for the basic configuration for the search handler, and an example that shows adding a search component, see Configuring additional search components.
To configure the Data Import Handler, you can add a request handler element that contains the location of data-config.xml and data source connection information.
For use with the HTTP API only, you can define the default number of rows in the solrconfig.xml file:
<requestHandler name="search" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <int name="rows">10</int>
  </lst>
</requestHandler>
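A Data Import Handler entry of the kind mentioned above might be sketched as follows. The handler class and the config and datasource parameters come from the open source Solr Data Import Handler; the handler name, JDBC driver, URL, and credentials are hypothetical placeholders to replace with your own.

```xml
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <!-- Location of the import definition file -->
    <str name="config">data-config.xml</str>
    <!-- Hypothetical data source connection details -->
    <lst name="datasource">
      <str name="driver">com.mysql.jdbc.Driver</str>
      <str name="url">jdbc:mysql://localhost/dbname</str>
      <str name="user">db_username</str>
      <str name="password">db_password</str>
    </lst>
  </lst>
</requestHandler>
```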
- rt
- To enable live indexing (also known as RT), add <rt>true</rt> to the <indexConfig> element and configure the options. See Configuring and tuning indexing performance.
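Taken together, the live indexing settings named on this page (rt, a 1000 MB RAM buffer, and a 100 ms soft commit) could be arranged roughly as below. Nesting autoSoftCommit inside an updateHandler element with the solr.DirectUpdateHandler2 class follows stock Solr configuration and is an assumption about your file's layout.

```xml
<indexConfig>
  <!-- Enable live (RT) indexing -->
  <rt>true</rt>
  <!-- RAM buffer size suggested on this page for live indexing -->
  <ramBufferSizeMB>1000</ramBufferSizeMB>
</indexConfig>

<!-- Assumed placement: soft commit interval in milliseconds under the update handler -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoSoftCommit>
    <maxTime>100</maxTime>
  </autoSoftCommit>
</updateHandler>
```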
- updateHandler
- You can configure per-document or per-field TTL. See Expiring a DSE Search column.