Search index config
Reference information to change query behavior for search indexes.
- DataStax recommends CQL CREATE SEARCH INDEX and ALTER SEARCH INDEX CONFIG commands.
- dsetool commands can also be used to manage search indexes.
Changing search index config
- Create a search index. For example:
CREATE SEARCH INDEX ON demo.health_data;
- Alter the search index. For example:
ALTER SEARCH INDEX CONFIG ON demo.health_data SET autoCommitTime = 30000;
- Optionally view the XML of the pending search index. For example:
DESCRIBE PENDING SEARCH INDEX CONFIG ON demo.health_data;
- Make the pending changes active. For example:
RELOAD SEARCH INDEX ON demo.health_data;
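After the reload, the active configuration can be inspected to confirm the change. A minimal sketch, assuming the ACTIVE form of the DESCRIBE command is available in your DSE version (only the PENDING form is shown on this page):

```cql
-- Show the config currently in use, as opposed to the pending one.
DESCRIBE ACTIVE SEARCH INDEX CONFIG ON demo.health_data;
```

If the reload succeeded, the maxTime value under autoSoftCommit in the output should match the autoCommitTime that was set.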
Sample search index config
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<config>
<abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>
<luceneMatchVersion>LUCENE_6_0_0</luceneMatchVersion>
<dseTypeMappingVersion>2</dseTypeMappingVersion>
<directoryFactory class="solr.StandardDirectoryFactory" name="DirectoryFactory"/>
<indexConfig>
<rt>false</rt>
<rtOffheapPostings>true</rtOffheapPostings>
<useCompoundFile>false</useCompoundFile>
<reopenReaders>true</reopenReaders>
<deletionPolicy class="solr.SolrDeletionPolicy">
<str name="maxCommitsToKeep">1</str>
<str name="maxOptimizedCommitsToKeep">0</str>
</deletionPolicy>
<infoStream file="INFOSTREAM.txt">false</infoStream>
</indexConfig>
<jmx/>
<updateHandler class="solr.DirectUpdateHandler2">
<autoSoftCommit>
<maxTime>10000</maxTime>
</autoSoftCommit>
</updateHandler>
<query>
<maxBooleanClauses>1024</maxBooleanClauses>
<filterCache class="solr.SolrFilterCache" highWaterMarkMB="2048" lowWaterMarkMB="1024"/>
<enableLazyFieldLoading>true</enableLazyFieldLoading>
<useColdSearcher>true</useColdSearcher>
<maxWarmingSearchers>16</maxWarmingSearchers>
</query>
<requestDispatcher handleSelect="true">
<requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048000"/>
<httpCaching never304="true"/>
</requestDispatcher>
<requestHandler class="solr.SearchHandler" default="true" name="search">
<lst name="defaults">
<int name="rows">10</int>
</lst>
</requestHandler>
<requestHandler class="com.datastax.bdp.search.solr.handler.component.CqlSearchHandler" name="solr_query">
<lst name="defaults">
<int name="rows">10</int>
</lst>
</requestHandler>
<requestHandler class="solr.UpdateRequestHandler" name="/update"/>
<requestHandler class="solr.UpdateRequestHandler" name="/update/csv" startup="lazy"/>
<requestHandler class="solr.UpdateRequestHandler" name="/update/json" startup="lazy"/>
<requestHandler class="solr.FieldAnalysisRequestHandler" name="/analysis/field" startup="lazy"/>
<requestHandler class="solr.DocumentAnalysisRequestHandler" name="/analysis/document" startup="lazy"/>
<requestHandler class="solr.admin.AdminHandlers" name="/admin/"/>
<requestHandler class="solr.PingRequestHandler" name="/admin/ping">
<lst name="invariants">
<str name="qt">search</str>
<str name="q">solrpingquery</str>
</lst>
<lst name="defaults">
<str name="echoParams">all</str>
</lst>
</requestHandler>
<requestHandler class="solr.DumpRequestHandler" name="/debug/dump">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="echoHandler">true</str>
</lst>
</requestHandler>
<admin>
<defaultQuery>*:*</defaultQuery>
</admin>
</config>
Configuration elements are listed alphabetically by shortcut. The XML element is shown with the element start tag. An ellipsis indicates that other elements or attributes are not shown.
- autoCommitTime
- Defines the time interval between updates to the search index with the most recent
data after an INSERT, UPDATE, or DELETE. By default, changes are automatically committed
every 10000 milliseconds. To change the time interval between updates:
- Set auto commit time on the pending search index:
ALTER SEARCH INDEX CONFIG ON wiki.solr SET autoCommitTime = 30000;
- You can view the pending search config:
DESCRIBE PENDING SEARCH INDEX CONFIG ON wiki.solr;
The resulting XML shows the maximum time between updates is 30000 milliseconds:
<updateHandler class="solr.DirectUpdateHandler2">
<autoSoftCommit>
<maxTime>30000</maxTime>
</autoSoftCommit>
</updateHandler>
- To make the pending changes active, reload the search index:
RELOAD SEARCH INDEX ON wiki.solr;
- defaultQueryField
- Name of the default field to query, used when the query does not specify a field. Default: not set.
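A minimal sketch of setting this shortcut, following the same pattern as the other shortcuts on this page. The field name body is hypothetical, and quoting the value as a string is an assumption:

```cql
-- 'body' is a placeholder; use a field that exists in your search index schema.
ALTER SEARCH INDEX CONFIG ON wiki.solr SET defaultQueryField = 'body';
-- Pending changes take effect only after a reload.
RELOAD SEARCH INDEX ON wiki.solr;
```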
- directoryFactory
- The directory factory to use for search indexes. Encryption is enabled per search
index. To enable encryption for a search index, change the class for directoryFactory
to EncryptedFSDirectoryFactory.
- Enable encryption on the pending search index:
ALTER SEARCH INDEX CONFIG ON wiki.solr SET directoryFactory = EncryptedFSDirectoryFactory;
- You can view the pending search config:
DESCRIBE PENDING SEARCH INDEX CONFIG ON wiki.solr;
The resulting XML shows that encryption is enabled:
<directoryFactory class="solr.EncryptedFSDirectoryFactory" name="DirectoryFactory"/>
- To make the pending changes active, reload the search index:
RELOAD SEARCH INDEX ON wiki.solr;
- filterCacheLowWaterMark
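- Low water mark for the filter cache, in MB. Corresponds to the lowWaterMarkMB attribute of the filterCache element in the sample config. Default is 1024 MB. See filterCacheHighWaterMark below.
- filterCacheHighWaterMark
- High water mark for the filter cache, in MB. Corresponds to the highWaterMarkMB attribute of the filterCache element in the sample config. Default is 2048 MB.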
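The two filter cache shortcuts can be changed together. The following is a sketch only: the values are illustrative rather than recommended, and it assumes the low mark should stay below the high mark, as it does in the defaults:

```cql
-- Halve both water marks relative to the defaults (1024 MB and 2048 MB).
ALTER SEARCH INDEX CONFIG ON wiki.solr SET filterCacheLowWaterMark = 512;
ALTER SEARCH INDEX CONFIG ON wiki.solr SET filterCacheHighWaterMark = 1024;
-- Make the pending changes active.
RELOAD SEARCH INDEX ON wiki.solr;
```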
- mergeFactor
- When a new segment causes the number of lowest-level segments to exceed the merge
factor value, then those segments are merged together to form a single large segment.
When the merge factor is 10, each merge results in the creation of a single segment that
is about ten times larger than each of its ten constituents. When there are 10 of these
larger segments, then they in turn are merged into an even larger single segment.
Default is 10.
- To change the number of segments to merge at one time:
ALTER SEARCH INDEX CONFIG ON wiki.solr SET mergeFactor = 5;
- View the pending search index config:
DESCRIBE PENDING SEARCH INDEX CONFIG ON wiki.solr;
The resulting XML shows the new merge factor:
<indexConfig>
...
<mergeFactor>5</mergeFactor>
...
</indexConfig>
- To make the pending changes active, reload the search index:
RELOAD SEARCH INDEX ON wiki.solr;
- mergeMaxThreadCount
- Must be configured together with mergeMaxMergeCount. The number of concurrent merges
that Lucene can perform for the search index. The default mergeScheduler settings are
set automatically. Do not adjust this setting.
Default: ½ the number of tpc_cores
- mergeMaxMergeCount
- Must be configured together with mergeMaxThreadCount. The number of pending merges
(active and in the backlog) that can accumulate before segment merging starts to block
or throttle incoming writes. The default mergeScheduler settings are set automatically.
Do not adjust this setting.
Default: 2x the mergeMaxThreadCount
- ramBufferSize
- The index RAM buffer size in megabytes (MB). The RAM buffer holds uncommitted
documents. A larger RAM buffer reduces the number of flushes, and the segments written
at each flush are larger. Fewer flushes reduce I/O pressure, which is ideal for
write-heavy workloads.
For example, adjust the ramBufferSize when you configure live indexing:
ALTER SEARCH INDEX CONFIG ON wiki.solr SET autoCommitTime = 100;
ALTER SEARCH INDEX CONFIG ON wiki.solr SET realtime = true;
ALTER SEARCH INDEX CONFIG ON wiki.solr SET ramBufferSize = 2048;
RELOAD SEARCH INDEX ON wiki.solr;
Default: 512
- realtime
- Enables live indexing to increase indexing throughput. Enable live indexing on only
one node per cluster. Live indexing, also called real-time (RT) indexing, supports
searching directly against the Lucene RAM buffer and more frequent, cheaper
soft-commits, which provide earlier visibility to newly indexed data.
Live indexing requires a larger RAM buffer and uses more memory than an otherwise equivalent near-real-time (NRT) setup.
Configuration elements without shortcuts
To specify configuration elements that do not have shortcuts, you can specify the XML path to the setting and separate child elements using a period.
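As an illustration of the path syntax, the maxBooleanClauses element is a child of the query element in the sample config, so its path would be query.maxBooleanClauses. This sketch simply applies the rule above to the sample config. The value 2048 is arbitrary, and whether this particular element can be altered this way is an assumption:

```cql
-- Path form: parent element, a period, then the child element.
ALTER SEARCH INDEX CONFIG ON wiki.solr SET query.maxBooleanClauses = 2048;
-- Review the pending XML before reloading.
DESCRIBE PENDING SEARCH INDEX CONFIG ON wiki.solr;
```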
- deleteApplicationStrategy
- Controls how to retrieve deleted documents when deletes are being applied. seekexact
is the safe default for most deployments; seekceiling can give slightly better
performance in some cases. Valid case-insensitive values are:
- seekexact
Uses bloom filters to avoid reading from most segments. Use when memory is limited and the unique key field data does not fit into memory.
- seekceiling
More performant when documents are deleted/inserted into the database with sequential keys, because this strategy can stop reading from segments when it is known that terms can no longer appear.
Default: seekexact
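A sketch of switching strategies with the path syntax. It assumes that deleteApplicationStrategy is a child of indexConfig and that the value is given as a quoted string; neither is stated on this page, so check the pending XML before reloading:

```cql
-- Assumed path: indexConfig.deleteApplicationStrategy (not confirmed by this page).
ALTER SEARCH INDEX CONFIG ON wiki.solr SET indexConfig.deleteApplicationStrategy = 'seekceiling';
-- Verify where the element landed in the pending config.
DESCRIBE PENDING SEARCH INDEX CONFIG ON wiki.solr;
```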
- mergePolicyFactory
- The AutoExpungeDeletesTieredMergePolicy custom merge policy
is based on TieredMergePolicy. This policy cleans up the large segments by merging them
when deletes reach the percentage threshold. A single auto expunge merge occurs at a
time. Use for large indexes that are not merging the largest segments due to deletes. To
determine whether this merge setting is appropriate for your workflow, view the segments
on the Solr Segment Info screen. When set, the XML is described as:
<indexConfig>
<mergePolicyFactory class="org.apache.solr.index.AutoExpungeDeletesTieredMergePolicyFactory">
<int name="maxMergedSegmentMB">5000</int>
<int name="forceMergeDeletesPctAllowed">25</int>
<bool name="mergeSingleSegments">true</bool>
</mergePolicyFactory>
</indexConfig>
To extend TieredMergePolicy to support automatic removal of deletes:
- To enable automatic removal of deletes, set the custom policy:
ALTER SEARCH INDEX CONFIG ON wiki.solr SET indexConfig.mergePolicyFactory[@class='org.apache.solr.index.AutoExpungeDeletesTieredMergePolicyFactory'].bool[@name='mergeSingleSegments'] = true;
- Set the maximum segment size in MB:
ALTER SEARCH INDEX CONFIG ON wiki.solr SET indexConfig.mergePolicyFactory[@class='org.apache.solr.index.AutoExpungeDeletesTieredMergePolicyFactory'].int[@name='maxMergedSegmentMB'] = 5000;
- Set the percentage threshold for deleting from the large segments:
ALTER SEARCH INDEX CONFIG ON wiki.solr SET indexConfig.mergePolicyFactory[@class='org.apache.solr.index.AutoExpungeDeletesTieredMergePolicyFactory'].int[@name='forceMergeDeletesPctAllowed'] = 25;
If mergeFactor is in the existing index config, you must drop it from the search index before you alter the table to support automatic removal of deletes:
ALTER SEARCH INDEX CONFIG ON wiki.solr DROP indexConfig.mergePolicyFactory;
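The statements above change only the pending configuration. As in the shortcut procedures earlier on this page, the pending XML can be reviewed and then made active with a reload. A minimal sketch using the same wiki.solr index:

```cql
-- Check that the mergePolicyFactory element and its three settings appear as expected.
DESCRIBE PENDING SEARCH INDEX CONFIG ON wiki.solr;
-- Make the pending merge policy active.
RELOAD SEARCH INDEX ON wiki.solr;
```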
- parallelDeleteTasks
- Regulates how many tasks are created to apply deletes in parallel during a soft or
hard commit. Supported for RT and NRT indexing. Specify an integer greater than 0.
Leave parallelDeleteTasks at the default value, except when issues occur with write load when running a mixed read/write workload. If writes occasionally spike in utilization and negatively impact your read performance, then set this value lower.
Default: the number of available processors