Live indexing, also known as RT (real time) indexing, lets queries run against recently
indexed data. It improves indexing throughput and reduces Lucene reader latency while
supporting all Solr functionality.
Live indexing works for all DSE Search applications. Any field used for sorting must
have docValues enabled; otherwise the field cache is used, which is inefficient with
live indexing.
The location of the dse.yaml file depends on the type of installation:
Installer-Services | /etc/dse/dse.yaml
Package installations | /etc/dse/dse.yaml
Installer-No Services | install_location/resources/dse/conf/dse.yaml
Tarball installations | install_location/resources/dse/conf/dse.yaml
Procedure
-
Enable live indexing on only one Solr core per cluster.
-
To enable live indexing (also known as RT), add <rt>true</rt> to the
<indexConfig> element of the solrconfig.xml file.
-
To configure live indexing, edit the solrconfig.xml file:
increase the RAM buffer size and set the autoSoftCommit time to
100 ms:
<ramBufferSizeMB>2000</ramBufferSizeMB>
...
<autoSoftCommit>
<maxTime>100</maxTime>
</autoSoftCommit>
The larger RAM buffer enables faster indexing.
-
Increase the heap size. For live indexing, DataStax recommends a heap
size of at least 20 GB when running Java 1.8 with G1GC. A larger heap
lets you allocate a larger RAM buffer, which contributes to
faster live (RT) indexing. Enable live indexing on only one Solr core
per cluster.
-
Set the value of max_solr_concurrency_per_core in the dse.yaml
file. In the same file, use the back_pressure_threshold_per_core option
to define the number of buffered asynchronous index updates per Solr core
before back pressure is activated. The default value
is 1000 times the number of available CPU cores.
-
Restart DataStax Enterprise to use live indexing with the increased heap
size.
- Optional:
To filter on a given range, use the {!rtrange} prefix in the query, for example:
_query_:"{!rtrange}tint:[0 TO 5}" OR _query_:"{!rtrange}tint:[-10 TO -5}"
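The solrconfig.xml changes described in the procedure above can be pictured together in one fragment. This is a minimal sketch, not a complete file: the placement of <ramBufferSizeMB> inside <indexConfig> and of <autoSoftCommit> inside <updateHandler> follows the usual Solr layout, and the class attribute on <updateHandler> is an assumption about what your generated solrconfig.xml already contains. Keep whatever other settings your file has.

```xml
<config>
  <!-- other settings omitted from this sketch -->

  <indexConfig>
    <!-- Turns on live (RT) indexing for this Solr core -->
    <rt>true</rt>
    <!-- Larger RAM buffer for faster indexing; must fit within the JVM heap -->
    <ramBufferSizeMB>2000</ramBufferSizeMB>
  </indexConfig>

  <!-- Assumed: the update handler element already present in the file -->
  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- Soft commit every 100 ms so recently indexed data becomes searchable -->
    <autoSoftCommit>
      <maxTime>100</maxTime>
    </autoSoftCommit>
  </updateHandler>
</config>
```

The 2000 MB buffer in this sketch is the value from the procedure; it only makes sense together with the larger heap recommended in the heap-size step.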