Parallelizing large row reads
For performance, you can configure DSE Search to parallelize the retrieval of a large number of rows. Configure the queryResponseWriter in the search index as follows:
<queryResponseWriter name="javabin" class="solr.BinaryResponseWriter">
  <str name="resolverFactory">com.datastax.bdp.search.solr.response.ParallelRowResolver$Factory</str>
</queryResponseWriter>
By default, the parallel row resolver uses up to x threads to execute parallel reads, where x is the number of CPUs. Each thread sequentially reads a batch of rows equal to the total requested rows divided by the number of CPUs:
Rows read per thread = Total requested rows / Number of CPUs
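As a rough illustration of the formula above, the following Python sketch computes the default batch size for a request. The function name `default_batch_size` is hypothetical, and rounding up when the division is not exact is an assumption; the formula states only the division.

```python
import math


def default_batch_size(total_rows: int, cpu_count: int) -> int:
    """Approximate number of rows each thread reads by default.

    Assumption: the result is rounded up when the rows do not divide
    evenly. The documented formula only gives the plain division.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return math.ceil(total_rows / cpu_count)


# 10,000 requested rows on an 8-CPU node
print(default_batch_size(10_000, 8))  # 1250
```

In this example, each of the 8 threads would read one batch of 1,250 rows.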
You can change the batch size for an individual request by specifying the cassandra.readBatchSize HTTP request parameter.
Smaller batches use more parallelism, while larger batches use less.
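The following Python sketch shows one way a request URL carrying the cassandra.readBatchSize parameter might be built. The host, port, the demo_ks.demo_table index name, the /select path, and the helper name `build_search_url` are assumptions for illustration only; substitute the values for your own cluster.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, query: str, rows: int, read_batch_size: int) -> str:
    """Build a search URL that sets the batch size for this request only.

    base_url is a hypothetical search index endpoint; it is not taken
    from the documentation.
    """
    params = {
        "q": query,
        "rows": rows,
        "wt": "javabin",  # matches the response writer name configured above
        "cassandra.readBatchSize": read_batch_size,
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical endpoint: request 10,000 rows in batches of 500
url = build_search_url(
    "http://localhost:8983/solr/demo_ks.demo_table", "*:*", 10_000, 500
)
print(url)
```

With these values the 10,000 requested rows are split into 20 batches of 500 rows, instead of one batch per CPU.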