Parallelizing large row reads
To improve performance, you can configure DSE Search to parallelize the retrieval of a large number of rows. Configure the queryResponseWriter in the search index as follows:
<queryResponseWriter name="javabin" class="solr.BinaryResponseWriter">
  <str name="resolverFactory">com.datastax.bdp.search.solr.response.ParallelRowResolver$Factory</str>
</queryResponseWriter>
By default, the parallel row resolver uses up to x threads to execute parallel reads, where x is the number of CPUs. Each thread sequentially reads a batch of rows equal to the total requested rows divided by the number of CPUs:
Rows read per thread = Total requested rows / Number of CPUs
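As a rough illustration of the formula above, the following Python sketch computes the default batch size for a given request. The function name default_batch_size is hypothetical, and rounding up when the division is not exact is an assumption; the text does not say how a remainder is handled.

```python
def default_batch_size(total_rows: int, cpus: int) -> int:
    """Rows each thread reads by default: total requested rows / number of CPUs."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    # Assumption: round up so that every requested row falls into some batch.
    return -(-total_rows // cpus)


# Example: 1000 requested rows on a node with 8 CPUs.
print(default_batch_size(1000, 8))  # 125
```

On a node with 8 CPUs, a request for 1000 rows would therefore be split into batches of 125 rows, one per thread.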
You can change the batch size for an individual request by specifying the cassandra.readBatchSize HTTP request parameter.
Smaller batches use more parallelism, while larger batches use less.
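The sketch below shows one way a client might attach the cassandra.readBatchSize parameter to a search request, and how the batch size changes the number of batches available to read in parallel. The host, port, and my_keyspace.my_table core name are placeholders, the helper names are hypothetical, and the batch count is plain arithmetic rather than a description of how DSE Search schedules its threads.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, rows: int, read_batch_size: int) -> str:
    """Build a select URL that sets the batch size for this request only."""
    params = {
        "q": "*:*",
        "rows": rows,
        "cassandra.readBatchSize": read_batch_size,
    }
    return f"{base_url}/select?{urlencode(params)}"


def batch_count(rows: int, read_batch_size: int) -> int:
    """Number of batches needed to cover the requested rows (rounded up)."""
    return -(-rows // read_batch_size)


# Placeholder endpoint; substitute your own node and search index name.
url = build_select_url("http://localhost:8983/solr/my_keyspace.my_table", 1000, 100)
print(url)

# Smaller batches yield more batches to spread across threads.
print(batch_count(1000, 100))  # 10
print(batch_count(1000, 500))  # 2
```

With 1000 requested rows, a batch size of 100 produces ten batches, while a batch size of 500 produces only two, which is why smaller batches allow more parallelism.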