Parallelizing large Cassandra row reads

Configure DSE Search/Solr to parallelize the retrieval of a large number of rows to improve performance.

When a query retrieves a large number of rows, DSE Search/Solr can read those rows from Cassandra in parallel instead of one at a time. To enable this, configure the queryResponseWriter in the solrconfig.xml file as follows:

<queryResponseWriter name="javabin" class="solr.BinaryResponseWriter">
  <str name="resolverFactory">com.datastax.bdp.search.solr.response.ParallelRowResolver$Factory</str>
</queryResponseWriter>

By default, the parallel row resolver uses up to x threads to execute parallel reads, where x is the number of CPUs. Each thread sequentially reads one batch of rows, and the batch size equals the total number of requested rows divided by the number of CPUs:

Rows read per thread = Total requested rows / Number of CPUs
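
As a rough illustration of the formula above, the following Python sketch computes the default batch size for a given request. The function name is hypothetical, and rounding up when the division is not exact is an assumption; the behavior for remainders is not specified here.

```python
import math


def default_batch_size(total_requested_rows: int, cpus: int) -> int:
    """Rows each thread reads by default: total requested rows / number of CPUs.

    Rounding up is an assumption made for this sketch; how the resolver
    handles a remainder is not documented in this section.
    """
    if total_requested_rows < 0 or cpus < 1:
        raise ValueError("rows must be >= 0 and cpus must be >= 1")
    return math.ceil(total_requested_rows / cpus)


# A request for 1000 rows on an 8-CPU node gives each thread 125 rows.
print(default_batch_size(1000, 8))
```

For example, a request for 1000 rows on a node with 8 CPUs yields a batch of 125 rows per thread.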

You can change the batch size per request by specifying the cassandra.readBatchSize HTTP request parameter. Smaller batches use more parallelism, while larger batches use less.
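
A minimal sketch of how a client might set this parameter when building a query URL. The host, port, core name (mykeyspace.mytable), and the helper function are hypothetical placeholders chosen for illustration; only the cassandra.readBatchSize parameter and the javabin response writer come from this section.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, core: str, query: str,
                     rows: int, read_batch_size: int) -> str:
    """Build a Solr select URL that overrides the read batch size.

    base_url and core are placeholders; adjust them to your deployment.
    """
    params = {
        "q": query,
        "rows": rows,
        "wt": "javabin",  # the response writer configured in solrconfig.xml
        "cassandra.readBatchSize": read_batch_size,
    }
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Hypothetical example: request 1000 rows, read in batches of 50.
url = build_select_url("http://localhost:8983", "mykeyspace.mytable",
                       "*:*", rows=1000, read_batch_size=50)
print(url)
```

Because the parameter is sent with each request, different queries against the same core can use different batch sizes.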