Parallelizing large row reads

For performance, you can configure DSE Search to parallelize the retrieval of a large number of rows. Configure the queryResponseWriter in the search index configuration as follows:

<queryResponseWriter name="javabin" class="solr.BinaryResponseWriter">
  <str name="resolverFactory">com.datastax.bdp.search.solr.response.ParallelRowResolver$Factory</str>
</queryResponseWriter>

By default, the parallel row resolver uses up to as many threads as the node has CPUs to execute parallel reads. Each thread sequentially reads a batch of rows equal to the total requested rows divided by the number of CPUs:

Rows read per thread = Total requested rows / Number of CPUs

For example, a query that requests 1,000 rows on a node with 8 CPUs is split into 8 batches of 125 rows, read in parallel.

You can change the batch size per request by specifying the cassandra.readBatchSize HTTP request parameter. Smaller batches increase parallelism; larger batches reduce it.
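
The parameter can be appended directly to a Solr HTTP query URL (for example, &cassandra.readBatchSize=250) or set programmatically. The following SolrJ sketch is illustrative only; the host, port, search core name (ks.tbl), requested row count, and batch size of 250 are assumed values, not values prescribed by this documentation.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class ReadBatchSizeExample {
    public static void main(String[] args) throws Exception {
        // Assumed endpoint: a DSE Search core named "ks.tbl" on localhost.
        HttpSolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr/ks.tbl").build();

        SolrQuery query = new SolrQuery("*:*");
        query.setRows(10000);                       // large result set to retrieve
        query.set("cassandra.readBatchSize", 250);  // per-thread batch size (assumed value)

        QueryResponse response = client.query(query);
        System.out.println("Rows returned: " + response.getResults().size());

        client.close();
    }
}

A smaller cassandra.readBatchSize spreads the same request across more batches, and therefore more threads, at the cost of more per-batch overhead.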
