Handling inconsistencies in query results 

Consider session stickiness, subrange node repair, and follow best practices for soft commit points on different replica nodes.

DSE Search implements an efficient, highly available distributed search algorithm on top of Cassandra, which tries to select the minimum number of replica nodes required to cover all token ranges, and also avoid hot spots. Consequently, due to the eventually consistent nature of Cassandra, some replica nodes might not have received or might not have indexed the latest updates yet. This situation might cause DSE Search to return inconsistent results (different numFound counts) between queries due to different replica node selections. This behavior is intrinsic to how highly available distributed systems work, as described in the ACM article, "Eventually Consistent" by Werner Vogels. Most of the time, eventual consistency is not an issue, yet DSE Search implements session stickiness to guarantee that consecutive queries will hit the same set of nodes on a healthy, stable cluster, hence providing monotonic results. Session stickiness works by adding a session seed to request parameters as follows:

shard.shuffling.seed=<session id>

In the event of unstable clusters with missed updates due to failures or network partitions, consistent results can be achieved by repairing nodes using the subrange repair method.

Finally, another minor source of inconsistencies is caused by different soft commit points on different replica nodes: A given item might be indexed and committed on a given node, but not yet on its replica. This situation is primarily a function of the load on each node. Implement the following best practices:

  • Evenly balance read/write load between nodes
  • Properly tune soft commit time and async indexing concurrency
  • Configure back pressure in the dse.yaml file

About back pressure

To maximize insert throughput, DSE/Solr buffers insert requests from Cassandra so that application insert requests can be acknowledged as quickly as possible. However, if too many requests accumulate in the buffer (a configurable setting), DSE/Solr pauses or blocks incoming requests until DSE/Solr catches up with the buffered requests. In extreme cases, that pause causes a timeout to the application. See "Multi-threaded indexing in DSE Search."