Handling inconsistencies in query results

Consider session stickiness, subrange node repair, and follow best practices for soft commit points on different replica nodes.

DSE Search implements an efficient, highly available distributed search algorithm on top of Cassandra, which tries to select the minimum number of replica nodes required to cover all token ranges, and also avoid hot spots. Consequently, due to the eventually consistent nature of Cassandra, some replica nodes might not have received or might not have indexed the latest updates yet. This situation might cause DSE Search to return inconsistent results (different numFound counts) between queries due to different replica node selections. This behavior is intrinsic to how highly available distributed systems work, as described in the ACM article, "Eventually Consistent" by Werner Vogels. Most of the time, eventual consistency is not an issue, yet DSE Search implements session stickiness to guarantee that consecutive queries will hit the same set of nodes on an healthy, stable cluster, hence providing monotonic results. Session stickiness works by adding a session seed to request parameters as follows:

shard.shuffling.strategy=SEED
shard.shuffling.seed=<session id>

In the event of unstable clusters with missed updates due to failures or network partitions, consistent results can be achieved by repairing nodes using the subrange repair method.

Finally, another minor source of inconsistencies is caused by different soft commit points on different replica nodes: A given item might be indexed and committed on a given node, but not yet on its replica. This situation is primarily a function of the load on each node, hence DataStax recommends the following practices:

  • Evenly balancing read/write load between nodes
  • Properly tuning soft commit time and async indexing concurrency
  • Configuring back pressure in the dse.yaml file