Shard routing for distributed queries
On DSE Search nodes, the shard selection algorithm for distributed queries uses a series of criteria to route sub-queries to the nodes most capable of handling them. The best node is determined by a chain of node comparisons. Selection occurs in the following order using these criteria:
-
Is node active?
Preference to active nodes.
-
Is the requested core indexing, or has it failed to index?
If node is not in either of these states, select the best node.
-
Node health rank, an exponentially increasing number between 0 and 1. It describes the health node so if all the previous criteria is equal, a node with a better score is chosen first. This node health rank value is exposed as a JMX metrics under ShardRouter.
Node health rank is a comparison of uptime and dropped mutations:
node health = <uptime> / (1 + <drop_rate>)
where:
-
drop_rate = the rate of dropped mutations per minute over a sliding window of configurable length.
To configure the historic time window, set dropped_mutation_window_minutes in
dse.yaml
.A high-dropped mutation rate indicates an overloaded node. For example, database insertions, and updates.
-
uptime = a score between 0 and 1 that weights recent downtime more heavily than less recent downtime.
-
-
Is the node close to the node that is issuing the query?
Node selection uses endpoint snitch proximity. Give preference to closer nodes.
After using these criteria, node selection is random.
To check on the shard router, add shards.info=true
to the search query.
The ShardRouter MBean, not present in open source Solr, provides information about how DSE search routes queries.