Shuffling shards to balance the load
Strategies for load balancing.
To balance the load in a distributed environment, you can choose from several strategies for shuffling the shards to suit your needs. The shard shuffling strategy specifies how one node is selected over others for reading the Solr data. There are several methods for selecting the strategy. All methods involve setting the shard.shuffling.strategy parameter to one of the following values:
Possible values of shard.shuffling.strategy
- host
Shards are selected based on the host that received the query.
- query
Shards are selected based on the query string.
- host_query
Shards are selected based on the combination of host and query string.
- random
A different random set of shards is selected with each request (default).
- SEED
Selects the same shard from one query to another.
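To make the difference between these values concrete, the following toy model shows which inputs could determine the shard order under each strategy. It is only a sketch and not how DSE Search implements shuffling: the function names, the hashing step, and the replica names are all hypothetical.

```python
import hashlib
import random


def shuffle_key(strategy, host, query, seed=None):
    """Return the input that fixes the shuffle order, or None for a fresh random order.

    Hypothetical helper for illustration only.
    """
    if strategy == "host":
        return host
    if strategy == "query":
        return query
    if strategy == "host_query":
        return f"{host}|{query}"
    if strategy == "SEED":
        if seed is None:
            raise ValueError("SEED requires a seed value")
        return seed
    if strategy == "random":
        return None
    raise ValueError(f"unknown strategy: {strategy}")


def order_replicas(replicas, strategy, host, query, seed=None):
    """Order replicas deterministically for keyed strategies, randomly otherwise."""
    ordered = sorted(replicas)
    key = shuffle_key(strategy, host, query, seed)
    if key is None:
        random.shuffle(ordered)  # "random": a new order on every call
        return ordered
    # Derive a stable integer from the key so the same key gives the same order.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    random.Random(int.from_bytes(digest[:8], "big")).shuffle(ordered)
    return ordered
```

In this model, `host` ignores the query text, `query` ignores the receiving host, and `SEED` ignores both, which mirrors the descriptions in the list above.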
Methods for selecting shard shuffling strategy
- Append shard.shuffling.strategy=<strategy> to the HTTP API query. For example:
http://localhost:8983/solr/wiki.solr/select?q=title:natio*&shard.shuffling.strategy=host
Issuing this query determines the shard shuffling strategy for this query only.
- Create a dse-search.properties file and POST it to Solr.
For example:
- Create the dse-search.properties file with the following contents:
shard.shuffling.strategy=query
- Post the file to DSE Search/Solr. For example:
curl -v --data-binary @dse-search.properties http://localhost:8983/solr/resource/wiki.solr/dse-search.properties
Posting the file determines the shard shuffling strategy for all queries to the given Solr core. The strategy is propagated to all nodes and saved in the Solr core metadata.
- Set the following parameters to use the SEED strategy:
- Pass shard.shuffling.strategy=SEED as a request parameter.
- Specify a seed value, such as an IP address or any string, using the shard.shuffling.seed parameter. When you reuse the same seed value between queries on a stable cluster, the same shard strategy is in effect. Every time you pass the same string, the same list of shards is queried, regardless of the target node you actually query; if you change the string, a different list of shards is queried.
- Verify that the strategy was maintained by passing the shards.info=true request parameter. For example:
curl "http://localhost:8983/solr/demo.solr/select/?q=text:search&shards.info=true&shard.shuffling.strategy=SEED&shard.shuffling.seed=192.168.0.1&rows=0"
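The three methods above can also be scripted. The sketch below builds the same request URLs and the properties POST with the Python standard library; the helper names (`select_url`, `properties_request`) are hypothetical, and the host, port, and core names are simply those used in the examples above. Nothing is sent until you pass the result to `urllib.request.urlopen`, which requires a running node.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Host and port copied from the examples above; adjust for your cluster.
BASE_URL = "http://localhost:8983/solr"


def select_url(core, query, strategy, seed=None, shards_info=False, rows=None):
    """Build a select URL that sets the shuffling strategy for one query."""
    params = {"q": query}
    if shards_info:
        params["shards.info"] = "true"
    params["shard.shuffling.strategy"] = strategy
    if strategy == "SEED":
        if seed is None:
            raise ValueError("the SEED strategy needs shard.shuffling.seed")
        params["shard.shuffling.seed"] = seed
    if rows is not None:
        params["rows"] = rows
    # Keep ':' and '*' readable, as in the documented example URLs.
    return f"{BASE_URL}/{core}/select?{urlencode(params, safe=':*')}"


def properties_request(core, strategy):
    """Build (but do not send) the POST that sets the strategy for a whole core."""
    body = f"shard.shuffling.strategy={strategy}\n".encode("utf-8")
    url = f"{BASE_URL}/resource/{core}/dse-search.properties"
    return Request(url, data=body, method="POST")


if __name__ == "__main__":
    print(select_url("wiki.solr", "title:natio*", "host"))
    print(select_url("demo.solr", "text:search", "SEED",
                     seed="192.168.0.1", shards_info=True, rows=0))
    print(properties_request("wiki.solr", "query").full_url)
```

Building the URL from a dictionary keeps the seed and strategy together, so a script that reuses one seed value across queries cannot accidentally drop the strategy parameter.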
Shuffling does not always result in the node selection you might expect. For example, using a replication factor of 3 with six nodes, the best and only solution is a two-shard solution: half of the data is read from the originator node and half from another node. A three-shard solution would be inefficient.
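The arithmetic behind the six-node example can be checked with a simplified ring model. This sketch assumes evenly sized token ranges and that each range is replicated to its owner and the next nodes on the ring; it illustrates why two nodes suffice and is not the selection logic DSE Search actually runs. The function names are hypothetical.

```python
import math


def ranges_held(node, nodes, rf):
    """Token ranges stored on a node: its own range plus the rf - 1 preceding ones."""
    return {(node - k) % nodes for k in range(rf)}


def cover_from(originator, nodes, rf):
    """Pick nodes, starting at the originator, until every range is covered."""
    chosen = []
    covered = set()
    node = originator
    while len(covered) < nodes:
        chosen.append(node)
        covered |= ranges_held(node, nodes, rf)
        node = (node + rf) % nodes  # next node whose ranges do not overlap
    return chosen


if __name__ == "__main__":
    picked = cover_from(originator=0, nodes=6, rf=3)
    share = len(ranges_held(0, 6, 3)) / 6
    print(picked, share, math.ceil(6 / 3))
```

With six nodes and a replication factor of 3, each node holds three of the six ranges, so the originator and one other node together cover all of the data. Adding a third node would only reread ranges that are already covered.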