DSE Search query best practices

Best practices for DSE Search queries.

DataStax recommends following these best practices for running queries in DSE Search:
  • Avoid querying nodes that are indexing.

    When choosing nodes to respond to a query, DSE Search ranks nodes that are not performing Solr indexing higher than nodes that are. If the only nodes that can satisfy the query are indexing, the query does not fail, but it might return partial results.

  • Use CQL to run Solr queries.

    You can develop CQL-centric applications supporting full-text search without having to work with Solr-specific APIs. Perform all data manipulation with CQL, except for deleting by query.

  • When vnodes are not in use, distributed queries in DSE Search are most efficient when the number of nodes in the queried data center (DC) is a multiple of the replication factor (RF) in that DC.
  • Avoid using too many terms in a single query, as in this example:
    SELECT request_id, store_id FROM store_search.transaction_search WHERE solr_query='{"q":"*:*","shards.failover":true,"shards.tolerant":false,"fq":"store_id:store1a store_id:store2b store_id:store2c ... store_id:store19987d"}'
    Instead, use a terms filter query, which passes the values to match as a single list rather than as separate Boolean clauses.
    Note: Limitations and known Apache Solr issues apply to DSE Search queries, including the maxBooleanClauses limit of 1024 (see SOLR-4586).
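
The following is a minimal sketch of how an application might build a terms filter query in place of the long store_id list shown above. It assumes that the Solr terms query parser syntax ({!terms f=<field>} followed by comma-separated values) is available in your DSE Search version. The helper names build_terms_filter_query and to_cql are hypothetical and are not part of any DataStax API. The table and column names are taken from the example above.

```python
import json


def build_terms_filter_query(field, values, query="*:*"):
    """Return a solr_query JSON string that filters `field` by a list of values.

    Assumption: the Solr terms query parser is used, which by default
    splits the value list on commas.
    """
    for value in values:
        # A comma inside a value would be read as a separator by the parser.
        if "," in value:
            raise ValueError("value contains a comma: %r" % value)
    fq = "{!terms f=" + field + "}" + ",".join(values)
    return json.dumps({"q": query, "fq": fq})


def to_cql(solr_query):
    """Embed the JSON in a CQL statement, doubling single quotes as CQL requires."""
    escaped = solr_query.replace("'", "''")
    return (
        "SELECT request_id, store_id "
        "FROM store_search.transaction_search "
        "WHERE solr_query='" + escaped + "'"
    )


# Hypothetical store IDs; a real application would supply its own list.
store_ids = ["store1a", "store2b", "store2c"]
solr_query = build_terms_filter_query("store_id", store_ids)
statement = to_cql(solr_query)
print(statement)
```

Because the values travel as one list inside a single filter, the query does not add one Boolean clause per store ID, which is the pattern that runs into the maxBooleanClauses limit.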