DSE Search query best practices

Best practices for DSE Search queries.

DataStax recommends following these best practices for running queries in DSE Search:
  • Avoid querying nodes that are indexing.

    When routing queries, DSE Search ranks nodes that are not performing Solr indexing above nodes that are. If the only nodes that can satisfy a query are indexing, the query does not fail, but it may return partial results.

  • Use CQL to run Solr queries.

    You can develop CQL-centric applications supporting full-text search without having to work with Solr-specific APIs. Perform all data manipulation with CQL, except for deleting by query.

  • Avoid using too many terms in a query, as in this example:
    SELECT request_id, store_id FROM store_search.transaction_search WHERE solr_query='{"q":"*:*","shards.failover":true,"shards.tolerant":false,"fq":"store_id:store1a store_id:store2b store_id:store2c ... store_id:store19987d"}'
    Instead, use a terms filter query, which passes the values to the Solr terms query parser as a single comma-separated list.
  • Prefer frozen collections over non-frozen collections when the collections are written once, seldom or never updated, and query latency is the priority.
    For example:
    CREATE TABLE foo (id text, values frozen<set<text>>, PRIMARY KEY (id));
    CREATE TYPE name (first text, last text);
    CREATE TABLE tableWithList (id text, names frozen<list<frozen<name>>>, PRIMARY KEY (id));
  • Distributed queries in DSE Search are most efficient when the number of nodes in the queried data center (DC) is a multiple of the replication factor (RF) in that DC.
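
The terms filter query recommended above for long lists of store IDs can be sketched as follows. This is a minimal illustration, not DataStax's reference code: the helper names terms_filter_query and select_statement are hypothetical, the table and column names are taken from the example query above, and the shards.* parameters from that query are left out for brevity. It assumes the standard Solr terms query parser syntax, {!terms f=<field>}<value>,<value>,..., with the default comma separator.

```python
import json


def terms_filter_query(field, values, q="*:*"):
    """Build a solr_query JSON string that filters `field` with the
    Solr terms query parser instead of one clause per value.

    Hypothetical helper for illustration only.
    """
    if not values:
        raise ValueError("at least one value is required")
    for value in values:
        # The terms parser splits on commas by default, so a comma
        # inside a value would be read as two separate values.
        if "," in value:
            raise ValueError("value contains a comma: %r" % value)
    fq = "{!terms f=%s}%s" % (field, ",".join(values))
    # Compact separators keep the JSON free of extra whitespace.
    return json.dumps({"q": q, "fq": fq}, separators=(",", ":"))


def select_statement(solr_query):
    """Embed the JSON in a CQL SELECT (hypothetical helper).

    CQL string literals escape a single quote by doubling it.
    """
    escaped = solr_query.replace("'", "''")
    return (
        "SELECT request_id, store_id "
        "FROM store_search.transaction_search "
        "WHERE solr_query='" + escaped + "'"
    )


# Three IDs from the example above; a real list could be far longer.
store_ids = ["store1a", "store2b", "store2c"]
print(select_statement(terms_filter_query("store_id", store_ids)))
```

The filter stays a single fq clause however many IDs are supplied, which is the point of the recommendation. Values that themselves contain commas are rejected here rather than escaped, to keep the sketch simple.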