Collecting slow queries

Steps to identify slow queries on a cluster to easily find and tune poorly performing queries using the DataStax Enterprise Performance Service.

The node_slow_log table collects information about slow queries on a node and retains query information of long-running CQL statements to help you identify slow queries on a cluster to find and tune poorly performing queries. Collecting slow query logs is enabled by default. You can also use OpsCenter to view, troubleshoot, and trace Slow Queries.

Pending inserts into the node_slow_log might still be processed after the service has been disabled. You can enable and disable the service. After the service is disabled, the logging of queries that take longer than the specified threshold is stopped. However, disabling the logging does not flush the pending write queue, a background thread eventually processes everything.

The location of the dse.yaml file depends on the type of installation:
Installer-Services /etc/dse/dse.yaml
Package installations /etc/dse/dse.yaml
Installer-No Services install_location/resources/dse/conf/dse.yaml
Tarball installations install_location/resources/dse/conf/dse.yaml

Procedure

  1. By default, collection is enabled for statements that are issued when the query exceeds a specified time threshold.
    • To permanently enable collecting information on slow queries, edit the dse.yaml file and uncomment the cql_slow_log_options parameters, and define values for the CQL slow log settings:
      cql_slow_log_options
      cql_slow_log_options:
          enabled: true
          threshold: 2000.0
          minimum_samples: 100
          ttl_seconds: 259200
      Report distributed sub-queries for Solr (query executions on individual shards) that take longer than a specified period of time. See Collecting slow queries.
    • To temporarily disable collecting information on slow queries that exceeded the threshold of 2000 milliseconds:
      $ dsetool perf cqlslowlog disable
      
    • To temporarily change the threshold to collect information on 5% of the slowest queries:
      $ dsetool perf cqlslowlog 95.0
      After you collect information using this temporarily set threshold, you can run a script to view queries that took longer with this threshold than the previously set threshold. For example:
      $ cqlsh 
      cqlsh> use dse_perf; 
      cqlsh:dse_perf> select * from node_slow_log;
      ...
  2. You can export slow queries using the CQL copy command:
    cqlsh:dse_perf> COPY node_slow_log ( date, commands, duration ) TO 'slow_queries.csv' WITH HEADER = true;