Collecting table histogram diagnostics

Steps to enable collecting table histogram diagnostics using the DataStax Enterprise Performance Service.

The following diagnostic tables collect histogram data at the table level:
  • cell_count_histograms

    Cell count per partition.

  • partition_size_histograms

    Partition size.

  • read_latency_histograms

    Read latency.

  • sstables_per_read_histograms

    SSTables per read.

  • write_latency_histograms

    Write latency.

Note: These tables partially duplicate the information obtained with the nodetool tablehistograms utility. The major difference is that tablehistograms output covers only recent data, whereas the diagnostic tables contain lifetime data. Additionally, each time nodetool tablehistograms is run for a table, that table's histogram values are reset, whereas the data in the diagnostic histogram tables is not reset.

Procedure

To enable the collection of table histogram data using the DataStax Enterprise Performance Service:

  1. Edit the dse.yaml file.
    The location of the dse.yaml file depends on the type of installation:
    Installation type        Location
    Installer-Services       /etc/dse/dse.yaml
    Package installations    /etc/dse/dse.yaml
    Installer-No Services    install_location/resources/dse/conf/dse.yaml
    Tarball installations    install_location/resources/dse/conf/dse.yaml
  2. In the dse.yaml file, set the enabled option for histogram_data_options to true.
    # Column Family Histogram data tables options
    histogram_data_options:
      enabled: true
      refresh_rate_ms: 10000
      retention_count: 3
  3. Optional: To control how often the statistics are refreshed, increase or decrease the refresh_rate_ms parameter.

    The refresh_rate_ms parameter specifies the length of the sampling period in milliseconds, that is, how often this data is updated.

  4. Optional: To control the number of complete histograms kept in the tables at any one time, change the retention_count parameter.
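
As an illustration of steps 3 and 4, the following fragment refreshes the histograms once per minute and keeps five complete histograms, instead of the values shown in step 2. The values 60000 and 5 are hypothetical and chosen only for this example. Select values that suit your own cluster.

```yaml
# Column Family Histogram data tables options
histogram_data_options:
  enabled: true
  # Hypothetical value: refresh the histogram data every 60 seconds
  refresh_rate_ms: 60000
  # Hypothetical value: keep 5 complete histograms in each table
  retention_count: 5
```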