Collecting table histogram diagnostics

Enable the histogram_data_options parameter in dse.yaml.

The following diagnostic tables collect histogram data at the table level:
  • cell_count_histograms

    Cell count per partition.

  • partition_size_histograms

    Partition size.

  • read_latency_histograms

    Read latency.

  • sstables_per_read_histograms

    SSTables per read.

  • write_latency_histograms

    Write latency.

Note: These tables partly duplicate the information reported by the nodetool cfhistograms utility. The main difference is that cfhistograms reports recent data, whereas the diagnostic tables contain lifetime data. In addition, the cfhistograms values are reset each time nodetool cfhistograms is run for a column family; the data in the diagnostic histogram tables is not reset.
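
As a rough illustration of how the five tables listed above might be inspected, the following Python sketch builds a simple CQL SELECT statement for each one. The dse_perf keyspace name and the histogram_query helper are assumptions made for this example and are not stated on this page; confirm the keyspace on your cluster (for example, with DESCRIBE KEYSPACES in cqlsh) before running the generated statements.

```python
# Assumption: the diagnostic tables live in a keyspace named dse_perf.
# Verify this on your own cluster before using the generated statements.
KEYSPACE = "dse_perf"

# Table names and descriptions as listed in this topic.
HISTOGRAM_TABLES = {
    "cell_count_histograms": "Cell count per partition",
    "partition_size_histograms": "Partition size",
    "read_latency_histograms": "Read latency",
    "sstables_per_read_histograms": "SSTables per read",
    "write_latency_histograms": "Write latency",
}


def histogram_query(table, limit=10):
    """Build a CQL statement that samples rows from one histogram table.

    histogram_query is a hypothetical helper for illustration only.
    """
    if table not in HISTOGRAM_TABLES:
        raise ValueError(f"unknown histogram table: {table}")
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM {KEYSPACE}.{table} LIMIT {limit};"


if __name__ == "__main__":
    # Print one statement per table; paste the output into cqlsh.
    for name, description in HISTOGRAM_TABLES.items():
        print(f"-- {description}")
        print(histogram_query(name))
```

The statements are only generated, not executed, so the sketch runs without a cluster or a driver installed.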

Procedure

To enable the collection of table histogram data:

  1. Edit the dse.yaml file.
    The location of the dse.yaml file depends on the type of installation:
      Installer-Services:     /etc/dse/dse.yaml
      Package installations:  /etc/dse/dse.yaml
      Installer-No Services:  install_location/resources/dse/conf/dse.yaml
      Tarball installations:  install_location/resources/dse/conf/dse.yaml
  2. In the dse.yaml file, set the enabled option for histogram_data_options to true.
    # Column Family Histogram data tables options
    histogram_data_options:
      enabled: true
      refresh_rate_ms: 10000
      retention_count: 3
  3. Optional: To control how often the statistics are refreshed, increase or decrease the refresh_rate_ms parameter.

    The refresh_rate_ms parameter specifies the length of the sampling period in milliseconds, that is, how often this data is updated.

  4. Optional: To control the number of complete histograms kept in the tables at any one time, change the retention_count parameter.
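
After editing the file, a quick sanity check of the histogram_data_options block can catch typos before the change takes effect. The following Python sketch is one possible way to do that, using only the standard library. The function names are hypothetical, the parser handles only the flat "key: value" layout shown in step 2 (not general YAML), and the rule that refresh_rate_ms and retention_count must be positive integers is an assumption of this example rather than a documented constraint.

```python
def read_histogram_options(yaml_text):
    """Return the key/value pairs under histogram_data_options as strings.

    Deliberately minimal: handles only indented "key: value" lines
    directly under the top-level histogram_data_options key.
    """
    options = {}
    in_block = False
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments and trailing spaces
        if not line.strip():
            continue
        if not line.startswith((" ", "\t")):
            # An unindented line is a top-level key: it starts or ends the block.
            in_block = line.strip() == "histogram_data_options:"
            continue
        if in_block:
            key, _, value = line.strip().partition(":")
            options[key.strip()] = value.strip()
    return options


def check_histogram_options(options):
    """Return a list of problems; an empty list means the block looks sane."""
    problems = []
    if options.get("enabled", "").lower() != "true":
        problems.append("enabled is not true")
    # Assumption: both numeric options should be positive integers.
    for name in ("refresh_rate_ms", "retention_count"):
        value = options.get(name, "")
        if not value.isdigit() or int(value) <= 0:
            problems.append(f"{name} should be a positive integer")
    return problems


# The block from step 2, used here as sample input.
SAMPLE = """\
# Column Family Histogram data tables options
histogram_data_options:
  enabled: true
  refresh_rate_ms: 10000
  retention_count: 3
"""

if __name__ == "__main__":
    opts = read_histogram_options(SAMPLE)
    issues = check_histogram_options(opts)
    print(opts)
    print(issues if issues else "histogram_data_options looks OK")
```

To check a real file, read its contents (for example, from /etc/dse/dse.yaml on a package installation) and pass the text to read_histogram_options in place of SAMPLE.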