Retrieve metric data

Use the metric retrieval methods to retrieve performance metrics at the cluster, node, and table levels.

You can also use existing metric data to forecast future data points for a specific metric. More information on forecasting is available here.

Metric retrieval	Method
Retrieve cluster-wide metrics.	`GET /{cluster_id}/cluster-metrics/{dc}/{metric}`
Retrieve cluster-wide metrics about a device.	`GET /{cluster_id}/cluster-metrics/{dc}/{metric}/{device}`
Retrieve cluster-wide metrics about a table.	`GET /{cluster_id}/cluster-metrics/{dc}/{ks_name}/{cf_name}/{metric}`
Retrieve metrics about a node.	`GET /{cluster_id}/metrics/{node_ip}/{metric}`
Retrieve node-specific metrics about a device.	`GET /{cluster_id}/metrics/{node_ip}/{metric}/{device}`
Retrieve node-specific metrics about a table.	`GET /{cluster_id}/metrics/{node_ip}/{ks_name}/{cf_name}/{metric}`
Retrieve a forecast for a cluster-wide metric.	`GET /{cluster_id}/cluster-metrics/{dc}/{metric}`
New way to retrieve all types of metrics.	`GET /{cluster_id}/new-metrics`

You can choose from a large number of metrics keys to pass with these methods, making retrieval of a wide spectrum of performance information possible.

Control the metric data output

You can also use the following query parameters with these methods to control the output:

Query parameter	Description
start	(optional) A timestamp in seconds indicating the beginning of the time range to fetch. When omitted, this defaults to one day before the `end` parameter.
end	(optional) A timestamp in seconds indicating the end of the time range to fetch. When omitted, this defaults to the current time.
step	(optional) The resolution of the data points for the metric. Valid input options are: 1, 5, 120, or 1440 minutes; corresponding output intervals are `60`, `300`, `7200`, or `86400` seconds. The default is a 1 minute step.
step	(optional; new-metrics API) The new-metrics API requires the step argument to be specified in seconds rather than minutes, consistent with the return format of that API. Valid inputs in this case are: `60`, `300`, `7200`, and `86400`.
function	(optional) The type of aggregation to perform on the metric: `min`, `max`, or `average`. By default, results are returned for all three types of aggregation.
forecast	(optional) A boolean flag that requests a forecast for the specified time range and step. Past data is used to calculate projected data points in that time range.
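The parameters in the table above can be assembled into a query string programmatically. The following is a minimal sketch, not part of the API itself: the function name `build_query` and the constants are hypothetical, and the validation simply mirrors the values listed in the table for the minute-based methods.

```python
from urllib.parse import urlencode

# Values the parameter table lists as valid (step is in minutes here).
VALID_STEPS_MINUTES = (1, 5, 120, 1440)
VALID_FUNCTIONS = ("min", "max", "average")


def build_query(step=None, start=None, end=None, function=None, forecast=False):
    """Return a query string for the minute-based metric methods.

    Hypothetical helper. Parameters left as None are omitted so that the
    documented server-side defaults apply.
    """
    if step is not None and step not in VALID_STEPS_MINUTES:
        raise ValueError(f"step must be one of {VALID_STEPS_MINUTES} minutes")
    if function is not None and function not in VALID_FUNCTIONS:
        raise ValueError(f"function must be one of {VALID_FUNCTIONS}")
    params = {}
    for name, value in (("step", step), ("start", start),
                        ("end", end), ("function", function)):
        if value is not None:
            params[name] = value
    if forecast:
        params["forecast"] = 1  # the forecast example on this page passes forecast=1
    return urlencode(params)


print(build_query(step=120, start=1335859200, end=1335891600, function="average"))
```

Appending the returned string after a `?` to one of the method paths gives the same kind of URL as the manually built requests in the examples.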

Results of calls to retrieve metrics are returned in the following format:

{
  [<node_ip>: | <device>: | <keyspace.columnfamily>:]
    {
      <function>:
        [
          [<timestamp> <value>],
          ...
        ]
    }
}

By default, the output is metric data points at 60-second intervals over a 24-hour period. Data points are listed in chronological order, starting with the oldest data point first.
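The nested result format described above can be flattened into rows for further processing. This is a sketch under stated assumptions: the helper name `iter_points` is hypothetical, and skipping `null` values as "no data for that interval" is an interpretation of the sample outputs on this page rather than documented behavior.

```python
import json
from datetime import datetime, timezone


def iter_points(response):
    """Yield (series, function, time, value) for every non-null data point.

    `series` is the top-level key (a node IP, device, table, or "Total").
    """
    for series, by_function in response.items():
        for function, points in by_function.items():
            for timestamp, value in points:
                if value is None:  # JSON null; assumed to mean no data
                    continue
                when = datetime.fromtimestamp(timestamp, tz=timezone.utc)
                yield series, function, when, value


# A shortened response in the documented shape.
sample = json.loads(
    '{"Total": {"AVERAGE": [[1335859200, null], [1335866400, 13.376885890960693]]}}'
)
rows = list(iter_points(sample))
for series, function, when, value in rows:
    print(series, function, when.isoformat(), value)
```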

GET /{cluster_id}/cluster-metrics/{dc}/{metric}

Aggregate a metric across multiple nodes in the cluster rather than retrieving data about a single node.

Path arguments:

cluster_id: The ID of a cluster returned from the cluster configuration ID.
dc: The name of the datacenter for the nodes. Use the name all to aggregate a metric across all datacenters.
metric: One of the Cluster metrics keys.

Query params:

parameters: The parameters listed in Control the metric data output.

Returns metric data across multiple nodes in a cluster.

Example:

Get the average write requests per second to the cluster across all datacenters on May 1, 2012 from 8 AM to 5 PM GMT. Show data points at 2-hour (120-minute) intervals.

Using curl’s -G flag to build an HTTP GET request:

curl -G \
  http://127.0.0.1:8888/Test_Cluster/cluster-metrics/all/write-ops \
  -d 'step=120' \
  -d 'start=1335859200' \
  -d 'end=1335891600' \
  -d 'function=average'

Manually building an HTTP GET request:

curl 'http://127.0.0.1:8888/Test_Cluster/cluster-metrics/all/write-ops?step=120&start=1335859200&end=1335891600&function=average'

Output:

Data points at 2-hour (7200 seconds) intervals show the number of write requests per second during business hours on May 1.

{
  "Total": {
    "AVERAGE": [
      [
        1335859200,
        null
      ],
      [
        1335866400,
        13.376885890960693
      ],
      [
        1335873600,
        13.372154712677002
      ],
      [
        1335880800,
        13.365732669830322
      ],
      [
        1335888000,
        13.392115592956543
      ]
    ]
  }
}
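The `start` and `end` values in the request above are Unix timestamps in seconds. One way to derive them from a GMT wall-clock time is sketched below; the helper name `to_epoch_seconds` is hypothetical and only the Python standard library is used.

```python
from datetime import datetime, timezone


def to_epoch_seconds(year, month, day, hour=0, minute=0):
    """Convert a GMT (UTC) wall-clock time to a Unix timestamp in seconds."""
    moment = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
    return int(moment.timestamp())


start = to_epoch_seconds(2012, 5, 1, 8)   # May 1, 2012 8 AM GMT
end = to_epoch_seconds(2012, 5, 1, 17)    # May 1, 2012 5 PM GMT
print(start, end)  # prints 1335859200 1335891600
```

Attaching `timezone.utc` explicitly avoids the local-time interpretation that a naive `datetime` would get.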

GET /{cluster_id}/cluster-metrics/{dc}/{metric}/{device}

Aggregate a disk or network metric, which pertains to a specific device, across multiple nodes in the cluster rather than retrieving data about a single node.

Path arguments:

cluster_id: The ID of a cluster returned from the cluster configuration ID.
dc: The name of the datacenter for the nodes. Use the name all to aggregate a metric across all datacenters.
metric: One of the Cluster metrics keys or Operating system metrics keys.
device: The device to be measured, which the node object lists. Use the name all to measure all devices. For example, when requesting a disk metric, all aggregates metrics from all disk devices.

Query params:

parameters: The parameters listed in Control the metric data output.

Examples of device arguments:

To determine the set of network interfaces that metrics are available for, you can run a query similar to the following:

 curl http://localhost:8888/Test_Cluster/nodes/192.168.1.1/network_interfaces

 ["lo0", "eth0", "eth1"]

In this case, lo0, eth0, and eth1 can all be used.

Disk devices can be discovered in a similar way.

 curl http://localhost:8888/Test_Cluster/nodes/192.168.1.1/devices

{
  "commitlog": "sdb",
  "data": ["sda"],
  "saved_caches": "sda",
  "other": ["sdc"]
}

In this case, any of sda, sdb, or sdc may be used.

Finally, metrics are also captured for disk partitions and filesystems:

 curl http://localhost:8888/Test_Cluster/nodes/192.168.1.1/partitions

{
  "commitlog": "/dev/sdb1",
  "data": ["/dev/sda1"],
  "saved_caches": "/dev/sda1",
  "other": ["/dev/sdc1"]
}

Here, the available partitions are /dev/sda1, /dev/sdb1, and /dev/sdc1. Keep in mind that you will need to URL-encode the items, so /dev/sda1 will become %2Fdev%2Fsda1.
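The URL-encoding step mentioned above can be done with the Python standard library. This is a small illustration; the helper name `device_path_segment` is hypothetical.

```python
from urllib.parse import quote


def device_path_segment(device):
    """Percent-encode a device name so it fits in a single URL path segment."""
    # safe="" makes quote() encode "/" too; by default "/" is left unencoded.
    return quote(device, safe="")


print(device_path_segment("/dev/sda1"))  # prints %2Fdev%2Fsda1
print(device_path_segment("eth0"))       # prints eth0
```

Names without reserved characters, such as network interfaces or plain disk names, pass through unchanged.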

Using a partition, network interface, or other device name for the device argument returns disk or network metric data about a specific device across multiple nodes. Using all for the device name returns a dictionary of keys (device names) and the values (results for that device).

Example:

Get the average GB of space on all disks in all datacenters used each day by the cluster from April 11, 2012 00:00:00 to April 26, 2012 00:00:00 GMT.

Using curl’s -G flag to build an HTTP GET request:

curl -G \
  http://127.0.0.1:8888/Test_Cluster/cluster-metrics/all/os-disk-used/all \
  -d 'step=1440' \
  -d 'start=1334102400' \
  -d 'end=1335398400' \
  -d 'function=average'

Manually building an HTTP GET request:

 curl 'http://127.0.0.1:8888/Test_Cluster/cluster-metrics/all/os-disk-used/all?step=1440&start=1334102400&end=1335398400&function=average'

Output:

{
  "Total": {
    "AVERAGE": [
      [
        1334102400,
        null
      ],
      [
        1334188800,
        21.000694274902344
      ],
      [
        1334275200,
        8.736943244934082
      ],
      [
        1334361600,
        9.0
      ],
      [
        1334448000,
        19.0
      ],
      [
        1334534400,
        19.0
      ],
      [
        1334620800,
        19.0
      ],
      [
        1334707200,
        19.0
      ],
      [
        1334793600,
        18.629029273986816
      ],
      [
        1334880000,
        19.923184394836426
      ],
      [
        1334966400,
        25.0
      ],
      [
        1335052800,
        25.0
      ],
      [
        1335139200,
        25.923053741455078
      ],
      [
        1335225600,
        26.0
      ],
      [
        1335312000,
        26.549484252929688
      ]
    ]
  }
}

GET /{cluster_id}/cluster-metrics/{dc}/{ks_name}/{cf_name}/{metric}

Aggregate a table metric across multiple nodes in the cluster rather than retrieving data about a single node.

Path arguments:

cluster_id: The ID of a cluster returned from the cluster configuration ID.
dc: The name of the datacenter for the nodes. Use the name all to aggregate a metric across all datacenters.
ks_name: The keyspace that contains the table to be measured.
cf_name: The table to be measured.
metric: One of the Cluster metrics keys.

Query params:

parameters: The parameters listed in Control the metric data output.

Returns metric data for multiple nodes.

Example:

Get the maximum bytes of disk space used for live data by the Users table in the Keyspace1 keyspace of the cluster over all datacenters from May 1, 2012 00:00:00 to May 5, 2012 00:00:00 GMT:

Using curl’s -G flag to build an HTTP GET request:

curl -G \
  http://127.0.0.1:8888/Test_Cluster/cluster-metrics/all/Keyspace1/Users/cf-live-disk-used \
  -d 'function=max' \
  -d 'start=1335830400' \
  -d 'end=1336176000' \
  -d 'step=1440'

Manually building an HTTP GET request:

 curl 'http://127.0.0.1:8888/Test_Cluster/cluster-metrics/all/Keyspace1/Users/cf-live-disk-used?function=max&start=1335830400&end=1336176000&step=1440'

Output:

Data points at 24-hour intervals show the metrics for the period.

{
  "Total": {
    "MAX": [
      [
        1335830400,
        9740462592.0
      ],
      [
        1335916800,
        9932527616.0
      ],
      [
        1336003200,
        null
      ],
      [
        1336089600,
        10644448512.0
      ]
    ]
  }
}

GET /{cluster_id}/metrics/{node_ip}/{metric}

Retrieve metric data for a single node.

Path arguments:

cluster_id: The ID of a cluster returned from the cluster configuration ID.
node_ip: IP address of the target node.
metric: One of the Cluster metrics keys.

Query params:

parameters: The parameters listed in Control the metric data output.

Returns metric data for a single node.

Example:

Get the daily average data load on cluster node 10.11.12.150 from April 20, 2012 00:00:00 to April 26, 2012 00:00:00 GMT:

Using curl’s -G flag to build an HTTP GET request:

curl -G \
  http://127.0.0.1:8888/Test_Cluster/metrics/10.11.12.150/data-load \
  -d 'step=1440' \
  -d 'start=1334880000' \
  -d 'end=1335398400' \
  -d 'function=average'

Manually building an HTTP GET request:

 curl 'http://127.0.0.1:8888/Test_Cluster/metrics/10.11.12.150/data-load?step=1440&start=1334880000&end=1335398400&function=average'

Output:

{
  "10.11.12.150": {
    "AVERAGE": [
      [
        1334880000,
        null
      ],
      [
        1334966400,
        6353770496.0
      ],
      [
        1335052800,
        6560092672.0
      ],
      [
        1335139200,
        6019291136.0
      ],
      [
        1335225600,
        6149050880.0
      ],
      [
        1335312000,
        6271239680.0
      ]
    ]
  }
}

GET /{cluster_id}/metrics/{node_ip}/{metric}/{device}

Aggregate a disk or network metric for a single node.

Path arguments:

cluster_id: The ID of a cluster returned from the cluster configuration ID.
node_ip: IP address of the target node.
metric: One of the Cluster metrics keys or Operating system metrics keys.
device: The device to be measured. Use the name all to measure all devices associated with a disk metric. See GET /{cluster_id}/cluster-metrics/{dc}/{metric}/{device} for examples of devices.

Query params:

parameters: The parameters listed in Control the metric data output.

Returns disk or network metrics data for a single node.

Example:

Get the maximum GB of disk space for all disks used by cluster node 10.11.12.150 from April 30, 2012 at 22:05 to May 1, 2012 8:00:00 GMT:

Using curl’s -G flag to build an HTTP GET request:

curl -G \
  http://127.0.0.1:8888/Test_Cluster/metrics/10.11.12.150/os-disk-used/all \
  -d 'start=1335823500' \
  -d 'end=1335859200' \
  -d 'step=120' \
  -d 'function=max'

Manually building an HTTP GET request:

 curl 'http://127.0.0.1:8888/Test_Cluster/metrics/10.11.12.150/os-disk-used/all?start=1335823500&end=1335859200&step=120&function=max'

Output:

Data points at 2-hour (7200-second) intervals show the disk space used by device /dev/sda1.

{
  "/dev/sda1": {
    "MAX": [
      [
        1335823200,
        null
      ],
      [
        1335830400,
        17.0
      ],
      [
        1335837600,
        16.0
      ],
      [
        1335844800,
        17.0
      ],
      [
        1335852000,
        16.0
      ]
    ]
  }
}
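The timestamps in the output above start at 1335823200 even though the request used start=1335823500. The sketch below reproduces the timestamps seen in the sample outputs on this page. It rests on an assumption inferred from those samples only, not on documented behavior: data points fall on multiples of the step, starting at or just before `start` and stopping before `end`. The function name `expected_timestamps` is hypothetical.

```python
def expected_timestamps(start, end, step_minutes):
    """List the data point timestamps implied by start, end, and step.

    Assumption (inferred from the sample outputs): points sit on step
    boundaries, beginning at or just before start and ending before end.
    """
    step_seconds = step_minutes * 60
    first = start - (start % step_seconds)  # round start down to a boundary
    return list(range(first, end, step_seconds))


points = expected_timestamps(1335823500, 1335859200, 120)
print(points[0], points[-1], len(points))  # prints 1335823200 1335852000 5
```

A check like this can help spot gaps when a response contains fewer points than the requested range implies.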

GET /{cluster_id}/metrics/{node_ip}/{ks_name}/{cf_name}/{metric}

Retrieve metric data about a table on a single node.

Path arguments:

cluster_id: The ID of a cluster returned from the cluster configuration ID.
node_ip: IP address of the target node.
ks_name: The keyspace that contains the table to be measured.
cf_name: The table to be measured.
metric: One of the Cluster metrics keys.

Query params:

parameters: The parameters listed in Control the metric data output.

Example:

Get the daily, maximum response time (in microseconds) to write requests on the Users table in the Keyspace1 keyspace by cluster node 10.11.12.150 from May 1, 2012 at 00:00:00 to May 5, 2012 00:00:00 GMT.

Using curl’s -G flag to build an HTTP GET request:

curl -G \
  http://127.0.0.1:8888/Test_Cluster/metrics/10.11.12.150/Keyspace1/Users/cf-write-latency-op \
  -d 'function=max' \
  -d 'start=1335830400' \
  -d 'end=1336176000' \
  -d 'step=1440'

Manually building an HTTP GET request:

 curl 'http://127.0.0.1:8888/Test_Cluster/metrics/10.11.12.150/Keyspace1/Users/cf-write-latency-op?function=max&start=1335830400&end=1336176000&step=1440'

Output:

{
  "OpsCenter.rollups60": {
    "MAX": [
      [
        1335830400,
        102.28681945800781
      ],
      [
        1335916800,
        124.86614227294922
      ],
      [
        1336003200,
        null
      ],
      [
        1336089600,
        127.14733123779297
      ]
    ]
  }
}

GET /{cluster_id}/cluster-metrics/{dc}/{metric}

Generate a forecast for a metric aggregated across the cluster.

Path arguments:

cluster_id: The ID of a cluster returned from the cluster configuration ID.
dc: The name of the datacenter for the nodes. Use the name all to aggregate a metric across all datacenters.
metric: One of the Cluster metrics keys.

Query params:

parameters: The parameters listed in Control the metric data output.

Example:

Forecast the average write requests per second for the cluster across all datacenters, starting from the current time to 4 weeks in the future. Show data points at 1-day (1440-minute) intervals.

Using curl’s -G flag to build an HTTP GET request:

curl -G \
  http://127.0.0.1:8888/Test_Cluster/cluster-metrics/all/write-ops \
  -d "step=1440" \
  -d "start=`date +'%s'`" \
  -d "end=`date -v+4w +'%s'`" \
  -d "forecast=1"

Manually building an HTTP GET request:

 curl "http://127.0.0.1:8888/Test_Cluster/cluster-metrics/all/write-ops?step=1440&start=`date +'%s'`&end=`date -v+4w +'%s'`&forecast=1"

Output:

Data points at 1-day (86400-second) intervals show the forecasted number of write requests per second for the next 4 weeks. The results include the data used to generate the forecast. In this example, the forecast is based on 12 weeks of data, so the results begin 12 weeks in the past.

{
  "Total": {
      "AVERAGE": [
          [
              1376006400,
              172.18471918718131
          ],
          [
              1376092800,
              182.06741811718813
          ],
          [
              1376179200,
              159.14967219176917
          ],
          ...
          [
              1385769600,
              202.93040370941162
          ],
          [
              1385856000,
              202.78100836277008
          ],
          [
              1385942400,
              202.59301888942719
          ]
      ]
  }
}
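Because the forecast results contain both the historical input and the projected points, a client may want to compute the request window itself and then separate the two groups. The sketch below is illustrative: the names `forecast_window` and `split_at` are hypothetical, and treating every point later than the request time as projected is an assumption, not documented behavior. Computing the window in Python is also a portable alternative to the `date -v+4w` form used in the curl example, which appears to be BSD (macOS) `date` syntax.

```python
import time

WEEK_SECONDS = 7 * 24 * 60 * 60


def forecast_window(now=None, weeks=4):
    """Return (start, end) timestamps from now to the given number of weeks ahead."""
    start = int(time.time()) if now is None else int(now)
    return start, start + weeks * WEEK_SECONDS


def split_at(points, now):
    """Split [timestamp, value] pairs into (past, projected) around now.

    Assumption: points with a timestamp later than now are projections.
    """
    past = [p for p in points if p[0] <= now]
    projected = [p for p in points if p[0] > now]
    return past, projected


# Hypothetical request time, 12 weeks after the first sample point.
request_time = 1376006400 + 12 * WEEK_SECONDS
start, end = forecast_window(now=request_time)
past, projected = split_at(
    [[1376006400, 172.18471918718131], [1385942400, 202.59301888942719]],
    request_time,
)
print(start, end, len(past), len(projected))
```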

GET /{cluster_id}/new-metrics

Retrieve metric data of any type for one or more nodes.

Path arguments:

cluster_id: The ID of a cluster returned from the cluster configuration ID.

Query params:

nodes: A comma-separated list of nodes to fetch data for. Either this or node_group must be specified.
node_group: A convenient way of specifying a group of nodes to retrieve data for. Can be '*' for all nodes, or the name of a datacenter for the nodes in that datacenter. Either this or nodes must be specified.
metrics: A comma-separated list of the Cluster metrics keys for which to fetch data. When you fetch multiple metrics, all metrics are fetched using the same nodes, start, end, and other parameters.
columnfamilies: A comma-separated list of '<keyspace>.<columnfamily>' strings indicating the tables for which to fetch the given metrics. Required when fetching metrics that are specific to a certain table.
tiers: A comma-separated list of storage tier numbers indicating the tiers for which to fetch the given metrics. Required when fetching metrics that are specific to a certain storage tier.
devices: A comma-separated list of device strings indicating the devices for which to fetch the given metrics. Required when fetching metrics that are specific to a certain disk or network device.
node_aggregation: Indicates whether or not to aggregate the results across nodes. A '0' value indicates false and a '1' value indicates true.
additional parameters: The parameters listed in Control the metric data output.
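The query parameters listed above can be assembled as follows. This is a minimal sketch: the function name `build_new_metrics_query` is hypothetical, only a subset of the parameters is covered, and the check enforces just what the list states (at least one of nodes or node_group, and a step given in seconds).

```python
from urllib.parse import urlencode

# Step values (in seconds) listed as valid for the new-metrics API.
VALID_STEPS_SECONDS = (60, 300, 7200, 86400)


def build_new_metrics_query(metrics, nodes=None, node_group=None, step=None,
                            start=None, end=None, node_aggregation=None):
    """Return a query string for GET /{cluster_id}/new-metrics (hypothetical helper)."""
    if not nodes and not node_group:
        raise ValueError("either nodes or node_group must be specified")
    if step is not None and step not in VALID_STEPS_SECONDS:
        raise ValueError(f"step must be one of {VALID_STEPS_SECONDS} seconds")
    params = {"metrics": ",".join(metrics)}
    if nodes:
        params["nodes"] = ",".join(nodes)
    if node_group:
        params["node_group"] = node_group
    for name, value in (("step", step), ("start", start), ("end", end)):
        if value is not None:
            params[name] = value
    if node_aggregation is not None:
        params["node_aggregation"] = 1 if node_aggregation else 0
    # Leave "," and "*" unencoded, as in the curl examples on this page.
    return urlencode(params, safe=",*")


print(build_new_metrics_query(
    ["data-load"], nodes=["10.11.12.150", "10.11.12.151"],
    step=86400, start=1334880000, end=1335398400))
```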

Returns metric data.

Examples:

Get the daily average data load on cluster nodes 10.11.12.150, 10.11.12.151 from April 20, 2012 00:00:00 to April 26, 2012 00:00:00 GMT:

Using curl’s -G flag to build an HTTP GET request:

curl -G \
  http://127.0.0.1:8888/Test_Cluster/new-metrics \
  -d 'metrics=data-load' \
  -d 'nodes=10.11.12.150,10.11.12.151' \
  -d 'step=86400' \
  -d 'start=1334880000' \
  -d 'end=1335398400'

Manually building an HTTP GET request:

 curl 'http://127.0.0.1:8888/Test_Cluster/new-metrics?metrics=data-load&nodes=10.11.12.150,10.11.12.151&step=86400&start=1334880000&end=1335398400'

Output:

{
  "metrics": ["data-load"],
  "bounds": {"start": 1334880000, "end": 1335312000, "step": 86400},
  "aggregation_function": null,
  "nodes": ["10.11.12.150", "10.11.12.151"],
  "data": {
    "10.11.12.150": [
      {"metric": "data-load",
       "data-points":
          [
              [4353770496.0, 4353770496.0, 4353770496.0],
              [6353770496.0, 6353770496.0, 6353770496.0],
              [6560092672.0, 6560092672.0, 6560092672.0],
              [6019291136.0, 6019291136.0, 6019291136.0],
              [6149050880.0, 6149050880.0, 6149050880.0],
              [6271239680.0, 6271239680.0, 6271239680.0]
          ]
      }
    ],
    "10.11.12.151": [
      {"metric": "data-load",
       "data-points":
          [
              [4353770496.0, 4353770496.0, 4353770496.0],
              [6353770496.0, 6353770496.0, 6353770496.0],
              [6560092672.0, 6560092672.0, 6560092672.0],
              [6019291136.0, 6019291136.0, 6019291136.0],
              [6149050880.0, 6149050880.0, 6149050880.0],
              [6271239680.0, 6271239680.0, 6271239680.0]
          ]
      }
    ]
  }
}
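Unlike the older methods, the output above carries no timestamp inside each data point. The sketch below pairs each point with a timestamp derived from `bounds`. It relies on an assumption inferred from the sample output only: the i-th data point corresponds to `start + i * step`. The meaning of the three values in each data point is not stated in this section, so they are passed through untouched. The function name `with_timestamps` is hypothetical.

```python
def with_timestamps(response):
    """Return rows of (node, metric, timestamp, raw_point).

    Assumption (inferred from the sample output): the i-th data point
    belongs to bounds["start"] + i * bounds["step"].
    """
    start = response["bounds"]["start"]
    step = response["bounds"]["step"]
    rows = []
    for node, series_list in response["data"].items():
        for series in series_list:
            for i, point in enumerate(series["data-points"]):
                rows.append((node, series["metric"], start + i * step, point))
    return rows


# A shortened response in the shape shown above.
sample = {
    "bounds": {"start": 1334880000, "end": 1335312000, "step": 86400},
    "data": {
        "10.11.12.150": [
            {"metric": "data-load",
             "data-points": [
                 [4353770496.0, 4353770496.0, 4353770496.0],
                 [6353770496.0, 6353770496.0, 6353770496.0],
             ]}
        ]
    },
}
rows = with_timestamps(sample)
for row in rows:
    print(row)
```

With the full six-point series from the example, the last derived timestamp equals the `end` value reported in `bounds`.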

Get the cluster-wide aggregated data load and write latency from April 20, 2012 00:00:00 to April 26, 2012 00:00:00 GMT:

Using curl’s -G flag to build an HTTP GET request:

curl -G \
  http://127.0.0.1:8888/Test_Cluster/new-metrics \
  -d 'metrics=data-load,write-latency-op' \
  -d 'node_group=*' \
  -d 'step=86400' \
  -d 'start=1334880000' \
  -d 'end=1335398400' \
  -d 'node_aggregation=1'

Manually building an HTTP GET request:

curl 'http://127.0.0.1:8888/Test_Cluster/new-metrics?metrics=data-load,write-latency-op&node_group=*&step=86400&start=1334880000&end=1335398400&node_aggregation=1'

Output:

{
  "metrics": ["data-load", "write-latency-op"],
  "bounds": {"start": 1334880000, "end": 1335312000, "step": 86400},
  "aggregation_function": {
    "data-load": "sum",
    "write-latency-op": "average"
  },
  "nodes": ["10.11.12.150", "10.11.12.151"],
  "data": {
    "aggregate": [
      {"metric": "data-load",
       "data-points":
          [
              [4353770496.0, 4353770496.0, 4353770496.0],
              [6353770496.0, 6353770496.0, 6353770496.0],
              [6560092672.0, 6560092672.0, 6560092672.0],
              [6019291136.0, 6019291136.0, 6019291136.0],
              [6149050880.0, 6149050880.0, 6149050880.0],
              [6271239680.0, 6271239680.0, 6271239680.0]
          ]
      },
      {"metric": "write-latency-op",
       "data-points":
          [
              [4353770496.0, 4353770496.0, 4353770496.0],
              [6353770496.0, 6353770496.0, 6353770496.0],
              [6560092672.0, 6560092672.0, 6560092672.0],
              [6019291136.0, 6019291136.0, 6019291136.0],
              [6149050880.0, 6149050880.0, 6149050880.0],
              [6271239680.0, 6271239680.0, 6271239680.0]
          ]
      }
    ]
  }
}

Get the write-ops for multiple tables for all nodes from April 20, 2012 00:00:00 to April 26, 2012 00:00:00 GMT:

Using curl’s -G flag to build an HTTP GET request:

curl -G \
  http://127.0.0.1:8888/Test_Cluster/new-metrics \
  -d 'metrics=cf-write-ops' \
  -d 'node_group=*' \
  -d 'columnfamilies=OpsCenter.events,OpsCenter.settings' \
  -d 'step=86400' \
  -d 'start=1334880000' \
  -d 'end=1335398400'

Manually building an HTTP GET request:

curl 'http://127.0.0.1:8888/Test_Cluster/new-metrics?metrics=cf-write-ops&node_group=*&columnfamilies=OpsCenter.events,OpsCenter.settings&step=86400&start=1334880000&end=1335398400'

Output:

{
  "metrics": ["cf-write-ops"],
  "bounds": {"start": 1334880000, "end": 1335312000, "step": 86400},
  "aggregation_function": null,
  "nodes": ["10.11.12.150", "10.11.12.151"],
  "columnfamilies": ["OpsCenter.events", "OpsCenter.settings"],
  "data": {
    "10.11.12.150": [
      {"metric": "cf-write-ops",
       "columnfamily": "OpsCenter.events",
       "data-points":
          [
              [4353770496.0, 4353770496.0, 4353770496.0],
              [6353770496.0, 6353770496.0, 6353770496.0],
              [6560092672.0, 6560092672.0, 6560092672.0],
              [6019291136.0, 6019291136.0, 6019291136.0],
              [6149050880.0, 6149050880.0, 6149050880.0],
              [6271239680.0, 6271239680.0, 6271239680.0]
          ]
      }
    ],
    "10.11.12.151": [
      {"metric": "cf-write-ops",
       "columnfamily": "OpsCenter.settings",
       "data-points":
          [
              [4353770496.0, 4353770496.0, 4353770496.0],
              [6353770496.0, 6353770496.0, 6353770496.0],
              [6560092672.0, 6560092672.0, 6560092672.0],
              [6019291136.0, 6019291136.0, 6019291136.0],
              [6149050880.0, 6149050880.0, 6149050880.0],
              [6271239680.0, 6271239680.0, 6271239680.0]
          ]
      }
    ]
  }
}

Metrics attribute key lists

This section contains these tables of metric keys to use with resources that retrieve OpsCenter performance data:

Cluster metrics keys
Thread pool metrics keys
Table metrics keys
Storage tier metrics keys
Operating system metrics keys

Cluster metrics keys

This list of keys corresponds to database metrics collected by OpsCenter:

Key	Units	Description
write-ops	/sec	The number of write requests per second on the coordinator nodes, analogous to client writes. Monitoring the number of requests over a given time period reveals system write workload and usage patterns.
write-histogram	ms/op	The min, median, max, 90th, and 99th percentiles of client writes. The time period starts when a node receives a client write request, and ends when the node responds back to the client. Depending on consistency level and replication factor, this may include the network latency from writing to the replicas.
write-failures	/sec	The number of write requests on the coordinator nodes that fail due to errors returned from replicas.
write-timeouts	/sec	The number of server write timeouts per second on the coordinator nodes.
write-unavailables	/sec	The number of write requests per second on the coordinator nodes that fail because not enough replicas are available.
read-ops	/sec	The number of read requests per second on the coordinator nodes, analogous to client reads. Monitoring the number of requests over a given time period reveals system read workload and usage patterns.
read-histogram	ms/op	The min, median, max, 90th, and 99th percentiles of client reads. The time period starts when a node receives a client read request, and ends when the node responds back to the client. Depending on consistency level and replication factor, this may include the network latency from requesting the data’s replicas.
read-failures	/sec	The number of read requests on the coordinator nodes that fail due to errors returned from replicas.
read-timeouts	/sec	The number of server read timeouts per second on the coordinator nodes.
read-unavailables	/sec	The number of read requests per second on the coordinator nodes that fail because not enough replicas are available.
nonheap-committed	—	Allocated memory, guaranteed for Java nonheap.
nonheap-max	—	Maximum amount that the Java nonheap can grow.
nonheap-used	—	Average amount of Java nonheap memory used.
heap-committed	—	Allocated memory guaranteed for the Java heap.
heap-max	—	Maximum amount that the Java heap can grow.
heap-used	—	Average amount of Java heap memory used.
cms-collection-count	/sec	Number of concurrent mark sweep garbage collections performed per second.
par-new-collection-count	/sec	Number of ParNew garbage collections performed per second. ParNew collections pause all work in the JVM but should finish quickly.
cms-collection-time	ms/sec	Average number of milliseconds spent performing CMS garbage collections per second.
par-new-collection-time	ms/sec	Average number of milliseconds spent performing ParNew garbage collections per second. ParNew collections pause all work in the JVM but should finish quickly.
g1-old-collection-count	/sec	Number of G1 old generation garbage collections performed per second.
g1-old-collection-time	ms/sec	Average number of milliseconds spent performing G1 old generation garbage collections per second.
g1-young-collection-count	/sec	Number of G1 young generation garbage collections performed per second.
g1-young-collection-time	ms/sec	Average number of milliseconds spent performing G1 young generation garbage collections per second.
data-load	—	The live disk space used by all tables on a node.
total-bytes-compacted	/sec	Number of bytes compacted per second.
actual-total-compactions-completed	/sec	Number of compaction tasks completed per second.
total-compactions-completed	/sec	Number of sstable scans per second that could result in a compaction.
pending-compaction-tasks	—	Estimated number of compactions required to achieve the desired state. This includes the pending queue to the compaction executor and additional tasks that may be created from their completion.
dropped-counter-mutations	drops/sec	Mutation was seen after the timeout (write_request_timeout_in_ms) so was thrown away. This client might have timed out before it met the required consistency level, but might have succeeded as well. Hinted handoffs and read repairs should resolve inconsistencies but a repair can ensure it.
dropped-mutations	drops/sec	Mutation was seen after the timeout (write_request_timeout_in_ms) so was thrown away. This client might have timed out before it met the required consistency level, but might have succeeded as well. Hinted handoffs and read repairs should resolve inconsistencies but a repair can ensure it.
dropped-reads	drops/sec	A local read request was received after the timeout (read_request_timeout_in_ms) so it was thrown away because it would have already either been completed and sent to client or sent back as a timeout error.
dropped-ranged-slice-reads	drops/sec	A local ranged read request was received after the timeout (range_request_timeout_in_ms) so it was thrown away because it would have already either been completed and sent to client or sent back as a timeout error.
dropped-read-repairs	drops/sec	The mutation was seen after the timeout (write_request_timeout_in_ms), so it was thrown away. With the read repair timeout, the node still exists in an inconsistent state.
key-cache-hits	/sec	The number of key cache hits per second. This will avoid possible disk seeks when finding a partition in an SSTable. This metric only applies to SSTables created by DSE versions earlier than 6.0.
key-cache-requests	/sec	The number of key cache requests per second. This metric only applies to SSTables created by DSE versions earlier than 6.0.
key-cache-hit-rate	—	The percentage of key cache lookups that resulted in a hit. This metric only applies to SSTables created by DSE versions earlier than 6.0.
row-cache-hits	/sec	The number of row cache hits per second.
row-cache-requests	/sec	The number of row cache requests per second.
row-cache-hit-rate	—	The percentage of row cache lookups that resulted in a hit.
native-connections	—	The number of clients connected using the native protocol.
read-repair-attempted	/sec	Number of read requests where the number of nodes queried possibly exceeds the consistency level requested in order to check for a possible digest mismatch.
read-repaired-background	/sec	Corresponds to a digest mismatch that occurred after a completed read, outside of the client read loop.
read-repaired-blocking	/sec	Corresponds to the number of times there was a digest mismatch within the requested consistency level and a full data read was started.
speculative-retries	retries	Number of speculative retries for all column families.
stream-out-total	/sec	Data streamed out from this node to all other nodes, for all tables.
stream-in-total	/sec	Data streamed in to this node from all other nodes, for all tables.
hint-creation-rate	/sec	Rate at which new individual hints are stored on this node, to be replayed to peers.
in-memory-percent-used	—	The percentage of memory allocated for in-memory tables currently in use.
view-write-histogram	ms/op	The min, median, max, 90th, and 99th percentiles of the time from when the base mutation is applied to the memtable until CL.ONE is achieved on the async write to the table's materialized views. An estimate of the lag between base table mutations and the views' consistency.
view-replicas-success	mutations	Number of view mutations sent to replicas that have been acknowledged.
view-replicas-pending	mutations	Number of view mutations sent to replicas for which the replica's acknowledgement has not been received.
cells-scanned-during-read	cells	The min, median, max, 90th, and 99th percentiles of how many cells were scanned during a read.
pending-graph-query-threads	—	Number of pending tasks in the GraphQueryThreads thread pool.
active-graph-query-threads	—	Number of active tasks in the GraphQueryThreads thread pool.
completed-graph-query-threads	—	Number of tasks completed by the GraphQueryThreads thread pool.
pending-graph-scheduled-threads	—	Number of pending tasks in the GraphScheduledThreads thread pool.
active-graph-scheduled-threads	—	Number of active tasks in the GraphScheduledThreads thread pool.
completed-graph-scheduled-threads	—	Number of tasks completed by the GraphScheduledThreads thread pool.
pending-graph-system-threads	—	Number of pending tasks in the GraphSystemThreads thread pool.
active-graph-system-threads	—	Number of active tasks in the GraphSystemThreads thread pool.
completed-graph-system-threads	—	Number of tasks completed by the GraphSystemThreads thread pool.
pending-gremlin-worker-threads	—	Number of pending tasks in the GremlinWorkerThreads thread pool.
active-gremlin-worker-threads	—	Number of active tasks in the GremlinWorkerThreads thread pool.
completed-gremlin-worker-threads	—	Number of tasks completed by the GremlinWorkerThreads thread pool.
percentage-repaired	%	Percentage of data (uncompressed) marked as repaired across all non-system tables on a node. Tables with a replication factor of 1 are excluded.
read-coordinator-nonreplica	/sec	Rate of coordinated reads to a node where that node is not a replica for that partition.
read-coordinator-preferother	/sec	Rate of coordinated reads to a node where that node did not choose itself as a replica for the read request.
hints-on-disk	—	The number of hints currently stored on disk, to be replayed to peers.
hint-replay-success-rate	/sec	Rate of successful individual hint replays to peers. If one or more individual hints fail to replay in a batch, the successful hints in that batch are replayed again and double counted in this metric.
hint-replay-error-rate	/sec	Rate of failed individual hint replays. Replay of a single hint can fail more than once if retried.
hint-replay-timeout-rate	/sec	Rate of timed out individual hint replays. Replay of a single hint can time out more than once if retried.
hint-replay-received-rate	/sec	Rate of successful individual hints replayed to this node, from other peers.
cross-node-latency	ms/op	The min, median, max, 90th, and 99th percentiles of the latency of messages between nodes. The time period starts when a node sends a message and ends when the current node receives it.
nodesync-data-repaired	bytes	Bytes of data that were inconsistent and needed synchronization.
nodesync-data-validated	bytes	Bytes of data checked for consistency.
nodesync-repair-data-sent	bytes	Total bytes of data transferred between all nodes during synchronization.
nodesync-objects-repaired	objects	Number of rows and range tombstones that were inconsistent and needed synchronization.
nodesync-objects-validated	objects	Number of rows and range tombstones checked for consistency.
nodesync-repair-objects-sent	objects	Total number of rows and range tombstones transferred between all nodes during synchronization.
nodesync-processed-pages	pages	Number of pages (internal groupings of data) processed.
nodesync-full-in-sync-pages	pages	Number of processed pages that were not in need of synchronization.
nodesync-full-repaired-pages	pages	Number of processed pages that were in need of synchronization.
nodesync-partial-in-sync-pages	pages	Number of in-sync pages for which a response was received from only some of the replicas.
nodesync-partial-repaired-pages	pages	Number of repaired pages for which a response was received from only some of the replicas.
nodesync-uncompleted-pages	pages	Number of processed pages not having enough responses to perform synchronization.
nodesync-failed-pages	pages	Number of processed pages for which an unknown error prevented proper synchronization completion.
dropped-view-mutations	drops/sec	A materialized view mutation was seen after the timeout (write_request_timeout_in_ms), so it was thrown away. The client might have timed out before the write met the required consistency level, but the write might have succeeded as well. Hinted handoffs and read repairs should resolve inconsistencies, but a repair can ensure it.
dropped-lwt	drops/sec	A lightweight transaction was seen after the timeout (write_request_timeout_in_ms), so it was thrown away. The client might have timed out before the write met the required consistency level, but the write might have succeeded as well. Hinted handoffs and read repairs should resolve inconsistencies, but a repair can ensure it.
dropped-hints	drops/sec	A hinted handoff was seen after the timeout (write_request_timeout_in_ms), so it was thrown away. Repairing the data or using NodeSync should resolve data inconsistencies.
dropped-truncates	drops/sec	A truncate operation was seen after the timeout (truncate_request_timeout_in_ms), so it was thrown away.
dropped-snapshots	drops/sec	A snapshot request was seen after the timeout (request_timeout_in_ms), so it was thrown away. The snapshot should be retried.
dropped-schemas	drops/sec	A schema change was seen after the timeout (request_timeout_in_ms), so it was thrown away. Schema agreement may not have been reached immediately, but this will eventually resolve itself.
dropped-repairs	drops/sec	A repair message was seen after the timeout, so it was thrown away.
dropped-other	drops/sec	A miscellaneous message was seen after the timeout, so it was thrown away.
dropped-node-sync	drops/sec	A NodeSync message was seen after the timeout, so it was thrown away.
dropped-batch-store	drops/sec	A batch store message was seen after the timeout, so it was thrown away.
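
As a minimal sketch of how a key from this table plugs into the node-level retrieval methods (`GET /{cluster_id}/metrics/{node_ip}/{metric}` and its `/{device}` variant), the helper below builds the request path. The function name `node_metric_path` and the sample cluster ID and node IP are hypothetical, chosen only for illustration; the sketch builds the path and does not send a request.

```python
from urllib.parse import quote


def node_metric_path(cluster_id, node_ip, metric, device=None):
    """Build the path for a node-level metric request.

    Hypothetical helper: it only fills in the documented URL templates
    /{cluster_id}/metrics/{node_ip}/{metric} and
    /{cluster_id}/metrics/{node_ip}/{metric}/{device}.
    """
    parts = [cluster_id, "metrics", node_ip, metric]
    if device is not None:
        parts.append(device)
    # Percent-encode each segment so unusual characters cannot break the path.
    return "/" + "/".join(quote(str(part), safe="") for part in parts)


# Sample values below are assumptions, not taken from a real cluster.
print(node_metric_path("Test_Cluster", "10.11.12.13", "key-cache-hit-rate"))
```

Any key listed in the table can be passed as `metric`; whether a given key also accepts a `device` segment depends on the metric.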



Thread pool metrics keys

This list of keys corresponds to thread pool metrics collected by OpsCenter:

Key	Description
pending-flushes	Number of memtables queued for the flush process. A flush sorts and writes the memtables to disk.
pending-gossip-stage	Number of gossip messages and acknowledgments queued and waiting to be sent or received.
pending-internal-response-stage	Number of pending tasks from internal tasks, such as nodes joining and leaving the cluster.
pending-anti-entropy-stage	Repair tasks pending, such as handling the merkle tree transfer after the validation compaction.
pending-cache-cleanup-stage	Tasks pending to clean row caches during a cleanup compaction.
pending-memtable-post-flush	Tasks related to the last step in flushing memtables to disk as SSTables. Includes removing unnecessary commitlog files and committing Solr-based secondary indexes.
pending-migration-stage	Number of pending tasks from system methods that modified the schema.
pending-misc-stage	Number of pending tasks from infrequently run operations, such as taking a snapshot or processing the notification of a completed replication.
pending-read-stage	Number of pending read requests. Read requests read data off of disk and deserialize cached data.
pending-read-repair-stage	Number of read repair operations in the queue waiting to run.
pending-request-response-stage	Number of pending callbacks to execute after a task on a remote node completes.
pending-mutation-stage	Number of write requests received by the cluster and waiting to be handled.
pending-validation-executor	Pending tasks to read data from sstables and generate a merkle tree for a repair.
pending-compaction-executor	Pending compactions that are known. This may deviate from `pending compactions`, which includes an estimate of tasks that these pending tasks may create after completion.
pending-pending-range-calculator	Pending tasks to calculate the ranges according to bootstrapping and leaving nodes.
active-flushes	Up to memtable_flush_writers concurrent tasks to flush and write the memtables to disk.
active-gossip-stage	Number of gossip messages and acknowledgments actively being sent or received.
active-internal-response-stage	Number of active tasks from internal tasks, such as nodes joining and leaving the cluster.
active-anti-entropy-stage	Repair tasks active, such as handling the merkle tree transfer after the validation compaction.
active-cache-cleanup-stage	Active tasks to clean row caches during a cleanup compaction.
active-memtable-post-flush	Tasks related to the last step in flushing memtables to disk as SSTables. Includes removing unnecessary commitlog files and committing Solr-based secondary indexes.
active-migration-stage	Number of active tasks from system methods that modified the schema.
active-misc-stage	Number of active tasks from infrequently run operations, such as taking a snapshot or processing the notification of a completed replication.
active-read-stage	Number of active read requests. Read requests read data off of disk and deserialize cached data.
active-read-repair-stage	Number of read repair operations actively being run.
active-request-response-stage	Number of callbacks being executed after a task on a remote node is completed.
active-mutation-stage	Number of write requests being handled.
active-validation-executor	Active tasks to read data from sstables and generate a merkle tree for a repair.
active-compaction-executor	Active compactions that are known.
active-pending-range-calculator	Active tasks to calculate the ranges according to bootstrapping and leaving nodes.
completed-flushes	Number of memtables flushed to disk since the node started.
completed-gossip-stage	Number of gossip messages and acknowledgments recently sent or received.
completed-internal-response-stage	Number of recently completed tasks from internal tasks, such as nodes joining and leaving the cluster.
completed-anti-entropy-stage	Repair tasks recently completed, such as handling the merkle tree transfer after the validation compaction.
completed-cache-cleanup-stage	Completed tasks to clean row caches during a cleanup compaction.
completed-memtable-post-flush	Tasks related to the last step in flushing memtables to disk as SSTables. Includes removing unnecessary commitlog files and committing Solr-based secondary indexes.
completed-migration-stage	Number of completed tasks from system methods that modified the schema.
completed-misc-stage	Number of completed tasks from infrequently run operations, such as taking a snapshot or processing the notification of a completed replication.
completed-read-stage	Number of completed read requests. Read requests read data off of disk and deserialize cached data.
completed-read-repair-stage	Number of read repair operations recently completed.
completed-request-response-stage	Number of completed callbacks executed after a task on a remote node is completed.
completed-mutation-stage	Number of write requests received by the cluster that have been handled.
completed-validation-executor	Completed tasks to read data from sstables and generate a merkle tree for a repair.
completed-compaction-executor	Completed compactions.
completed-pending-range-calculator	Completed tasks to calculate the ranges according to bootstrapping and leaving nodes.
pending-counter-mutations	Pending tasks to execute local counter mutations.
active-counter-mutations	Up to concurrent_counter_writes running tasks that execute local counter mutations.
completed-counter-mutations	Number of local counter mutations that have been executed.
memtable-reclaim-pending	Waits for current reads to complete and then frees the memory formerly used by the obsoleted memtables.
memtable-reclaim-active	Waits for current reads to complete and then frees the memory formerly used by the obsoleted memtables.
completed-memtable-reclaim	Waits for current reads to complete and then frees the memory formerly used by the obsoleted memtables.
pending-view-mutation-stage	Number of mutations to apply locally after modifications to a base table.
active-view-mutation-stage	Number of mutations being applied locally after modifications to a base table.
completed-view-mutation-stage	Number of mutations applied locally after modifications to a base table.
pending-hint-dispatcher	Pending tasks to send the stored hinted handoffs to a host.
active-hint-dispatcher	Up to max_hints_delivery_threads tasks, each dispatching all hinted handoffs to a host.
completed-hint-dispatcher	Number of tasks to transfer hints to a host that have completed.
pending-secondary-index-management	Any initialization work when a new index instance is created. This may involve costly operations such as (re)building the index.
active-secondary-index-management	Any initialization work when a new index instance is created. This may involve costly operations such as (re)building the index.
completed-secondary-index-management	Any initialization work when a new index instance is created. This may involve costly operations such as (re)building the index.
active-authentication	Authentication Active
completed-authentication	Authentication Completed
active-read-range	Read Range Active
completed-read-range	Read Range Completed
active-execute-statement	Execute Statement Active
completed-execute-statement	Execute Statement Completed
active-timed-speculate	Timed Speculate Active
completed-timed-speculate	Timed Speculate Completed
active-unknown	Unknown Active
completed-unknown	Unknown Completed
active-truncate	Truncate Active
completed-truncate	Truncate Completed
active-timed-histogram-aggregate	Timed Histogram Aggregate Active
completed-timed-histogram-aggregate	Timed Histogram Aggregate Completed
active-counter-acquire-lock	Counter Acquire Lock Active
completed-counter-acquire-lock	Counter Acquire Lock Completed
active-read	Read Active
completed-read	Read Completed
active-cas	CAS Active
completed-cas	CAS Completed
active-write-switch-for-memtable	Write Switch For Memtable Active
completed-write-switch-for-memtable	Write Switch For Memtable Completed
active-read-disk-async	Read Disk Async Active
completed-read-disk-async	Read Disk Async Completed
active-timed-unknown	Timed Unknown Active
completed-timed-unknown	Timed Unknown Completed
active-timed-meter-tick	Timed Meter Tick Active
completed-timed-meter-tick	Timed Meter Tick Completed
active-timed-timeout	Timed Timeout Active
completed-timed-timeout	Timed Timeout Completed
active-write	Write Active
completed-write	Write Completed
active-write-defragment	Write Defragment Active
completed-write-defragment	Write Defragment Completed
active-read-secondary-index	Read Secondary Index Active
completed-read-secondary-index	Read Secondary Index Completed
pending-read-range	Read Range Pending
total-blocked-read	Total Read Blocked
total-blocked-read-range	Total Read Range Blocked
total-blocked-write-defragment	Total Write Defragment Blocked
total-blocked-write	Total Write Blocked
pending-write-defragment	Write Defragment Pending
pending-write	Write Pending
pending-read	Read Pending
active-eventloop-spin	Eventloop Spin Active
completed-read-deferred	Read Deferred Completed
completed-authorization	Authorization Completed
completed-batch-replay	Batch Replay Completed
active-write-await-commitlog-segment	Write Await Commitlog Segment Active
active-eventloop-park	Eventloop Park Active
active-read-switch-for-response	Read Switch For Response Active
active-nodesync-validation	Nodesync Validation Active
active-read-switch-for-iterator	Read Switch For Iterator Active
active-batch-remove	Batch Remove Active
active-batch-replay	Batch Replay Active
active-read-range-switch-for-response	Read Range Switch For Response Active
active-write-switch-for-response	Write Switch For Response Active
completed-batch-remove	Batch Remove Completed
completed-batch-store-response	Batch Store Response Completed
active-write-memtable-full	Write Memtable Full Active
pending-lwt-propose	Lwt Propose Pending
active-write-await-commitlog-sync	Write Await Commitlog Sync Active
completed-nodesync-validation	Nodesync Validation Completed
completed-lwt-commit	Lwt Commit Completed
completed-read-switch-for-response	Read Switch For Response Completed
active-eventloop-yield	Eventloop Yield Active
active-lwt-prepare	Lwt Prepare Active
completed-lwt-propose	Lwt Propose Completed
pending-batch-store	Batch Store Pending
completed-read-switch-for-iterator	Read Switch For Iterator Completed
pending-lwt-prepare	Lwt Prepare Pending
completed-write-memtable-full	Write Memtable Full Completed
pending-truncate	Truncate Pending
pending-read-deferred	Read Deferred Pending
completed-eventloop-spin	Eventloop Spin Completed
completed-write-switch-for-response	Write Switch For Response Completed
completed-eventloop-park	Eventloop Park Completed
active-lwt-propose	Lwt Propose Active
completed-lwt-prepare	Lwt Prepare Completed
active-authorization	Authorization Active
completed-eventloop-yield	Eventloop Yield Completed
completed-batch-store	Batch Store Completed
active-batch-store	Batch Store Active
pending-batch-remove	Batch Remove Pending
active-lwt-commit	Lwt Commit Active
pending-lwt-commit	Lwt Commit Pending
completed-write-await-commitlog-segment	Write Await Commitlog Segment Completed
completed-read-range-switch-for-response	Read Range Switch For Response Completed
active-batch-store-response	Batch Store Response Active
completed-write-await-commitlog-sync	Write Await Commitlog Sync Completed
active-read-deferred	Read Deferred Active
total-blocked-batch-remove	Total Batch Remove Blocked
total-blocked-read-deferred	Total Read Deferred Blocked
total-blocked-lwt-commit	Total Lwt Commit Blocked
total-blocked-lwt-propose	Total Lwt Propose Blocked
total-blocked-truncate	Total Truncate Blocked
total-blocked-lwt-prepare	Total Lwt Prepare Blocked
total-blocked-batch-store	Total Batch Store Blocked
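
Most keys in this table combine a state (`pending`, `active`, `completed`, or `total-blocked`) with a thread pool name. The sketch below splits a key into those two parts so that related keys can be grouped by pool. The function name `split_thread_pool_key` is hypothetical, and the parsing rule is inferred from the key names listed above rather than from a documented naming scheme.

```python
from collections import defaultdict

# States seen in the table; "total-blocked" is listed first so that it is
# matched as a whole prefix.
STATES = ("total-blocked", "pending", "active", "completed")


def split_thread_pool_key(key):
    """Return (state, pool) for a thread pool metric key, or (None, key)."""
    for state in STATES:
        if key.startswith(state + "-"):
            return state, key[len(state) + 1:]
    # A few keys put the state last, for example memtable-reclaim-pending.
    for state in STATES:
        if key.endswith("-" + state):
            return state, key[: -(len(state) + 1)]
    return None, key


def group_by_pool(keys):
    """Map each pool name to the set of states that have a key for it."""
    pools = defaultdict(set)
    for key in keys:
        state, pool = split_thread_pool_key(key)
        if state is not None:
            pools[pool].add(state)
    return dict(pools)


sample = ["pending-read-stage", "active-read-stage", "completed-read-stage",
          "pending-pending-range-calculator", "memtable-reclaim-active"]
print(group_by_pool(sample))
```

Prefixes are checked before suffixes so that a key such as `pending-pending-range-calculator` resolves to the pool `pending-range-calculator` in the `pending` state.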

Table metrics keys

This list of keys corresponds to table-specific metrics collected by OpsCenter:

Key	Units	Description
cf-write-ops	/sec	Local write requests per second. Local writes update the table’s memtable and append to a commitlog.
cf-local-write-latency	ms/op	The min, median, max, 90th, and 99th percentile of the response times to write data to a table’s memtable. Measured from when the replica receives the request from a coordinator until it returns a response.
cf-read-ops	/sec	Local read requests per second. Local reads retrieve data from a table’s memtable and any necessary SSTables on disk.
cf-local-read-latency	ms/op	The min, median, max, 90th, and 99th percentile of the response time to read data from the memtable and SSTables for a specific table. Measured from when the replica receives the request from a coordinator until it returns a response.
cf-live-disk-used	—	Disk space used by live SSTables. There might be obsolete SSTables not included.
cf-total-disk-used	—	Disk space used by a table by SSTables, including obsolete ones waiting to be garbage collected.
cf-live-sstables	—	Total number of SSTables for a table.
cf-sstables-per-read	sstables	The min, median, max, 90th, and 99th percentile of how many SSTables are accessed during a read. Includes SSTables that undergo bloom-filter checks, even if no data is read from the SSTable.
cf-partition-size	bytes	The min, median, max, 90th, and 99th percentile of the size (in bytes) of partitions of this table.
cf-column-count	cells	The min, median, max, 90th, and 99th percentile of how many cells exist in partitions for this table.
cf-bf-space-used	—	The total size of all the SSTables' bloom filters for this table.
cf-bf-false-positives	/sec	Number of bloom filter false positives per second.
cf-bf-false-ratio	—	Percentage of bloom filter lookups that resulted in a false positive.
solr-requests	/sec	Requests per second made to a specific Solr core/index.
solr-avg-time-per-req	ms/request	Average time a search query takes in a DSE cluster using DSE Search.
solr-errors	/sec	Errors per second that occur for a specific Solr core/index.
solr-timeouts	/sec	Timeouts per second on a specific Solr core/index.
solr-index-size	KB	Size of the Solr core on disk.
cf-sstable-size	—	—
cf-speculative-retries	retries	Number of speculative retries for this table.
cf-bf-offheap	—	Total off heap memory used by bloom filters from all live SSTables in a table.
cf-index-summary-offheap	—	Total off heap memory used by the index summary of all live SSTables in a table.
cf-compression-data-offheap	—	Total off heap memory used by the compression metadata of all live SSTables in a table.
cf-memtable-offheap	—	Off heap memory used by a table’s current memtable.
cf-all-memtables-heapsize	—	An estimate of the space used in JVM heap memory for all memtables. This includes ones that are currently being flushed and related secondary indexes.
cf-all-memtables-livedatasize	—	An estimate of the space used for 'live data' (off-heap, excluding overhead) for all memtables. This includes ones that are currently being flushed and related secondary indexes.
cf-all-memtables-offheapsize	—	An estimate of the space used in off-heap memory for all memtables. This includes ones that are currently being flushed and related secondary indexes.
cf-row-size	—	Approximate number of partitions. The value may be off because duplicates in memtables and SSTables are both counted, and there is a very small error percentage inherited from the HyperLogLog data structure.
cf-tombstones-per-read	tombstones	The min, median, max, 90th, and 99th percentile of how many tombstones are read during a read.
cf-write-latency-legacy	ms/op	Deprecated. Median response time to write data to a table’s memtable. Measured from when the replica receives the request from a coordinator until it returns a response.
cf-read-latency-legacy	ms/op	Deprecated. Median response time to read data from the memtable and SSTables for a specific table. Measured from when the replica receives the request from a coordinator until it returns a response.
cf-coordinator-read-latency	ms/op	The min, median, max, 90th, and 99th percentiles of client reads on this table. The time period starts when a node receives a client read request and ends when the node responds to the client. Depending on consistency level and replication factor, this may include the network latency from requesting the data’s replicas.
cf-coordinator-read-ops	/sec	The number of read requests per second for a particular table on the coordinator nodes. Monitoring the number of requests over a given time period reveals table read workload and usage patterns.
cf-cells-scanned-during-read	cells	The min, median, max, 90th, and 99th percentile of how many cells were scanned during a read.
cf-tier-size	—	Disk space used by a table by SSTables for the tier.
cf-tier-sstables	sstables	Number of SSTables in a tier for a table.
cf-tier-max-data-age	—	Timestamp in local server time that represents an upper bound to the newest piece of data stored in the SSTable. When a new SSTable is flushed, it is set to the time of creation. When an SSTable is created from compaction, it is set to the max of all merged SSTables.
cf-percentage-repaired	%	Percentage of data (uncompressed) marked as repaired for a given table on a node. This metric is only meaningful for replication factor > 1.
nodesync-tbl-data-repaired	bytes	Bytes of data that were inconsistent and needed synchronization.
nodesync-tbl-data-validated	bytes	Bytes of data checked for consistency.
nodesync-tbl-repair-data-sent	bytes	Total bytes of data transferred between all nodes during synchronization.
nodesync-tbl-objects-repaired	objects	Number of rows and range tombstones that were inconsistent and needed synchronization.
nodesync-tbl-objects-validated	objects	Number of rows and range tombstones checked for consistency.
nodesync-tbl-repair-objects-sent	objects	Total number of rows and range tombstones transferred between all nodes during synchronization.
nodesync-tbl-processed-pages	pages	Number of pages (internal groupings of data) processed.
nodesync-tbl-full-in-sync-pages	pages	Number of processed pages that were not in need of synchronization.
nodesync-tbl-full-repaired-pages	pages	Number of processed pages that were in need of synchronization.
nodesync-tbl-partial-in-sync-pages	pages	Number of in-sync pages for which a response was received from only some of the replicas.
nodesync-tbl-partial-repaired-pages	pages	Number of repaired pages for which a response was received from only some of the replicas.
nodesync-tbl-uncompleted-pages	pages	Number of processed pages not having enough responses to perform synchronization.
nodesync-tbl-failed-pages	pages	Number of processed pages for which an unknown error prevented proper synchronization completion.

Storage tier metrics keys

This list of keys corresponds to storage tier-specific metrics collected by OpsCenter:

Key	Units	Description
cf-tier-size	—	Disk space used by a table by SSTables for the tier.
cf-tier-sstables	sstables	Number of SSTables in a tier for a table.
cf-tier-max-data-age	—	Timestamp in local server time that represents an upper bound to the newest piece of data stored in the SSTable. When a new SSTable is flushed, it is set to the time of creation. When an SSTable is created from compaction, it is set to the max of all merged SSTables.

Operating system metrics keys

This list of keys corresponds to operating system (OS) metrics collected by OpsCenter:

Key	OS	Units	Description
os-memory	linux	MB	Stacked graph of used, cached, and free memory.
os-memory-osx	osx	MB	Stacked graph of used and free memory.
os-memory-free	linux, osx	MB	Total system memory currently free.
os-memory-used	linux, osx	MB	Total system memory currently used.
os-memory-shared	linux	MB	Total amount of memory in shared memory space.
os-memory-buffers	linux	MB	Total system memory currently buffered.
os-memory-cached	linux	MB	Total system memory currently cached.
os-memory-win	windows	MB	Stacked graph of committed, cached, paged, non-paged, and free memory.
os-memory-avail	windows	MB	Available physical memory.
os-memory-committed	windows	MB	Memory in use by the operating system.
os-memory-pool-paged	windows	MB	Allocated pool-paged-resident memory.
os-memory-pool-nonpaged	windows	MB	Allocated pool-nonpaged memory.
os-memory-sys-cache-resident	windows	MB	Memory used by the file cache.
cpu	linux	—	Stacked graph of iowait, steal, nice, system, user, and idle CPU usage.
cpu-osx	osx	—	Stacked graph of idle, user, and system CPU usage.
cpu-win	windows	—	Stacked graph of user, privileged, and idle CPU usage.
os-cpu-user	—	—	Time the CPU devotes to user processes.
os-cpu-system	linux, osx	—	Time the CPU devotes to system processes.
os-cpu-idle	—	—	Time the CPU is idle.
os-cpu-iowait	linux	—	Time the CPU devotes to waiting for I/O to complete.
os-cpu-steal	linux	—	Time the CPU devotes to tasks stolen by virtual operating systems.
os-cpu-nice	linux	—	Time the CPU devotes to processing nice tasks.
os-cpu-privileged	windows	—	Time the CPU devotes to processing privileged instructions.
os-load	—	—	Operating system load average. One-minute value parsed from /proc/loadavg on Linux systems.
os-disk-usage	—	—	Disk space used by Cassandra at a given time.
os-disk-free	—	GB	Free space on a specific disk partition.
os-disk-used	—	GB	Disk space used by Cassandra at a given time.
os-disk-read-throughput	linux, windows	MB/sec	Average disk throughput for read operations.
os-disk-write-throughput	linux, windows	MB/sec	Average disk throughput for write operations.
os-disk-throughput	osx	MB/sec	Average disk throughput for read and write operations.
os-disk-read-rate	linux, windows	/sec	Rate of reads per second to the disk.
os-disk-write-rate	linux, windows	/sec	Rate of writes per second to the disk.
os-disk-await	linux, windows	ms	Average completion time of each request to the disk.
os-disk-request-size	linux, osx	sectors	Average size of read requests issued to the disk.
os-disk-request-size-kb	windows	KB	Average size of read requests issued to the disk.
os-disk-queue-size	linux, windows	requests	Average number of requests queued due to disk latency issues.
os-disk-utilization	linux, windows	—	CPU time consumed by disk I/O.
os-net-received	—	KB/sec	Speed of data received from the network.
os-net-sent	—	KB/sec	Speed of data sent across the network.
os-disk-space	—	GB	—
os-disk-throughput-grouped	linux, windows	MB/sec	—
os-disk-rate	linux, windows	/sec	—
os-net-traffic	—	KB/sec	—
os-net-sent-win	—	—	Speed of data sent across the network.
os-net-received-win	—	—	Speed of data received from the network.
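
The OS column indicates which keys apply on a given host platform. As a small sketch, the snippet below copies a few rows from the table into a dictionary and selects the keys listed for one platform, then builds a node-level device path from the route `GET /{cluster_id}/metrics/{node_ip}/{metric}/{device}`. The dictionary is a hand-copied subset, the function names are hypothetical, and it is an assumption here that disk metrics are the ones requested per device. The cluster name, IP address, and device name `sda` are placeholders.

```python
# Hand-copied subset of the table above: key -> platforms listed in the OS column.
OS_METRIC_PLATFORMS = {
    "os-memory-cached": {"linux"},
    "os-memory-avail": {"windows"},
    "os-disk-read-throughput": {"linux", "windows"},
    "os-disk-throughput": {"osx"},
    "os-disk-request-size": {"linux", "osx"},
    "os-disk-request-size-kb": {"windows"},
}


def keys_for_platform(platform):
    """Return the sorted keys from the subset that list the given platform."""
    return sorted(k for k, oses in OS_METRIC_PLATFORMS.items() if platform in oses)


def node_device_metric_path(cluster_id, node_ip, metric, device):
    """Path for GET /{cluster_id}/metrics/{node_ip}/{metric}/{device}."""
    return f"/{cluster_id}/metrics/{node_ip}/{metric}/{device}"


if __name__ == "__main__":
    for key in keys_for_platform("linux"):
        print(key)
    print(node_device_metric_path("Test_Cluster", "10.0.0.1", "os-disk-read-throughput", "sda"))
```

Filtering by platform first avoids requesting a key, such as `os-disk-request-size-kb`, on a host where the table does not list it.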
