Statistics gathered for user activity 

User activity data is stored in two main ways: ordered by latency, to quickly identify the hot spots in the system, and by user, to retrieve statistics for a particular client connection.

To identify which users are currently experiencing the highest average latencies on a given node, you can query these tables:
  • user_read_io_snapshot
  • user_write_io_snapshot

These tables record the mean read/write latencies and total read/write counts per user on each node. They are ordered by their mean latency values, so you can quickly see which users are experiencing the highest average latencies on a given node. Having identified the users experiencing the highest latency on a node, you can then drill down to find the hot spots for those clients.

To do this, query the user_object_read_io_snapshot and user_object_write_io_snapshot tables. These tables store the mean read/write latency and total read/write count by table for the specified user. They are ordered by mean latency value, so they quickly show which tables contribute most to the latency experienced by a given user.
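The two-step drill-down described above can be sketched in Python over in-memory rows. This is only an illustration of the workflow, not the Performance Service's implementation: the row types, field names (node, user, keyspace, table, mean_latency, total_ops), and sample values are all hypothetical and do not reflect the actual table schemas.

```python
from typing import NamedTuple


class UserRow(NamedTuple):
    # Hypothetical shape of a per-user snapshot row.
    node: str
    user: str
    mean_latency: float
    total_ops: int


class UserObjectRow(NamedTuple):
    # Hypothetical shape of a per-user, per-table snapshot row.
    node: str
    user: str
    keyspace: str
    table: str
    mean_latency: float
    total_ops: int


def top_users(rows, node, limit=3):
    """Step 1: users on one node, highest mean latency first."""
    on_node = [r for r in rows if r.node == node]
    return sorted(on_node, key=lambda r: r.mean_latency, reverse=True)[:limit]


def hot_tables(rows, node, user, limit=3):
    """Step 2: tables contributing most to one user's latency on that node."""
    mine = [r for r in rows if r.node == node and r.user == user]
    return sorted(mine, key=lambda r: r.mean_latency, reverse=True)[:limit]


# Made-up sample data for illustration only.
user_rows = [
    UserRow("10.0.0.1", "alice", 120.0, 40),
    UserRow("10.0.0.1", "bob", 950.0, 12),
    UserRow("10.0.0.2", "carol", 2000.0, 5),
]
object_rows = [
    UserObjectRow("10.0.0.1", "bob", "shop", "users", 300.0, 8),
    UserObjectRow("10.0.0.1", "bob", "shop", "orders", 1500.0, 4),
    UserObjectRow("10.0.0.1", "alice", "shop", "orders", 120.0, 40),
]

worst = top_users(user_rows, "10.0.0.1", limit=1)[0]
hot = hot_tables(object_rows, "10.0.0.1", worst.user)
```

In this sketch the first call picks out the user with the highest mean latency on the node, and the second lists that user's tables in descending order of mean latency.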

The data in these tables is refreshed periodically (by default every 10 seconds), so querying them always provides an up-to-date view of the data objects with the highest mean latencies on a given node. Because this is time-sensitive data, if a user performs no activity for a period, no data is recorded for them in these tables.
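The effect of the periodic refresh can be illustrated with a small Python sketch. It assumes, purely for illustration, that each refresh rebuilds the snapshot from the operations observed during the most recent interval; the function name build_snapshot and the (user, latency) event format are hypothetical.

```python
from collections import defaultdict


def build_snapshot(events):
    """Build one snapshot from (user, latency) events seen in one interval.

    Returns (user, mean_latency, op_count) tuples, highest mean latency first.
    Users with no events in the interval do not appear at all.
    """
    totals = defaultdict(lambda: [0.0, 0])  # user -> [latency sum, op count]
    for user, latency in events:
        totals[user][0] += latency
        totals[user][1] += 1
    rows = [(user, s / n, n) for user, (s, n) in totals.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# First interval: both users are active.
first = build_snapshot([("alice", 10.0), ("alice", 30.0), ("bob", 50.0)])

# Next interval: alice performs no activity, so she has no row.
second = build_snapshot([("bob", 40.0)])
```

Under this assumption, a user who is idle for an interval simply disappears from the next snapshot, which matches the behavior described above.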

The user_object_io table also reports per-node user activity broken down by keyspace/table and retains it over a longer period (4 hours by default). This allows the Performance Service to query by node and user to see latency metrics from all tables or restricted to a single keyspace or table. The data in this table is updated periodically (again every 10 seconds by default).
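The query pattern described above (by node and user, optionally narrowed to one keyspace or one table) can be sketched as a filter over in-memory rows. The row type, its field names, and the select_rows helper are hypothetical and are not the actual schema or API of the user_object_io table.

```python
from typing import NamedTuple


class ObjectIoRow(NamedTuple):
    # Hypothetical shape of a per-node, per-user, per-table row.
    node: str
    user: str
    keyspace: str
    table: str
    mean_latency: float


def select_rows(rows, node, user, keyspace=None, table=None):
    """Return rows for one node and user, optionally restricted further.

    With no keyspace, all tables are returned; with a keyspace, only tables
    in that keyspace; with a keyspace and a table, only that table.
    """
    if table is not None and keyspace is None:
        raise ValueError("a table restriction requires a keyspace")
    return [
        r for r in rows
        if r.node == node
        and r.user == user
        and (keyspace is None or r.keyspace == keyspace)
        and (table is None or r.table == table)
    ]


# Made-up sample data for illustration only.
io_rows = [
    ObjectIoRow("10.0.0.1", "bob", "shop", "orders", 1500.0),
    ObjectIoRow("10.0.0.1", "bob", "shop", "users", 300.0),
    ObjectIoRow("10.0.0.1", "bob", "logs", "events", 80.0),
    ObjectIoRow("10.0.0.1", "alice", "shop", "orders", 120.0),
]

all_tables = select_rows(io_rows, "10.0.0.1", "bob")
one_keyspace = select_rows(io_rows, "10.0.0.1", "bob", keyspace="shop")
one_table = select_rows(io_rows, "10.0.0.1", "bob", keyspace="shop", table="orders")
```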

The user_io table reports aggregate latency metrics for users on a single node. Using this table, you can query by node and user to see high-level latency statistics across all keyspaces.