nodetool tpstats

Provides usage statistics of thread pools.

Synopsis

nodetool <options> tpstats
Table 1. Options
Short | Long | Description
-h | --host | Hostname or IP address
-p | --port | Port number
-pwf | --password-file | Password file path
-pw | --password | Password
-u | --username | User name
-- | | Separates an option from an argument that could be mistaken for an option.

Description

Cassandra is based on a Staged Event Driven Architecture (SEDA). Different tasks are separated into stages that are connected by a messaging service. Each stage has a queue and a thread pool. Some stages skip the messaging service and queue tasks directly on a different stage when that stage exists on the same node. If the next stage is too busy to keep up, its queue backs up, which can cause performance bottlenecks.
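
The stage model described above can be illustrated with a toy sketch. This is not Cassandra's implementation: the Stage class, its methods, and the stage names are hypothetical, and the counters only mimic the active, pending, and completed figures that tpstats reports.

```python
import queue
import threading


class Stage:
    """Toy SEDA stage: a task queue drained by a fixed pool of worker threads."""

    def __init__(self, name, workers, handler, next_stage=None):
        self.name = name
        self.handler = handler
        self.next_stage = next_stage      # stage that receives this stage's results
        self.tasks = queue.Queue()        # tasks waiting here are "pending"
        self.lock = threading.Lock()
        self.active = 0                   # tasks currently being executed
        self.completed = 0                # tasks finished so far
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    @property
    def pending(self):
        return self.tasks.qsize()

    def submit(self, item):
        self.tasks.put(item)

    def drain(self):
        """Block until every submitted task has been processed."""
        self.tasks.join()

    def _run(self):
        while True:
            item = self.tasks.get()
            with self.lock:
                self.active += 1
            try:
                result = self.handler(item)
                if self.next_stage is not None:
                    # Hand the result straight to the next stage on the same node.
                    self.next_stage.submit(result)
            finally:
                # Note: a handler that raises ends this worker; a real pool would recover.
                with self.lock:
                    self.active -= 1
                    self.completed += 1
                self.tasks.task_done()


if __name__ == "__main__":
    results = []
    sink = Stage("sink", 1, results.append)
    source = Stage("source", 2, lambda x: x * 2, next_stage=sink)
    for i in range(5):
        source.submit(i)
    source.drain()
    sink.drain()
    print(source.name, source.active, source.pending, source.completed)
    print(sink.name, sink.active, sink.pending, sink.completed)
```

If the sink handler were slower than the source, tasks would accumulate in the sink queue, which is the kind of backlog a growing pending count indicates.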

The nodetool tpstats command provides statistics about the number of active, pending, and completed tasks for each stage of Cassandra operations by thread pool. It's updated when SSTables change through compaction or flushing.

Run the nodetool tpstats command on a local node to get thread pool statistics. This table describes key indicators:

Table 2. nodetool tpstats output
Name of statistic | Task | Related information
AntiEntropyStage | Repair consistency | nodetool repair
CacheCleanupExecutor | Clears the cache |
CommitLogArchiver | Archives commitlog |
CompactionExecutor | Runs compaction |
CounterMutationStage | Local counter changes | Will back up if the write rate exceeds the mutation rate. A high pending count will be seen if consistency level is set to ONE and there is a high counter increment workload.
GossipStage | Handle gossip rounds every second | Out of sync schemas can cause issues. nodetool resetlocalschema may need to be used.
HintedHandoff | Send missed mutations to other nodes | Usually a symptom of a problem elsewhere. Use nodetool disablehandoff and run repair.
InternalResponseStage | Respond to non-client initiated messages, including bootstrapping and schema checking |
MemtableFlushWriter | Writes memtable contents to disk | Will back up if the queue is overrunning the disk I/O capabilities. Sorting can also cause issues if the queue has a high load associated with a small number of flushes. The cause can be huge rows with large column names or inserting too many values into a CQL collection. For disk issues, add nodes or tune configuration. Warning: nodetool tpstats does not report blocked threads in the MemtableFlushWriter pool.
MemtablePostFlush | Operations after flushing the memtable | Discard commit log files and flush secondary indexes.
MemtableReclaimMemory | Makes unused memory available |
MigrationStage | Make schema changes |
MiscStage | Miscellaneous operations | Snapshotting and replicating data after a node removal completes.
MutationStage | Local writes | A high number of pending write requests indicates a problem handling them. Adding a node, tuning hardware and configuration, or updating data models can improve handling.
Native-Transport-Requests | Requests to the server using the CQL Native Protocol |
PendingRangeCalculator | Calculate pending ranges per bootstraps and departed nodes | Developer notes
ReadRepairStage | A digest query and update of replicas of a key | Fast, provided that good connectivity exists between replicas. If pending grows too large, attempt to lower the rate for high-read tables by altering the table to use a smaller read_repair_chance value, such as 0.11.
ReadStage | Local reads | Performing a local read. Also includes deserializing data from row cache. A high pending count can increase read latency. Generally resolved by adding nodes or tuning the system.
RequestResponseStage | Handle responses from other nodes |
ValidationExecutor | Validates schema |
Table 3. Droppable Messages
Message Type | Stage | Notes
BINARY | n/a | This is deprecated and no longer has any use.
_TRACE | n/a (special) | Used for recording traces (nodetool settraceprobability). Has a special executor (1 thread, 1000 queue depth) that throws away messages on insertion instead of within the execute.
MUTATION | MutationStage | If a write message is processed after its timeout (write_request_timeout_in_ms), either a failure was already sent to the client, or the write met its requested consistency level and relies on hinted handoff and read repair to apply the mutation.
COUNTER_MUTATION | MutationStage | If a write message is processed after its timeout (write_request_timeout_in_ms), either a failure was already sent to the client, or the write met its requested consistency level and relies on hinted handoff and read repair to apply the mutation.
READ_REPAIR | MutationStage | Times out after write_request_timeout_in_ms.
READ | ReadStage | Times out after read_request_timeout_in_ms. There is no point in servicing reads after that point because an error would already have been returned to the client.
RANGE_SLICE | ReadStage | Times out after range_request_timeout_in_ms.
PAGED_RANGE | ReadStage | Times out after request_timeout_in_ms.
REQUEST_RESPONSE | RequestResponseStage | Times out after request_timeout_in_ms. The response was completed and sent back, but not before the timeout.
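
The drop rule in Table 3 can be sketched as a lookup followed by a comparison. This is a minimal illustration, not Cassandra's code: the function names, the message tuples, and the timeout values used in the example are assumptions, and the mapping simply restates the table.

```python
from collections import Counter

# Timeout setting that governs each droppable message type, as listed in Table 3.
TIMEOUT_SETTING = {
    "MUTATION": "write_request_timeout_in_ms",
    "COUNTER_MUTATION": "write_request_timeout_in_ms",
    "READ_REPAIR": "write_request_timeout_in_ms",
    "READ": "read_request_timeout_in_ms",
    "RANGE_SLICE": "range_request_timeout_in_ms",
    "PAGED_RANGE": "request_timeout_in_ms",
    "REQUEST_RESPONSE": "request_timeout_in_ms",
}


def should_drop(message_type, queued_ms, settings):
    """Return True when a message waited longer than its timeout before processing."""
    setting = TIMEOUT_SETTING.get(message_type)
    if setting is None:
        # BINARY and _TRACE have no request timeout listed in Table 3.
        return False
    return queued_ms > settings[setting]


def process(messages, settings):
    """Split (message_type, queued_ms) pairs into handled messages and drop counts."""
    handled = []
    dropped = Counter()
    for message_type, queued_ms in messages:
        if should_drop(message_type, queued_ms, settings):
            dropped[message_type] += 1
        else:
            handled.append((message_type, queued_ms))
    return handled, dropped


if __name__ == "__main__":
    # Illustrative values only; read the real ones from your own configuration.
    settings = {
        "write_request_timeout_in_ms": 2000,
        "read_request_timeout_in_ms": 5000,
        "range_request_timeout_in_ms": 10000,
        "request_timeout_in_ms": 10000,
    }
    queue_snapshot = [("MUTATION", 2500), ("READ", 100), ("MUTATION", 10)]
    handled, dropped = process(queue_snapshot, settings)
    print(handled)
    print(dict(dropped))
```

The per-type drop counts produced this way correspond to the Dropped column that tpstats prints for each message type.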

Example

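Run the command against the host labcluster. To see how the counts change over time, repeat it every two seconds, for example with the watch utility.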

nodetool -h labcluster tpstats

Example output is:

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
CounterMutationStage              0         0              0         0                 0
ReadStage                         0         0            103         0                 0
RequestResponseStage              0         0              0         0                 0
MutationStage                     0         0       13234794         0                 0
ReadRepairStage                   0         0              0         0                 0
GossipStage                       0         0              0         0                 0
CacheCleanupExecutor              0         0              0         0                 0
AntiEntropyStage                  0         0              0         0                 0
MigrationStage                    0         0             11         0                 0
ValidationExecutor                0         0              0         0                 0
CommitLogArchiver                 0         0              0         0                 0
MiscStage                         0         0              0         0                 0
MemtableFlushWriter               0         0            126         0                 0
MemtableReclaimMemory             0         0            126         0                 0
PendingRangeCalculator            0         0              1         0                 0
MemtablePostFlush                 0         0           1468         0                 0
CompactionExecutor                0         0            254         0                 0
InternalResponseStage             0         0              1         0                 0
HintedHandoff                     0         0              0   

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
PAGED_RANGE                  0
BINARY                       0
READ                         0
MUTATION                   180
_TRACE                       0
REQUEST_RESPONSE             0
COUNTER_MUTATION             0
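
Output in this layout is straightforward to check with a script. The sketch below is one possible approach, assuming the two-section layout shown above and integer values in every column; the function names are hypothetical, and a row with missing trailing columns is kept with only the values it has.

```python
POOL_COLUMNS = ("active", "pending", "completed", "blocked", "all_time_blocked")


def parse_tpstats(text):
    """Parse tpstats text into (pools, dropped) dictionaries."""
    pools, dropped = {}, {}
    section = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if line.startswith("Pool Name"):
            section = "pools"
        elif line.startswith("Message type"):
            section = "dropped"
        elif section == "pools":
            # zip stops at the shorter sequence, so short rows stay partial.
            values = [int(v) for v in fields[1:]]
            pools[fields[0]] = dict(zip(POOL_COLUMNS, values))
        elif section == "dropped":
            dropped[fields[0]] = int(fields[1])
    return pools, dropped


def summarize(pools, dropped):
    """Return pool names with pending or blocked tasks, and message types with drops."""
    backed_up = sorted(
        name for name, stats in pools.items()
        if stats.get("pending", 0) > 0 or stats.get("blocked", 0) > 0
    )
    lossy = sorted(name for name, count in dropped.items() if count > 0)
    return backed_up, lossy


SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0            103         0                 0
MutationStage                     0         0       13234794         0                 0

Message type           Dropped
READ                         0
MUTATION                   180
"""

if __name__ == "__main__":
    pools, dropped = parse_tpstats(SAMPLE)
    print(summarize(pools, dropped))
```

In practice the text would come from the command itself rather than an embedded sample, and the thresholds for what counts as a problem depend on the workload.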