nodetool tpstats
Outputs thread pool usage statistics.
A DataStax Enterprise (DSE) database is based on a Staged Event Driven Architecture (SEDA). The database separates different tasks into stages connected by a messaging service. Each stage has a queue and a thread pool. Some stages skip the messaging service and queue tasks immediately on a different stage when it exists on the same node. If the next stage is too busy, its queue can back up and cause performance bottlenecks. For more information, see Monitor DataStax Enterprise (DSE) clusters.
Reports are updated after SSTables change through compaction or flushing.
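The way a busy downstream stage backs up can be illustrated with a toy model. The sketch below is not DSE code: the `Stage` class, its `tick` method, and the stage names are hypothetical, and each "thread" is reduced to one task processed per time step.

```python
from collections import deque


class Stage:
    """Toy stage: a FIFO queue drained by a fixed number of worker slots."""

    def __init__(self, name, threads, next_stage=None):
        self.name = name
        self.threads = threads        # tasks this stage can process per tick
        self.next_stage = next_stage  # stage that receives finished tasks
        self.queue = deque()
        self.completed = 0

    @property
    def pending(self):
        return len(self.queue)

    def submit(self, task):
        self.queue.append(task)

    def tick(self):
        """Process up to `threads` queued tasks and pass them downstream."""
        for _ in range(min(self.threads, len(self.queue))):
            task = self.queue.popleft()
            self.completed += 1
            if self.next_stage is not None:
                self.next_stage.submit(task)


# A fast stage (4 slots) feeding a slow stage (1 slot).
slow = Stage("slow", threads=1)
fast = Stage("fast", threads=4, next_stage=slow)

for task_id in range(8):
    fast.submit(task_id)

for _ in range(2):
    fast.tick()
    slow.tick()

print(fast.pending, fast.completed)  # 0 8
print(slow.pending, slow.completed)  # 6 2
```

After two steps the fast stage has drained its queue, while the slow stage has six tasks waiting. In this model that waiting count plays the role of the Pending column.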
Report columns
The nodetool tpstats command report includes the following columns:
- Active
  The number of Active threads.
- Pending
  The number of Pending requests waiting to be served by the thread pool. If this value increases or stagnates at a large number, you might need to scale up the cluster’s capacity.
  After observing the cluster long enough to establish a normal baseline, configure alerts for any increases above normal in the pending tasks column.
- Backpressure
  The number of requests that are Pending because the thread pool is full. High Backpressure values indicate that the node is struggling to keep up with the incoming workload and needs to signal other nodes to slow down the rate of incoming requests.
- Delayed
  The number of requests that are Delayed because the thread pool is full.
- Shared
  The number of tasks that are Shared between threads in the thread pool.
- Stolen
  The number of tasks that this thread pool has taken from other thread pools.
- Completed
  The number of tasks Completed by this thread pool.
- Blocked
  The number of requests that are currently Blocked because the thread pool for the next step in the service is full.
- All-Time Blocked
  The total number of All-Time Blocked requests, which are all requests blocked in this thread pool up to now.
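The baseline alerting described for the Pending column could be sketched as follows. This is a minimal sketch, not a supported tool. It assumes the five-column text layout shown in this page's example output (pool name, then Active, Pending, Completed, Blocked, All time blocked). The function names, the baseline values, and the sample report are hypothetical.

```python
COLUMNS = ("active", "pending", "completed", "blocked", "all_time_blocked")


def parse_pools(report):
    """Return {pool name: {column: int}} for the thread pool rows of a report."""
    pools = {}
    for line in report.splitlines():
        if line.strip().startswith("Message type"):
            break  # the dropped-message section follows the pool rows
        parts = line.split()
        values = parts[-len(COLUMNS):]
        # A pool row is a name followed by exactly five integer columns.
        if len(parts) == len(COLUMNS) + 1 and all(v.isdigit() for v in values):
            pools[parts[0]] = dict(zip(COLUMNS, map(int, values)))
    return pools


def pending_over_baseline(pools, baseline, default_limit=0):
    """List (pool, pending) pairs whose pending count exceeds its baseline."""
    return sorted(
        (name, stats["pending"])
        for name, stats in pools.items()
        if stats["pending"] > baseline.get(name, default_limit)
    )


# Made-up sample values for illustration only, not real cluster data.
report = """Pool Name Active Pending Completed Blocked All time blocked
ReadStage 0 3 7 0 0
MutationStage 1 12 2 0 0
CompactionExecutor 0 0 76 0 0
Message type Dropped Latency waiting in queue (micros)
READ 0 N/A N/A N/A N/A
"""

# Hypothetical per-pool baselines established by observing the cluster.
baseline = {"ReadStage": 5, "MutationStage": 5}
print(pending_over_baseline(parse_pools(report), baseline))
```

If a deployment prints additional columns, such as Backpressure or Delayed, the `COLUMNS` tuple would need to be adjusted to match.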
Report rows
For each task in the nodetool tpstats output, the report also includes aggregated statistics for the associated subtasks and properties on the node:
- AntiEntropyStage
  Processing repair messages and streaming. See nodetool repair.
- BackgroundIoStage
  Completes background tasks like submitting hints and deserializing the row cache.
- CacheCleanupExecutor
  Clearing the cache.
- CommitlogArchiver
  Copying or archiving commitlog files for recovery.
- CompactionExecutor
  Running compaction.
- CounterMutationStage
  Processing local counter changes. Backs up if the write rate exceeds the mutation rate.
  A high pending count is seen if consistency level is set to ONE and there is a high counter increment workload.
- GossipStage
  Distributing node information through Gossip. Out of sync schemas can cause issues. You might need to sync using nodetool resetlocalschema.
- HintedHandoff
  Sending missed mutations to other nodes. Usually a symptom of a problem elsewhere. Use nodetool disablehandoff and run repair.
- HintsDispatcher
  Dispatches a single hints file to a specified node in a batched manner.
- InternalResponseStage
  Responding to non-client initiated messages, including bootstrapping and schema checking.
- MemtableFlushWriter
  Writing memtable contents to disk. Might back up if the queue overruns the disk I/O or due to sorting processes.
  nodetool tpstats no longer reports blocked threads in the MemtableFlushWriter pool. Check the Pending Flushes metric reported by nodetool tablestats.
- MemtablePostFlush
  Cleaning up after flushing the memtable (discarding commit logs and secondary indexes as needed).
- MemtableReclaimMemory
  Making unused memory available.
- MigrationStage
  Processing schema changes.
- MiscStage
  Snapshotting, replicating data after node remove completed.
- MutationStage
  Performing local inserts/updates, schema merges, commit log replays or hints in progress.
  A high number of Pending write requests indicates the node is having a problem handling them. Fix this by adding a node, tuning hardware and configuration, and/or updating data models.
- Native-Transport-Requests
  Processing CQL requests to the server.
- PendingRangeCalculator
  Calculating pending ranges per bootstraps and departed nodes. Reporting by this tool isn’t considered useful.
- PerDiskMemtableFlushWriter_N
  Activity for the memtable flush writer of each disk.
- ReadRepairStage
  Performing read repairs. Usually fast, if there is good connectivity between replicas.
  If Pending grows too large, attempt to lower the rate for high-read tables by altering the table to use a smaller read_repair_chance value, like 0.11.
- ReadStage
  Performing local reads. Also includes deserializing data from row cache.
  Pending values can cause increased read latency. Generally resolved by adding nodes or tuning the system.
- RequestResponseStage
  Handling responses from other nodes.
- ValidationExecutor
  Validating schema.
Droppable messages
The database generates the messages listed below, but discards them after a timeout.
The nodetool tpstats command reports the number of messages of each type that have been dropped.
You can view the messages themselves using a JMX client.
| Message Type | Stage | Notes |
|---|---|---|
| | n/a | Deprecated |
| | n/a (special) | Used for recording traces (nodetool settraceprobability). Has a special executor (1 thread, 1000 queue depth) that throws away messages on insertion instead of within the execute |
| | | If a write message is processed after its timeout ( |
| | | If a write message is processed after its timeout ( |
| | | Times out after |
| | | Times out after |
| | | Times out after |
| | | Times out after |
| | | Times out after |
Synopsis
nodetool <options> tpstats
Tarball and Installer No-Services path:
<installation_location>/resources/cassandra/bin
| Short | Long | Description |
|---|---|---|
| | | Hostname or IP address. To get thread pool statistics for a remote node, you must provide the appropriate connection options, including credentials for JMX authentication. |
| | | Output format. |
| | | Port number. |
| | | Password file path. |
| | | Password. |
| | | Remote JMX agent username. |
| | | Separates an option from an argument that could be mistaken for an option. |
Example
Running nodetool tpstats:
nodetool tpstats
Example output is:
Pool Name Active Pending Completed Blocked All time blocked
ReadStage 0 0 7 0 0
ContinuousPagingStage 0 0 0 0 0
MiscStage 0 0 0 0 0
CompactionExecutor 0 0 76 0 0
MutationStage 0 0 2 0 0
GossipStage 0 0 0 0 0
RequestResponseStage 0 0 0 0 0
ReadRepairStage 0 0 0 0 0
CounterMutationStage 0 0 0 0 0
MemtablePostFlush 0 0 182 0 0
ValidationExecutor 0 0 0 0 0
MemtableFlushWriter 0 0 52 0 0
ViewMutationStage 0 0 0 0 0
CacheCleanupExecutor 0 0 0 0 0
MemtableReclaimMemory 0 0 52 0 0
PendingRangeCalculator 0 0 1 0 0
AntiCompactionExecutor 0 0 0 0 0
SecondaryIndexManagement 0 0 0 0 0
HintsDispatcher 0 0 0 0 0
Native-Transport-Requests 0 0 0 0 0
MigrationStage 0 0 12 0 0
PerDiskMemtableFlushWriter_0 0 0 51 0 0
Sampler 0 0 0 0 0
InternalResponseStage 0 0 0 0 0
AntiEntropyStage 0 0 0 0 0
Message type Dropped Latency waiting in queue (micros)
50% 95% 99% Max
READ 0 N/A N/A N/A N/A
RANGE_SLICE 0 N/A N/A N/A N/A
_TRACE 0 N/A N/A N/A N/A
HINT 0 N/A N/A N/A N/A
MUTATION 0 N/A N/A N/A N/A
COUNTER_MUTATION 0 N/A N/A N/A N/A
BATCH_STORE 0 N/A N/A N/A N/A
BATCH_REMOVE 0 N/A N/A N/A N/A
REQUEST_RESPONSE 0 N/A N/A N/A N/A
PAGED_RANGE 0 N/A N/A N/A N/A
READ_REPAIR 0 N/A N/A N/A N/A
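The dropped-message section of this output can be turned into counts per message type with a short script. The following is a sketch under stated assumptions: the text layout matches the example above, nodetool is on the PATH, and no connection options are needed. The helper names `run_tpstats` and `parse_dropped` are hypothetical.

```python
import subprocess


def run_tpstats():
    """Run `nodetool tpstats` locally and return its text output.

    Assumes nodetool is on PATH and no extra connection options are needed.
    """
    result = subprocess.run(
        ["nodetool", "tpstats"], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_dropped(report):
    """Return {message type: dropped count} from the dropped-message section."""
    dropped = {}
    in_section = False
    for line in report.splitlines():
        if line.strip().startswith("Message type"):
            in_section = True
            continue
        parts = line.split()
        # Data rows have an integer count in the second column; the
        # percentile header row ("50% 95% 99% Max") does not.
        if in_section and len(parts) >= 2 and parts[1].isdigit():
            dropped[parts[0]] = int(parts[1])
    return dropped


# Lines copied from the example output above.
sample = """Message type Dropped Latency waiting in queue (micros)
50% 95% 99% Max
READ 0 N/A N/A N/A N/A
MUTATION 0 N/A N/A N/A N/A
"""
print(parse_dropped(sample))  # {'READ': 0, 'MUTATION': 0}
```

On a live node, `parse_dropped(run_tpstats())` would apply the same parsing to real output. Any nonzero count is worth investigating against the timeouts described under Droppable messages.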