The nodetool utility

The nodetool utility is a command-line interface for managing a Cassandra cluster.

Command format 

  • Packaged installs: nodetool -h HOSTNAME [-p JMX_PORT] COMMAND
  • Tarball installs: <install_location>/bin/nodetool -h HOSTNAME [-p JMX_PORT] COMMAND
  • Remote Method Invocation: nodetool -h HOSTNAME [-p JMX_PORT -u JMX_USERNAME -pw JMX_PASSWORD] COMMAND

    You must specify the credentials if a username and password for RMI authentication are set explicitly in the cassandra-env.sh file for the host.
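When scripting, the command format above can be assembled as an argument list. The following is a minimal Python sketch, not part of nodetool itself: the helper name nodetool_argv and the example host and port values are assumptions made for illustration.

```python
def nodetool_argv(host, command, *args, port=None, username=None, password=None):
    """Build an argument list of the form:
    nodetool -h HOSTNAME [-p JMX_PORT -u JMX_USERNAME -pw JMX_PASSWORD] COMMAND
    """
    argv = ["nodetool", "-h", host]
    if port is not None:
        argv += ["-p", str(port)]
    if username is not None:
        argv += ["-u", username]
    if password is not None:
        argv += ["-pw", password]
    argv.append(command)
    argv += list(args)  # command-specific parameters, if any
    return argv


# Hypothetical host and port, for illustration only.
print(" ".join(nodetool_argv("10.0.0.1", "status", port=7199)))
```

On a machine where nodetool is installed, the returned list could be passed to subprocess.run(argv, check=True).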

Options 

Flag       Option                  Description
-a         --include-all-sstables  Rewrite/upgrade all SSTables including the most recent when rebuilding SSTables.
-et        --end-token arg         Token at which repair range ends.
-h         --host arg              Hostname of node or IP address.
-local     --in-local-dc           Only repair against nodes in the same data center.
-p         --port arg              Remote JMX agent port number.
-pr        --partitioner-range     Repair only the first range returned by the partitioner for the node.
-pw        --password arg          Remote JMX agent password.
-st        --start-token arg       Token at which repair range starts.
-T         --tokens                Display all tokens.
-u         --username arg          Remote JMX agent username.
Snapshot options only
-cf        --column-family arg     Only take a snapshot of the specified table.
-snapshot  --with-snapshot         Repair one node at a time using snapshots.
-t         --tag arg               Optional name to give a snapshot.

Commands 

Square brackets indicate optional parameters.

cfhistograms keyspace table
Displays statistics on the read/write latency for a table. These statistics, which include row size, column count, and bucket offsets, can be useful for monitoring activity in a table.
cfstats
Displays statistics for every keyspace and table.
cleanup [keyspace] [table]
Triggers the immediate cleanup of keys no longer belonging to this node. This has roughly the same effect on a node that a major compaction does in terms of a temporary increase in disk space usage and an increase in disk I/O. Optionally takes a list of table names.
clearsnapshot [keyspaces...] -t [snapshotName]
Deletes snapshots for the specified keyspaces. You can remove all snapshots or remove the snapshots with the given name.
compact [keyspace] [table]
For tables that use the SizeTieredCompactionStrategy, initiates an immediate major compaction of all tables in keyspace. For each table in keyspace, this compacts all existing SSTables into a single SSTable. This can cause considerable disk I/O and can temporarily cause up to twice as much disk space to be used. Optionally takes a list of table names.
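Because a major compaction can temporarily use up to twice as much disk space, a free-space check before running compact may be worthwhile. The following is a minimal Python sketch under that worst-case assumption; has_compaction_headroom is a hypothetical helper, not a nodetool feature, and the sizes are made-up values.

```python
def has_compaction_headroom(data_bytes, free_bytes):
    """Return True if free space can hold a second copy of the data.

    Worst case from the description: usage temporarily doubles, so the
    extra space needed is roughly the current data size.
    """
    return free_bytes >= data_bytes


GIB = 1024 ** 3
print(has_compaction_headroom(200 * GIB, 250 * GIB))  # True
print(has_compaction_headroom(200 * GIB, 150 * GIB))  # False
```

The free space of a volume can be read with shutil.disk_usage(path).free from the Python standard library.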
compactionstats
Displays compaction statistics.
decommission
Tells a live node to decommission itself (streaming its data to the next node on the ring). Use netstats to monitor the progress. Also see http://wiki.apache.org/cassandra/Operations#Removing_nodes_entirely.
describering keyspace
Shows the partition ranges (formerly token ranges) for a given keyspace.
disablebackup
Disable incremental backup.
disablebinary
Disable native transport (binary protocol).
disablegossip
Disable Gossip. Effectively marks the node dead.
disablehandoff
Disable storing of future hints on the current node.
disablethrift
Disable the Thrift server.
drain
Flushes all memtables from the node to SSTables on disk. Cassandra stops listening for connections from the client and other nodes. You need to restart Cassandra after running nodetool drain. You typically use this command before upgrading a node to a new version of Cassandra. To simply flush memtables to disk, use nodetool flush.
enablebackup
Enable incremental backup.
enablebinary
Re-enable native transport (binary protocol).
enablegossip
Re-enables Gossip.
enablehandoff
Re-enable storing future hints on the current node.
enablethrift
Re-enable the Thrift server.
flush [keyspace] [table]
Flushes all memtables for a keyspace to disk, allowing the commit log to be cleared. Optionally takes a list of table names.
getcompactionthreshold keyspace table
Gets the current compaction threshold settings for a table. See http://wiki.apache.org/cassandra/MemtableSSTable.
getendpoints keyspace table key
Displays the endpoints that own the key. The key is accepted only in hexadecimal format.
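Since getendpoints accepts the key only in hexadecimal, a text key has to be converted first. The following is a minimal Python sketch that assumes the key is a text value stored as UTF-8 bytes; keys of other types serialize differently, and key_to_hex is a hypothetical helper name.

```python
def key_to_hex(key, encoding="utf-8"):
    """Hex-encode the bytes of a text key (assumes a UTF-8 text key)."""
    return key.encode(encoding).hex()


print(key_to_hex("user42"))  # 757365723432
```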
getsstables keyspace table key
Displays the file names of the SSTables that own the key.
gossipinfo
Shows the gossip information for the cluster.
info [-T or --tokens]
Outputs node information including the token, load info (on disk storage), generation number (times started), uptime in seconds, and heap memory usage.
invalidatekeycache [keyspace] [tables]
Invalidates, or deletes, the key cache. Optionally takes a keyspace or list of table names. Leave a blank space between each table name.
invalidaterowcache [keyspace] [tables]
Invalidates, or deletes, the row cache. Optionally takes a keyspace or list of table names. Leave a blank space between each table name.
join
Causes the node to join the ring. This assumes that the node was initially not started in the ring, that is, started with -Djoin_ring=false. Note that the joining node should be properly configured with the desired options for seed list, initial token, and auto-bootstrapping.
move new_token
Moves a node to a new token. This essentially combines decommission and bootstrap. See http://wiki.apache.org/cassandra/Operations#Moving_nodes.
netstats host
Displays network information such as the status of data streaming operations (bootstrap, repair, move, and decommission) as well as the number of active, pending, and completed commands and responses.
pausehandoff
Pause the hints delivery process.
predictconsistency replication_factor time [versions] [latency_percentile]
Predict the latency and consistency "t" milliseconds after writes.
proxyhistograms
Print statistic histograms for network operations.
rangekeysample
Displays the sampled keys held across all keyspaces.
rebuild [source_dc_name]
Operates on multiple nodes in a cluster. Rebuilds data by streaming from other nodes (similar to bootstrap). Use this command to bring up a new data center in an existing cluster. For example, when adding a new data center, you would run the following on all nodes in the new data center:
nodetool rebuild -- name_of_existing_data_center
Attention: If you don't specify the existing data center in the command line, the new nodes will appear to rebuild successfully, but will not contain any data.

See Adding a data center to a cluster.
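The rebuild step for a new data center can be sketched as a list of commands, one per new node. This is a minimal Python illustration, assuming hypothetical host addresses and data center name; rebuild_commands is not part of nodetool. The guard reflects the warning above about omitting the existing data center.

```python
def rebuild_commands(new_dc_hosts, source_dc):
    """Return one 'nodetool rebuild -- <source_dc>' command per new node."""
    if not source_dc:
        # Without the source data center the nodes would end up empty.
        raise ValueError("name the existing data center to stream from")
    return [
        ["nodetool", "-h", host, "rebuild", "--", source_dc]
        for host in new_dc_hosts
    ]


# Hypothetical hosts and data center name, for illustration only.
for argv in rebuild_commands(["10.1.0.1", "10.1.0.2"], "DC1"):
    print(" ".join(argv))
```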

rebuild_index keyspace table_name.index_name,index_name1
Fully rebuilds the native index for a given table. Example of index_names: Standard3.IdxName, Standard3.IdxName1.
refresh keyspace [table]
Loads newly placed SSTables on to the system without restart.
removenode force | status Host ID
Removes a node by host ID, forces completion of a pending removal, or shows the status of the current node removal. To get the host ID, run status.
repair keyspace [table] [-pr]
Operates on multiple nodes in a cluster. Begins an anti-entropy node repair operation. If the -pr option is specified, only the first range returned by the partitioner for a node is repaired. This allows you to repair each node in the cluster in succession without duplicating work. Without -pr, all replica ranges that the node is responsible for are repaired. Optionally takes a list of table names.
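Repairing each node in succession with -pr can be sketched as an ordered list of commands. The following minimal Python sketch follows the syntax shown above; the host addresses, keyspace name, and the helper primary_range_repair_commands are assumptions for illustration.

```python
def primary_range_repair_commands(hosts, keyspace, tables=()):
    """Return one 'repair keyspace [table] -pr' command per node.

    The commands are meant to be run one after another, node by node.
    """
    return [
        ["nodetool", "-h", host, "repair", keyspace, *tables, "-pr"]
        for host in hosts
    ]


# Hypothetical cluster, for illustration only.
for argv in primary_range_repair_commands(["10.0.0.1", "10.0.0.2"], "my_keyspace"):
    print(" ".join(argv))
```

Running the commands sequentially, rather than in parallel, matches the "in succession" wording of the description.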
resetlocalschema
Reset the node's local schema and resync.
resumehandoff
Resume the hints delivery process.
ring
Displays node status and information about the ring as determined by the node being queried. This can give you an idea of the load balance and if any nodes are down. If your cluster is not properly configured, different nodes may show a different ring; this is a good way to check that every node views the ring the same way.
  • Address

    The node's URL.

  • DC (data center)

    The data center containing the node.

  • Rack

    The rack or, in the case of Amazon EC2, the availability zone of the node.

  • Status - U (up) or D (down)

    Indicates whether the node is functioning or not.

  • State - N (normal), L (leaving), J (joining), M (moving)

    The state of the node in relation to the cluster.

  • Load - updates every 90 seconds

    The amount of file system data under the Cassandra data directory after excluding all content in the snapshots subdirectories. Because all SSTable data files are included, any data that is not cleaned up (such as TTL-expired cells or tombstoned data) is counted.

  • Token

    The end of the token range up to and including the value listed. For an explanation of token ranges, see Data Distribution in the Ring.

  • Owns

    The percentage of the data owned by the node per data center times the replication factor. For example, a node can own 33% of the ring, but show 100% if the replication factor is 3.

  • Host ID

    The network ID of the node.

Note: If you are using virtual nodes (vnodes), use nodetool status; it is much less verbose.
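The arithmetic behind the Owns column can be illustrated with a short calculation. This is a minimal Python sketch of the description above (ring share times replication factor); capping the result at 100 percent is an assumption of the sketch, and effective_ownership is a hypothetical name.

```python
def effective_ownership(ring_share_percent, replication_factor):
    """Ring share times replication factor, capped at 100 percent (assumed cap)."""
    return min(100.0, ring_share_percent * replication_factor)


# One of three equally loaded nodes with replication factor 3.
print(round(effective_ownership(100 / 3, 3)))  # 100
# One of four equally loaded nodes with replication factor 2.
print(round(effective_ownership(25.0, 2)))     # 50
```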
scrub [keyspace] [table]
Rebuilds SSTables on a node for the named tables and snapshots data files before rebuilding as a safety measure. If possible use upgradesstables. While scrub rebuilds SSTables, it also discards data that it deems broken and creates a snapshot, which you have to remove manually. If scrub can't validate the column value against the column definition's data type, it logs the row key and skips to the next row.
setcachecapacity key-cache-capacity row-cache-capacity
Set the global key and row cache capacities in megabytes.
setcompactionthroughput value_in_mb
Set the maximum throughput for compaction in the system in megabytes per second. To disable throttling, set to 0.
setcompactionthreshold keyspace table min_threshold max_threshold
Set minimum and maximum compaction thresholds for a table. This parameter controls how many SSTables of a similar size must be present before a minor compaction is scheduled. The max_threshold property sets an upper bound on the number of SSTables that may be compacted in a single minor compaction. Also see http://wiki.apache.org/cassandra/MemtableSSTable.
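How the two thresholds interact can be shown with a simplified model. The following Python sketch only mirrors the description above and is not Cassandra's actual compaction logic; the function name and the threshold values 4 and 32 are assumptions for illustration.

```python
def sstables_in_next_minor_compaction(similar_sized_count, min_threshold, max_threshold):
    """Return how many SSTables one minor compaction would take, or 0 if none is scheduled.

    Simplified model: a compaction is scheduled once at least min_threshold
    similar-sized SSTables exist, and takes at most max_threshold of them.
    """
    if similar_sized_count < min_threshold:
        return 0
    return min(similar_sized_count, max_threshold)


print(sstables_in_next_minor_compaction(3, 4, 32))   # 0
print(sstables_in_next_minor_compaction(6, 4, 32))   # 6
print(sstables_in_next_minor_compaction(40, 4, 32))  # 32
```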
setstreamthroughput value_in_mb
Set the maximum streaming throughput in the system in megabytes per second. To disable throttling, set to 0.
settraceprobability value
Probabilistic tracing is useful to determine the cause of intermittent query performance problems by identifying which queries are responsible. This option traces some or all statements sent to a cluster. Tracing a request usually requires at least 10 rows to be inserted.

A probability of 1.0 traces everything, whereas lesser amounts (for example, 0.10) sample only a certain percentage of statements. Take care on large and active systems, as system-wide tracing has a performance impact. Unless you are under very light load, tracing all requests (probability 1.0) will probably overwhelm your system. Start with a small fraction, for example 0.001, and increase it only if necessary.

The trace information is stored in the system_traces keyspace, which holds two tables, sessions and events. These tables can be queried to answer questions such as what the most time-consuming query has been since a trace was started. Query the parameters map and thread column in the system_traces.sessions and events tables for probabilistic tracing information.
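The effect of the probability value can be illustrated with a small simulation. This Python sketch shows probabilistic sampling in general, not Cassandra's implementation; should_trace and the statement count are assumptions for illustration.

```python
import random


def should_trace(probability, rng=random):
    """Decide whether to trace one statement, for a probability in [0.0, 1.0]."""
    # rng.random() returns a float in [0.0, 1.0), so 1.0 traces everything
    # and 0.0 traces nothing.
    return rng.random() < probability


rng = random.Random(42)  # fixed seed so the run is repeatable
statements = 100_000
traced = sum(should_trace(0.001, rng) for _ in range(statements))
print(f"traced {traced} of {statements} statements")
```

With a probability of 0.001, about one statement in a thousand is traced on average, which is why a small fraction keeps the overhead low.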

snapshot [keyspaces...] -cf [tableName] -t [snapshotName]
Takes an online snapshot of Cassandra’s data. Use the -cf option to snapshot only the specified table and the -t option to give the snapshot a name. Before taking the snapshot, the node is flushed. The results are stored in Cassandra’s data directory under the snapshots directory of each keyspace. See Install locations and http://wiki.apache.org/cassandra/Operations#Backing_up_data.
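The snapshot syntax can be assembled programmatically from its optional parts. The following is a minimal Python sketch using only the -cf and -t flags listed under Options; snapshot_argv and the example keyspace, table, and tag names are hypothetical.

```python
def snapshot_argv(host, keyspaces=(), table=None, tag=None):
    """Build: nodetool -h HOST snapshot [keyspaces...] -cf [tableName] -t [snapshotName]"""
    argv = ["nodetool", "-h", host, "snapshot", *keyspaces]
    if table is not None:
        argv += ["-cf", table]  # snapshot only this table
    if tag is not None:
        argv += ["-t", tag]     # name given to the snapshot
    return argv


# Hypothetical names, for illustration only.
print(" ".join(snapshot_argv("10.0.0.1", ["my_keyspace"], table="users", tag="before_upgrade")))
```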
status
Display cluster information:
  • Status - U (up) or D (down)

    Indicates whether the node is functioning or not.

  • State - N (normal), L (leaving), J (joining), M (moving)

    The state of the node in relation to the cluster.

  • Address

    The node's URL.

  • Load - updates every 90 seconds

    The amount of file system data under the Cassandra data directory after excluding all content in the snapshots subdirectories. Because all SSTable data files are included, any data that is not cleaned up (such as TTL-expired cells or tombstoned data) is counted.

  • Tokens

    The number of tokens set for the node.

  • Owns

    The percentage of the data owned by the node per data center times the replication factor. For example, a node can own 33% of the ring, but show 100% if the replication factor is 3.

  • Host ID

    The network ID of the node.

  • Rack

    The rack or, in the case of Amazon EC2, the availability zone of the node.
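The Status and State letters listed above can be decoded in a script. This minimal Python sketch assumes the two letters appear together as a single code, such as UN, at the start of each node's line in the status output; describe_node_code is a hypothetical helper.

```python
STATUS = {"U": "up", "D": "down"}
STATE = {"N": "normal", "L": "leaving", "J": "joining", "M": "moving"}


def describe_node_code(code):
    """Translate a two-letter code such as 'UN' into words."""
    status, state = code[0], code[1]
    return f"{STATUS[status]}, {STATE[state]}"


print(describe_node_code("UN"))  # up, normal
print(describe_node_code("DL"))  # down, leaving
```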

statusbinary
Status of native transport (binary protocol).
statusthrift
Status of the Thrift server.
stop [operation type]
Stops an operation from continuing to run. Options are COMPACTION, VALIDATION, CLEANUP, SCRUB, INDEX_BUILD. For example, this allows you to stop a compaction that has a negative impact on the performance of a node. After the compaction stops, Cassandra continues with the rest in the queue. Eventually, Cassandra restarts the compaction.
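A script that issues stop can validate the operation type first. The following minimal Python sketch checks the value against the options listed above; stop_argv and the example host are assumptions for illustration.

```python
STOPPABLE_OPERATIONS = ("COMPACTION", "VALIDATION", "CLEANUP", "SCRUB", "INDEX_BUILD")


def stop_argv(host, operation):
    """Build a 'nodetool stop' command after validating the operation type."""
    op = operation.upper()
    if op not in STOPPABLE_OPERATIONS:
        raise ValueError(f"unsupported operation type: {operation!r}")
    return ["nodetool", "-h", host, "stop", op]


# Hypothetical host, for illustration only.
print(" ".join(stop_argv("10.0.0.1", "compaction")))
```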
tpstats
Displays the number of active, pending, and completed tasks for each of the thread pools that Cassandra uses for stages of operations. A high number of pending tasks for any pool can indicate performance problems. See: http://wiki.apache.org/cassandra/Operations#Monitoring and Table 2.
upgradesstables [-a] [keyspace] [tables]

Rebuilds SSTables on a node for the named tables that are not on the current version. Use when upgrading your server or changing compression options (available from Cassandra 1.0.4 and later).

Use -a to include all SSTables, even those already on the current version.

version
Displays the Cassandra release version for the node being queried.