The dsetool

Use the dsetool utility for Cassandra File System (CFS) and Hadoop-related tasks, such as managing the job tracker, checking the CFS, and listing node subranges of data in a keyspace.

In DataStax Enterprise 4.0.2 and later, only JMX (Java Management Extensions) provides password authentication for dsetool. If JMX passwords are enabled, users must supply them to use the dsetool utility.

Usage: dsetool [-h|--host=<hostname>] [-p|--port=<#>] [-j|--jmxport=<#>] <command> <args>

This table describes the dsetool arguments:

Short form  Long form               Description
-a          --jmx-username <arg>    User name for authenticating with secure JMX
-b          --jmx-password <arg>    Password for authenticating with secure JMX
-h          --host <arg>            Node hostname or IP address
-j          --jmxport <arg>         Remote JMX agent port number
-u          --use_hadoop_config     Get Cassandra host from Hadoop configuration files
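
As an illustration of how these options combine, the following sketch assembles a dsetool command line for a node with JMX authentication enabled. The function name build_dsetool_cmd and the host address, port, user name, and password are hypothetical placeholders. The function only prints the command (a dry run) rather than executing it, and it places the options before the command, following the usage line above.

```shell
#!/bin/sh
# Dry-run helper: print a dsetool invocation built from the options in the table.
# All values passed in are hypothetical placeholders, not defaults.
build_dsetool_cmd() {
    host="$1"; jmxport="$2"; user="$3"; password="$4"; shift 4
    # Options go before the command, as in the usage line.
    echo "dsetool -h $host -j $jmxport -a $user -b $password $*"
}

build_dsetool_cmd 10.10.1.5 7199 jmxadmin secret ring
```

Printing the command first makes it easy to confirm the option order before running it against a real node.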

The dsetool commands are:

  • checkcfs - Check a single CassandraFS file or the whole CassandraFS.
  • createsystemkey <encryption option> [<encryption option ... >] [<system key name>] - Create the system key for transparent data encryption. DataStax Enterprise 4.0.4 and later.
  • inmemorystatus <keyspace> <table> - Provides the memory size, capacity, and percentage used by the table. The unit of measurement is MB. Bytes are truncated.
  • listjt - List all JobTracker nodes, grouped by their local datacenter.
  • list_subranges <keyspace> <cf-name> <keys_per_range> <start_token> <end_token> - Divide a token range for a given keyspace/table into a number of smaller subranges of approximately keys_per_range keys each. To be useful, the specified range should be contained in the target node's primary range.
  • jobtracker - Return the hostname and port of the JobTracker that is local to the datacenter from which you run the command.
  • movejt - Move the JobTracker and notify the TaskTracker nodes.
  • partitioner - Return the fully qualified class name of the IPartitioner in use by the cluster.
  • repaircfs - Repair the CFS from orphan blocks.
  • rebuild_indexes <keyspace> <table-name> <idx1,idx2,...> - Rebuild the specified secondary indexes for the given keyspace/table. Specify only the keyspace and table name to rebuild all indexes.
  • ring - List the nodes in the ring including their node type.
  • status - Same as the ring command.

Examples of using dsetool commands for managing the JobTracker are presented in "Managing the job tracker using dsetool commands."

Checking the CassandraFS using dsetool 

Use the dsetool checkcfs command to scan the CassandraFS for corrupted files. For example:
dsetool checkcfs cfs:///
Use dsetool checkcfs with a file path to get details about a particular file that has been corrupted. For example:
dsetool checkcfs /tmp/myhadoop/mapred/system/jobtracker.info

Listing subranges using dsetool

The dsetool command syntax for listing subranges of data in a keyspace is:
dsetool [-h <hostname>] list_subranges <keyspace> <table> <rows per subrange> <start token> <end token>
  • rows per subrange is the approximate number of rows per subrange.
  • start token is the start of the node's partition range.
  • end token is the end of the node's partition range.
Note: Use the output of list_subranges to run nodetool repair on a single node. The subranges in the output must be partition ranges used on that node.
Example
dsetool list_subranges Keyspace1 Standard1 10000 113427455640312821154458202477256070485 0

Output

The output lists the subranges to use as input to the nodetool repair command. For example:
Start Token                             End Token                               Estimated Size
------------------------------------------------------------------------------------------------
113427455640312821154458202477256070485 132425442795624521227151664615147681247 11264
132425442795624521227151664615147681247 151409576048389227347257997936583470460 11136
151409576048389227347257997936583470460 0                                       11264

Nodetool repair command options

You need to use the nodetool utility when working with subranges. The start partition range (-st) and end partition range (-et) options specify the portion of the node needing repair. You get the values for the start and end tokens from the output of the dsetool list_subranges command. The nodetool repair syntax for using these options is:
nodetool repair <keyspace> <table> -st <start token> -et <end token>
Example
nodetool repair Keyspace1 Standard1 -st 113427455640312821154458202477256070485 -et 132425442795624521227151664615147681247 
nodetool repair Keyspace1 Standard1 -st 132425442795624521227151664615147681247 -et 151409576048389227347257997936583470460
nodetool repair Keyspace1 Standard1 -st 151409576048389227347257997936583470460 -et 0

Each of these commands begins an anti-entropy node repair from the start partition range to the end partition range.
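
The hand-off from list_subranges to nodetool repair can be scripted. The sketch below is a minimal example, assuming the output format shown on this page (a header line, a dashed separator, then one start token, end token, and estimated size per row). The function name subranges_to_repairs is hypothetical. It prints the repair commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Convert list_subranges output (read from stdin) into nodetool repair commands.
# Assumes rows of the form: <start token> <end token> <estimated size>.
subranges_to_repairs() {
    keyspace="$1"; table="$2"
    # Keep only rows whose first two fields are integer tokens; this skips
    # the header line and the dashed separator line.
    awk -v ks="$keyspace" -v tbl="$table" '
        $1 ~ /^-?[0-9]+$/ && $2 ~ /^-?[0-9]+$/ {
            printf "nodetool repair %s %s -st %s -et %s\n", ks, tbl, $1, $2
        }'
}

# Usage with the sample output from this page:
subranges_to_repairs Keyspace1 Standard1 <<'EOF'
Start Token                             End Token                               Estimated Size
------------------------------------------------------------------------------------------------
113427455640312821154458202477256070485 132425442795624521227151664615147681247 11264
132425442795624521227151664615147681247 151409576048389227347257997936583470460 11136
151409576048389227347257997936583470460 0                                       11264
EOF
```

In practice the function would read from a pipe, for example dsetool list_subranges Keyspace1 Standard1 10000 <start token> <end token> | subranges_to_repairs Keyspace1 Standard1. Tokens are passed through as strings, so their large values are not rounded.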