dsetool utility

Use the dsetool utility for creating system keys, encrypting sensitive configuration information, and performing Cassandra File System (CFS) and Hadoop-related tasks, such as checking the CFS, and listing node subranges of data in a keyspace.

Synopsis for dsetool connection options and command arguments 

Synopsis
$ dsetool [connection_options] command command_args
Legend
Syntax conventions Description
Italics Variable value. Replace with a user-defined value.
[ ] Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.
( ) Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.
| Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.
[ -- ] Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.
JMX authentication is supported for dsetool commands. For example:
dsetool --ssl -a jmx_username -b jmx_password command
Internal authentication is supported for dsetool commands to authenticate with the user name and password of the configured Cassandra role. For example:
dsetool -l username -p password command
Note: You can provide authentication credentials in several ways; see Credentials for authentication.

To enable dsetool to use Kerberos authentication, see Enabling dsetool to use Kerberos.

This table describes the dsetool connection arguments that are supported for all dsetool commands:
Short form Long form Description
-f arg --config-file=arg Path to configuration file that stores credentials. The credentials in this configuration file override the ~/.dserc credentials.
-dfilepath --d filepath Key file output directory. Enables creating key files before DSE is installed. This option is typically used by IT automation tools like Ansible. When no directory is specified, keys are saved to the value of system_key_directory in dse.yaml.
-l arg --username=arg User name of the configured Cassandra role to authenticate with.
-p arg --password=arg Password to authenticate with the configured Cassandra role.
-a arg --jmxusername=arg User name for authenticating with secure JMX.
-b arg --jmxpassword=arg Password for authenticating with secure JMX.
-c arg --cassandra_port=arg Cassandra port number.
-h arg --host=arg Node hostname or IP address.
-j arg --jmxport=arg Remote JMX agent port number.
-s arg --port=arg Solr port number.
  --ssl Enable SSL encryption. See Setting up SSL for nodetool, dsetool, and dse advrep.
-u --use_hadoop_config Get Cassandra host from Hadoop configuration files.
This table describes optional dsetool arguments to configure SSL for native client connections:
Argument Description
--ssl Use SSL for native client connections.

--ssl is the same as --ssl=true.

--ssl-protocol=ssl_protocol ssl_protocol - the SSL protocol to use for connection to Cassandra when SSL is enabled. For example, --ssl-protocol=ssl4.

--sslauth=true|false true - use SSL client authentication.

false - do not use SSL client authentication.

--cipher-suites=ssl_cipher_suites ssl_cipher_suites - a comma-separated list of SSL cipher suites for connection to Cassandra when SSL is enabled. For example, --cipher-suites=c1,c2,c3.
--keystore-password=ssl_keystore_password ssl_keystore_password - the keystore password for connection to Cassandra when SSL client authentication is enabled.
--keystore-path=ssl_keystore_path ssl_keystore_path - the path to the keystore for connection to Cassandra when SSL client authentication is enabled. For example, --keystore-path=/path/to/ks.
--keystore-type=ssl_keystore_type ssl_keystore_type - the keystore type for connection to Cassandra when SSL client authentication is enabled. For example, --keystore-type=jks1.
--truststore-password=ssl_truststore_password ssl_truststore_password - the truststore password to use for connection to Cassandra when SSL is enabled.
--truststore-path=ssl_truststore_path ssl_truststore_path - the path to the truststore to use for connection to Cassandra when SSL is enabled. For example, --truststore-path=/path/to/ts.
--truststore-type=ssl_truststore_type ssl_truststore_type - the truststore type for connection to Cassandra when SSL is enabled. For example, --truststore-type=jks2.
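As an illustration of how the SSL options in the table combine, the following sketch assembles one invocation from them. The store paths and the choice of the status command are placeholders, not recommended values, and the command is only printed (a dry run) rather than executed. The password options are left out; add --truststore-password and --keystore-password as your setup requires.

```shell
#!/bin/sh
# Hypothetical values: replace with the paths used in your deployment.
TRUSTSTORE_PATH=/path/to/ts
KEYSTORE_PATH=/path/to/ks

# Build the argument list from the options described in the table above.
set -- --ssl=true \
       --sslauth=true \
       --truststore-path="$TRUSTSTORE_PATH" \
       --keystore-path="$KEYSTORE_PATH" \
       status

# Dry run: the command line is printed, not executed.
# To run it for real, call: dsetool "$@"
ssl_cmd=$(echo dsetool "$@")
echo "$ssl_cmd"
```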

dsetool commands  

autojt (deprecated) 
This command is deprecated. Job Trackers are managed automatically.
checkcfs cfs:///|filepath
Checks a single Cassandra File System (CFS) file or the whole CFS using these options:
  • cfs:/// - Scan the entire Cassandra File System (CFS) for corrupted files
  • filepath - Get details about a particular file that has been corrupted.
See checking the CFS using dsetool.
core_indexing_status keyspace_name.table_name [--all] [--progress] 
Retrieves the dynamic indexing status (INDEXING, FINISHED, or FAILED) of the specified core or cores in a DSE Search node, and optionally displays the percent complete and an estimated completion time in milliseconds.
dsetool -h IP_address core_indexing_status core_name|--all
where:
  • IP_address is the IP address of the host that is output by the dsetool ring command.
    If you do not specify the IP address, the default is the local node. For example:
    dsetool core_indexing_status wiki.solr
    wiki.solr: INDEXING
  • core_name is the name of the search index

    or

    --all to retrieve the dynamic indexing status of all search cores.

  • --progress to display the percent complete and an estimated completion time in milliseconds.
See Checking the indexing status using dsetool.
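A script that waits for indexing to complete needs to pick the status word out of the command output. The sketch below assumes the single-line "core_name: STATUS" form shown in the example above; the output produced with --all or --progress may differ and is not handled here. The helper name indexing_status is made up for this illustration.

```shell
#!/bin/sh
# Extract the status word from a line such as "wiki.solr: INDEXING".
indexing_status() {
    # Keep only the text after the last ": ".
    printf '%s\n' "$1" | sed 's/^.*: *//'
}

# Sample line copied from the example above. With a live node you could use:
#   line=$(dsetool core_indexing_status wiki.solr)
line='wiki.solr: INDEXING'

case "$(indexing_status "$line")" in
    INDEXING) echo "still indexing" ;;
    FINISHED) echo "indexing finished" ;;
    FAILED)   echo "indexing failed" ;;
    *)        echo "unrecognized status" ;;
esac
```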
create_core keyspace.table 
Supports Cassandra password authentication with [-l username -p password].
Creates the Solr core and optionally generates resources automatically. This command preserves the case of keyspace and table names. You must use the correct case for the keyspace and table names. The Solr core is created with the specified keyspace and table name and following options:
Option Settings Default Description
schema= filepath n/a Path of the schema file. Cannot be specified when generateResources=true.
solrconfig= filepath n/a Path of the solrconfig.xml file. Cannot be specified when generateResources=true.
distributed= true or false true
  • true - distributes and applies the operation to all nodes in the local DC.
  • false - applies the operation only to the node it was sent to. Works only when recovery=true.
deleteAll= true or false false
  • true - deletes the already existing index before reindexing; search results will return either no or partial data while the index is rebuilding.
  • false - does not delete the existing index, causing the reindex to happen in-place; search results will return partially incorrect results while the index is updating. Keep the current index (accepting reads) while you build the new one, then swap over to the new index after it's ready.
recovery= true or false false
  • true - if the Solr core is unable to load due to corrupted index, recovers it by deleting and recreating the index. The deleteAll flag is set based on the recovery flag unless deleteAll is specifically set.
  • false - no recovery.
reindex= true or false false Observed only on auto-core creation (generateResources=true); otherwise, always reindexes on core creation. Reindex works on a datacenter (DC) level. Run this command once per Solr-enabled DC.
  • true - reindexes the data.
  • false - does not reindex the data.
generateResources= true or false false Cannot be used with schema= and solrconfig=.
coreOptions= n/a n/a Path to the YAML-formatted options file when generateResources=true. See Customizing automatic resource generation.
coreOptionsInline= options n/a Accepts the same options that can be specified in the YAML file that is specified with coreOptions. See Customizing automatic resource generation. Use the following syntax: key1:value1#key2:value2#.
Examples:
coreOptionsInline=include_columns:id,name,body#rt:true
You must remove any spaces when using double quotes:
coreOptionsInline="generate_docvalues_for_fields:'*'"

To reindex, specify reindex=true, deleteAll=false. Keeps the current index (accepting reads) while the new index is building.

To do a full reindex, specify reindex=true, deleteAll=true. Delete the index first, then build a new one.
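The create_core option table rules out combining generateResources=true with schema= or solrconfig=. The sketch below encodes that one rule in a hypothetical helper, create_core_cmd, which prints the command line instead of running it. The keyspace, table, and file path are placeholders.

```shell
#!/bin/sh
# Print a create_core command line, rejecting the option combination that the
# table above describes as invalid.
create_core_cmd() {
    core=$1
    shift
    generate=false
    has_files=false
    for opt in "$@"; do
        case "$opt" in
            generateResources=true) generate=true ;;
            schema=*|solrconfig=*)  has_files=true ;;
        esac
    done
    if [ "$generate" = true ] && [ "$has_files" = true ]; then
        echo "error: generateResources=true cannot be combined with schema= or solrconfig=" >&2
        return 1
    fi
    # Dry run: print the command; nothing is executed.
    echo "dsetool create_core $core $*"
}

# Auto-generate resources and reindex in place (hypothetical core name).
create_core_cmd demo.health_data generateResources=true reindex=true deleteAll=false

# Invalid combination: the helper refuses to build the command.
create_core_cmd demo.health_data generateResources=true schema=/path/to/schema.xml ||
    echo "rejected"
```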

createsystemkey cipher_algorithm[/mode/padding] secret_key_strength [file] [-k=kmip_groupname [-t kmip_template] [-n namespace]] 
Creates a global encryption key, called a system key, for SSTable encryption using the following options:
  • cipher_algorithm[/mode/padding] secret_key_strength - When Java Cryptography Extension (JCE) is installed, the cipher_algorithm options and acceptable secret_key_strength values for the algorithms are:
    cipher_algorithm secret_key_strength
    AES/CBC/PKCS5Padding 128, 192, or 256
    AES/ECB/PKCS5Padding 128, 192, or 256
    DES/CBC/PKCS5Padding 56
    DESede/CBC/PKCS5Padding 112 or 168
    Blowfish/CBC/PKCS5Padding 32-448
    RC2/CBC/PKCS5Padding 40-128

    Key strength is not required for HMAC algorithms.

  • file - Specify the name of the system key file to create. If no name is specified, the default system key file name is system_key. The default system key file name is not configurable.
  • -k=kmip_groupname - Use the KMIP connection information to create a remote system key for the KMIP key server group that is defined in the kmip_hosts section in the dse.yaml file. The following options are available only for the specified KMIP key server group:
    • -t kmip_template - Uses the specified KMIP server key template.
    • -n namespace - Specifies the namespace to create the system key with.
See Encryption/compression options and algorithm sub-options and Encrypting sensitive property values.
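Checking the key strength before calling createsystemkey can catch mismatches early. The sketch below transcribes the cipher table above into a hypothetical helper, valid_key_strength. For Blowfish and RC2 it checks only the documented range, and it does not model HMAC algorithms. The final command is printed as a dry run.

```shell
#!/bin/sh
# Return 0 when the key strength is listed as acceptable for the cipher
# in the table above, 1 otherwise.
valid_key_strength() {
    cipher=$1
    bits=$2
    # Reject empty or non-numeric strengths up front.
    case "$bits" in
        ''|*[!0-9]*) return 1 ;;
    esac
    case "$cipher" in
        AES/CBC/PKCS5Padding|AES/ECB/PKCS5Padding)
            case "$bits" in 128|192|256) return 0 ;; esac ;;
        DES/CBC/PKCS5Padding)
            [ "$bits" -eq 56 ] && return 0 ;;
        DESede/CBC/PKCS5Padding)
            case "$bits" in 112|168) return 0 ;; esac ;;
        Blowfish/CBC/PKCS5Padding)
            [ "$bits" -ge 32 ] && [ "$bits" -le 448 ] && return 0 ;;
        RC2/CBC/PKCS5Padding)
            [ "$bits" -ge 40 ] && [ "$bits" -le 128 ] && return 0 ;;
    esac
    return 1
}

if valid_key_strength AES/ECB/PKCS5Padding 128; then
    # Dry run: print the command rather than creating a key.
    echo dsetool createsystemkey AES/ECB/PKCS5Padding 128 system_key
fi
```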
encryptconfigvalue 
Encrypts sensitive configuration information. This command takes no arguments and prompts for the value to encrypt.
get_core_config keyspace.table [current=true|false] 
Outputs the latest uploaded solrconfig.xml resource file for the specified core. If current is set to true, returns the current live solrconfig.
get_core_schema keyspace.table [current=true|false] 
Supports Cassandra password authentication with [-l username -p password].
Outputs the latest uploaded Solr schema. If current is set to true, returns the current live schema.
infer_solr_schema keyspace.table [coreOptions path_to_options_file] 
Supports Cassandra password authentication with [-l username -p password].
Automatically infers and proposes a schema that is based on the specified keyspace and table. Solr cores are not modified. The schema is inferred using the options in the coreOptions YAML file or the options given on the command line with coreOptionsInline:
Option Settings Default Description
coreOptions n/a n/a Path to the YAML-formatted options file when generateResources=true. See Customizing automatic resource generation.
coreOptionsInline= options n/a Accepts the same options that can be specified in the YAML file that is specified with coreOptions. Use the following syntax: key1:value1#key2:value2#. See Customizing automatic resource generation.
Examples:
coreOptionsInline=include_columns:id,name,body#rt:true
You must remove any spaces when using double quotes:
coreOptionsInline="generate_docvalues_for_fields:'*'"
inmemorystatus [keyspace.table] 
Provides the memory size, capacity, and percentage for this node and the amount of memory each table is using. To get information about a single table, specify the keyspace and table. The unit of measurement is MB. Bytes are truncated.
jobtracker (deprecated) 
This command is deprecated. Use dse client-tool hadoop job-tracker-address instead.
list_index_files keyspace.table [--index directory] 
Lists all DSE Search index files for the specified Solr core on the local node with the following option:
  • --index directory - specifies the data directory that contains the Solr index files. When not specified, the default directory is inferred from the Solr core name.
The index file is encrypted only when the backing CQL table is encrypted and the Solr core uses EncryptedFSDirectoryFactory; otherwise, the index file is not encrypted.
list_subranges keyspace table keys_per_range start_token end_token 
Divides a token range for a given keyspace and table into smaller subranges of approximately keys_per_range keys each. To be useful, the specified range should be contained within the target node's primary range. See Listing sub-ranges using dsetool.
listjt 
Lists all Job Tracker nodes grouped by the datacenter that is local to them.
managekmip subcommand kmip_groupname [command_arguments] 
Verifies communication with the specified KMIP key server and lists the KMIP encryption keys on that key server. The following subcommands are supported:
list kmip_groupname [namespace=key_namespace] 
Lists the encryption keys on the specified KMIP host. You can optionally specify the namespace.
expirekey kmip_groupname key_id [datetime] 
Specifies an expiration date and time for the specified encryption key. After the specified datetime, no new data is encrypted with the key, but existing data can still be decrypted with it. If no datetime is specified, the key expires immediately.

Format of datetime is YYYY-MM-DD HH:MM:SS:T. For example, use 2016-04-13 20:05:00:0 to expire the encryption key at 8:05 p.m. on 13 April 2016.

revoke kmip_groupname key_id 
Revokes the specified encryption key. After a key is revoked, the key cannot be used to decrypt data.
destroy kmip_groupname key_id 
Destroys the specified encryption key. After a key is destroyed, the key cannot be used to decrypt data.
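The datetime argument to expirekey uses an unusual format, so a quick check before calling the command can help. The sketch below tests a string against the YYYY-MM-DD HH:MM:SS:T pattern described above. It assumes the trailing T field is one or more digits, as in the example value 2016-04-13 20:05:00:0; that is an inference from the example, not a documented rule. The helper name is_expire_datetime is made up for this illustration.

```shell
#!/bin/sh
# Return 0 when the argument looks like "YYYY-MM-DD HH:MM:SS:T".
# Assumption: the trailing T field is one or more digits.
is_expire_datetime() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}:[0-9]+$'
}

# Example value taken from the expirekey description above.
when='2016-04-13 20:05:00:0'

if is_expire_datetime "$when"; then
    echo "datetime format looks valid: $when"
else
    echo "unexpected datetime format: $when" >&2
fi
```

The check covers only the shape of the string; it does not verify that the date itself exists on the calendar.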
movejt (deprecated) 
Use setjt and setrjt instead.
node_health -h IP_address [-all] 
Retrieves a dynamic score between 0 and 1 that describes the health of a DataStax Enterprise node. If you do not specify the IP address, the default is the local DataStax Enterprise node. A higher score indicates better node health. Nodes that have a large number of dropped mutations and nodes that are just started have a lower health score.
dsetool -h IP_address node_health 
where IP_address is the IP address that is output by the dsetool ring command.
For example:
dsetool -h 200.192.10.11 node_health 
Node Health: 0.7
Specify -all to retrieve the node health scores for all nodes:
dsetool node_health -all
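A monitoring script might compare the health score with a threshold. The sketch below parses the single-node "Node Health: 0.7" output shown above and uses awk for the floating-point comparison. The 0.5 threshold and the helper names are arbitrary choices for illustration, and the -all output format is not handled.

```shell
#!/bin/sh
# Extract the numeric score from output such as "Node Health: 0.7".
health_score() {
    printf '%s\n' "$1" | awk -F': *' '/^Node Health:/ { print $2 }'
}

# Return 0 when the score is at or above the threshold.
# awk performs the floating-point comparison, which sh cannot do itself.
is_healthy() {
    awk -v s="$1" -v t="$2" 'BEGIN { exit !(s + 0 >= t + 0) }'
}

# Sample output copied from the example above. With a live node you could use:
#   out=$(dsetool -h 200.192.10.11 node_health)
out='Node Health: 0.7'
score=$(health_score "$out")

# The 0.5 threshold is an arbitrary illustration, not a documented value.
if is_healthy "$score" 0.5; then
    echo "node health $score is acceptable"
else
    echo "node health $score is below threshold"
fi
```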
partitioner 
Returns the fully qualified classname of the IPartitioner that is in use by the cluster.
perf subcommand 
Modifies performance object settings as described in the subcommand section.
read_resource keyspace.table name=resfilename 
Supports Cassandra password authentication with [-l username -p password].
Reads the specified DSE Search resource file.
rebuild_indexes keyspace.table [idx1,idx2,...] 
Rebuilds the specified secondary indexes for the specified keyspace and table. To rebuild all indexes, do not specify indexes and use only rebuild_indexes keyspace.table.
reload_core keyspace.table [option ...] 
Supports Cassandra password authentication with [-l username -p password].
Reloads a Solr core with the specified keyspace and table name. This command preserves the case of keyspace and table names. You must use the correct case for the keyspace and table names. Reloads a core with the following options:
Option Settings Default Description
schema= filepath n/a Path of the schema file
solrconfig= filepath n/a Path of the solrconfig.xml file
distributed= true or false true
  • true - distributes and applies the reload operation to all nodes in the local DC.
  • false - applies the reload operation only to the node it was sent to.
reindex= true or false false Works on a datacenter level. Run once per Solr-enabled datacenter.
  • true - reindexes the data.
  • false - does not reindex the data.
deleteAll= true or false false
  • true - deletes the already existing index before reindexing; search results will return either no or partial data while the index is rebuilding.
  • false - does not delete the existing index, causing the reindex to happen in-place; search results will return partially incorrect results while the index is updating.
Note: To reload the core and prevent reindexing, accept the default values reindex=false and deleteAll=false.

During reindexing, a series of criteria routes sub-queries to the nodes most capable of handling them. See Shard routing for distributed queries.
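The reindex and deleteAll options combine into three practical behaviors: reload without reindexing, reindex in place, and delete the index then rebuild it. The sketch below maps each to its option values. The mode names (none, in-place, full), the helper name, and the core name are labels invented for this example, and the command is printed as a dry run.

```shell
#!/bin/sh
# Map a reindexing intent to the option values described above.
reload_options() {
    case "$1" in
        none)     echo "reindex=false deleteAll=false" ;;
        in-place) echo "reindex=true deleteAll=false" ;;
        full)     echo "reindex=true deleteAll=true" ;;
        *)        echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Hypothetical core name. Dry run: the command is printed, not executed.
# The substitution is left unquoted on purpose so that it splits into
# two separate option arguments.
echo dsetool reload_core demo.health_data $(reload_options in-place)
```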

repaircfs [file_system] 
Removes orphan blocks from the file system, where file_system specifies a CFS file system. Scans the sblocks table and deletes the data blocks that are not referenced from the inode table. Orphan blocks cannot be distinguished from a block that is being written, so do not use this command when data is being written to CFS.
The default value is cfs:/.
Restriction:
  • If replication factor (RF) is a value other than 1, you must run nodetool repair before you run dsetool repaircfs.
  • Do not run analytics jobs while dsetool repaircfs is running.
ring 
Lists the nodes in the ring sorted by token, including their node type. Datacenters with heterogeneous workloads are noted.
sparkmaster (deprecated) 
Use the dse client-tool spark command instead.
sparkworker restart 
Manually restarts the Spark Worker on the selected node, without restarting the node.
status 
Lists the nodes in the ring, including the node type and node health. When the datacenter workloads are the same type, the workload type is listed. When the datacenter workloads are heterogeneous, the workload type is mixed. The output is similar to that of the ring command.
stop_core_reindex keyspace.table [timeout] 
Stops the search core reindexing for the specified keyspace and table on the node where the command is run. Optionally, specify a timeout in minutes so that the core waits to stop reindexing until the specified timeout is reached, then gracefully stops the indexing. The default timeout is 1 minute.
tieredtablestats [keyspace.table] [-v] 
Outputs tiered storage information, including SSTables, tiers, timestamps, and sizes. Provides information on every table that uses tiered storage.
dsetool tieredtablestats foo.bar -v
  • -v Output statistics for each SSTable, in addition to the tier summaries.
  • keyspace.table Outputs statistics only for the specified keyspace and table.
tsreload client|server 
Reloads the node's truststores without a restart. Specify client or server:
  • client - Reloads the client truststore that is used for encrypted client-to-node communications.
  • server - Reloads the server truststore that is used for encrypted node-to-node SSL (internode) communications.
unload_core keyspace.table [option ...] 
Supports Cassandra password authentication with [-l username -p password].
Removes a Solr core with the specified keyspace and table name. This command preserves the case of keyspace and table names. You must use the correct case for the keyspace and table names. Unloads a core with the following options:
Option Settings Default Description
deleteDataDir= true or false false If true, deletes index data and any other artifacts in the solr.data directory. It does not delete Cassandra data.
deleteResources= true or false false If true, deletes the resources associated with the Solr core. For example, solrconfig.xml and schema.xml.
distributed= true or false true If true, deletes Solr data and resources across the cluster, depending on the values of deleteDataDir and deleteResources.

The removal of the Solr secondary index from the Cassandra table schema is always distributed.

upgrade_index_files keyspace.table -h IP_address [-c cassandra_port] [--backup] [--workspace directory] [--index directory] 
The node that contains the encryption configuration must be running, and the local node must be offline. The user that runs this command must have read and write permissions to the directory that contains the index files. Upgrades all DSE Search index files for the specified Solr core on the local node with the following options:
  • -h IP_address - Required. Node hostname or IP address of the remote node that contains the encryption configuration that is used for index encryption. The remote node must be running.
  • -c cassandra_port - The Cassandra port on the remote node that contains the encryption configuration.
  • --backup - Preserves the index files from the current index as a backup after successful upgrade. When not specified, index files from the current index are deleted.
  • --workspace directory - Specifies the workspace directory for the upgrade process. The upgraded index is created in this directory. When --backup is specified, the preserved index file backup is moved here. When not specified, the default directory is the same directory that contains the Solr index files.
  • --index directory - Specifies the data directory that contains the Solr index files. When not specified, the default directory is inferred from the Solr core name.
Index files are encrypted only when the backing CQL table is encrypted and the Solr core uses EncryptedFSDirectoryFactory; otherwise, the index files are not encrypted.
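Because upgrade_index_files needs read and write access to the index directory, a pre-flight check can save a failed run. The sketch below tests those permissions and then prints the command as a dry run. The index directory, the remote node address, the core name, and the helper name are all hypothetical placeholders.

```shell
#!/bin/sh
# Return 0 when the current user can read and write the given directory.
index_dir_writable() {
    [ -d "$1" ] && [ -r "$1" ] && [ -w "$1" ]
}

# Hypothetical index directory and remote node address.
INDEX_DIR=/path/to/solr.data/demo.health_data/index
REMOTE_NODE=10.0.0.5

if index_dir_writable "$INDEX_DIR"; then
    # Dry run: print the command rather than upgrading the index.
    echo dsetool upgrade_index_files demo.health_data -h "$REMOTE_NODE" --backup --index "$INDEX_DIR"
else
    echo "cannot read and write $INDEX_DIR" >&2
fi
```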
write_resource keyspace.table name=uploaded_name file=path_to_file_to_upload 
Supports Cassandra password authentication with [-l username -p password].
Uploads the specified DSE Search resource file.
$ dsetool write_resource keyspace.table name=ResourceFile.xml file=schemaFile.xml
You can specify a file name for the uploaded resource file and the path to the resource file to upload. For example, stopwords.txt.
$ dsetool write_resource keyspace.table name=ResourceFile.xml file=myPath1/myPath2/schemaFile.xml
Resource files are stored in the Cassandra database. To view the resources, use dsetool read_resource or use the Solr Admin interface. You can configure the maximum resource file size or disable resource upload with the Solr resource upload limit option in dse.yaml.

Checking the CFS using dsetool 

Use the dsetool checkcfs command to scan the Cassandra File System (CFS) for corrupted files. For example:
dsetool checkcfs cfs:///
Use the dsetool checkcfs command to get details about a particular file that has been corrupted. For example:
dsetool checkcfs /tmp/myhadoop/mapred/system/jobtracker.info

Listing sub-ranges using dsetool 

The dsetool command syntax for listing subranges of data in a keyspace is:
dsetool [-h hostname ] list_subranges keyspace table rows_per_subrange start_token end_token
  • rows_per_subrange - The approximate number of rows per subrange.
  • start_token - The start of the node's partition range.
  • end_token - The end of the node's partition range.
Note: Run nodetool repair on a single node using the output of list_subranges. The output must be partition ranges that are used on that node.
Example
dsetool list_subranges Keyspace1 Standard1 10000 113427455640312821154458202477256070485 0

Output

The output lists the subranges to use as input to the nodetool repair command. For example:
Start Token                             End Token                               Estimated Size
------------------------------------------------------------------------------------------------
113427455640312821154458202477256070485 132425442795624521227151664615147681247 11264
132425442795624521227151664615147681247 151409576048389227347257997936583470460 11136
151409576048389227347257997936583470460 0                                       11264

Nodetool repair command options

You must use the nodetool utility to work with sub-ranges. The start partition range (-st) and end partition range (-et) options specify the portion of the node that needs repair. Take the values for the start and end tokens from the output of the dsetool list_subranges command. The nodetool repair syntax for using these options is:
nodetool repair keyspace table -st start_token -et end_token
Example
$ nodetool repair Keyspace1 Standard1 -st 113427455640312821154458202477256070485 -et 132425442795624521227151664615147681247
$ nodetool repair Keyspace1 Standard1 -st 132425442795624521227151664615147681247 -et 151409576048389227347257997936583470460
$ nodetool repair Keyspace1 Standard1 -st 151409576048389227347257997936583470460 -et 0

Each of these commands begins an anti-entropy node repair from the start partition range to the end partition range.
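Turning the list_subranges output into repair commands by hand is error-prone with tokens this long. The sketch below generates one nodetool repair line per subrange using awk. It assumes the three-column output layout shown above and treats any row whose first two fields are integers as a token range. The optional leading minus sign is allowed because some partitioners produce negative tokens. The helper name subranges_to_repairs is made up for this illustration.

```shell
#!/bin/sh
# Convert list_subranges output into one nodetool repair command per subrange.
# Rows whose first two fields are both integers are treated as token ranges;
# the header and separator lines are skipped. Tokens are handled as strings,
# so large values are not rounded.
subranges_to_repairs() {
    keyspace=$1
    table=$2
    awk -v ks="$keyspace" -v tbl="$table" '
        $1 ~ /^-?[0-9]+$/ && $2 ~ /^-?[0-9]+$/ {
            printf "nodetool repair %s %s -st %s -et %s\n", ks, tbl, $1, $2
        }'
}

# Sample output copied from the example above. With a live node you could
# pipe the output of dsetool list_subranges into the function instead.
subranges_to_repairs Keyspace1 Standard1 <<'EOF'
Start Token                             End Token                               Estimated Size
------------------------------------------------------------------------------------------------
113427455640312821154458202477256070485 132425442795624521227151664615147681247 11264
132425442795624521227151664615147681247 151409576048389227347257997936583470460 11136
151409576048389227347257997936583470460 0                                       11264
EOF
```

The function only prints the commands, so the generated lines can be reviewed before they are run.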

Performance object subcommands 

The dsetool perf command subcommands temporarily change the running parameters:

Subcommand name Possible values Description
clustersummary - enable|disable Toggle cluster summary statistics. See Collecting database summary diagnostics.
cqlslowlog - threshold Set the CQL slow log threshold, either as a percentile of the actual request times or as an absolute time:
  • [0,1] is a percentile threshold
  • >1 is an absolute threshold in milliseconds
  • 1.0 logs no queries
  • 0.999 logs the slowest 0.1% of queries
  • 0.95 logs the slowest 5% of queries
  • 0.5 logs the slowest 50% of queries
  • 0.0 logs all queries
cqlslowlog - enable|disable Toggle the CQL slow log. See Collecting slow queries.
cqlsysteminfo - enable|disable Toggle CQL system information statistics. See Collecting system level diagnostics.
dbsummary - enable|disable Toggle database summary statistics. See Collecting database summary diagnostics.
histograms - enable|disable Toggle table histograms. See Collecting table histogram diagnostics.
resourcelatencytracking - enable|disable Toggle resource latency tracking. See Collecting system level diagnostics.
solrcachestats - enable|disable Toggle Solr cache statistics.
solrindexingerrorlog - enable|disable Toggle Solr indexing error log.
solrindexstats - enable|disable Toggle Solr index statistics.
solrlatencysnapshots - enable|disable Toggle Solr latency snapshots.
solrrequesthandlerstats - enable|disable Toggle Solr request handler statistics.
solrslowlog - threshold Set the Solr slow log threshold in milliseconds.
solrslowlog - enable|disable Toggle the Solr slow sub-query log. See Collecting slow Solr queries.
solrupdatehandlerstats - enable|disable Toggle Solr update handler statistics.
userlatencytracking - enable|disable Toggle user latency tracking. See Collecting user activity diagnostics.
Note: Enabling or disabling with the performance object subcommands does not persist between reboots and is useful only for short-term diagnostics. To make these settings permanent, change the dse.yaml options; see CQL Performance Service options.
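The cqlslowlog threshold is read in two ways: a value in [0,1] is a percentile, and a value greater than 1 is an absolute time in milliseconds. The sketch below classifies a candidate value before it is applied. The helper name threshold_kind and the "invalid" result for negative or non-numeric input are choices made for this example, not dsetool behavior.

```shell
#!/bin/sh
# Classify a cqlslowlog threshold: values in [0,1] are percentiles,
# values greater than 1 are absolute thresholds in milliseconds.
threshold_kind() {
    awk -v t="$1" 'BEGIN {
        if (t !~ /^[0-9]+(\.[0-9]+)?$/) print "invalid"
        else if (t + 0 <= 1)            print "percentile"
        else                            print "milliseconds"
    }'
}

threshold_kind 0.95   # prints "percentile"
threshold_kind 200    # prints "milliseconds"
```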