nodetool rebuild

Rebuilds data by streaming from other nodes.

This command operates on multiple nodes in a cluster and streams data only from a single source replica when rebuilding a token range. Use this command to add a new datacenter to an existing cluster.

Note: If nodetool rebuild is interrupted before completion, restart it by re-entering the command. The process resumes from the point at which it was interrupted.
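
Because re-entering the command resumes the rebuild, the restart can be scripted. The following is a minimal sketch, not part of nodetool: the retry helper is hypothetical, and it assumes that nodetool exits with a non-zero status when the rebuild is interrupted.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
# Hypothetical helper: re-runs the command until it succeeds
# or MAX_ATTEMPTS is reached.
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    # Run the command; stop as soon as it succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage (assumption: an interrupted rebuild returns a non-zero exit status):
# retry 3 60 nodetool rebuild -- DC1
```

The usage line is commented out so that the helper can be sourced without contacting a node.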

Synopsis

nodetool [connection_options] rebuild 
[-c num_connections] [-dc src_dc_names] [-ks keyspace_name]
[-m mode] [-s source_ip_address] 
[-ts (start_token_1,end_token_1],(start_token_2,end_token_2], ...]
[-x exclude_source_IPs] [-xdc exclude_dc_names] [--] [src-dc-name]
Table 1. Legend
Syntax conventions Description
UPPERCASE Literal keyword.
Lowercase Not literal.
Italics Variable value. Replace with a valid option or user-defined value.
[ ] Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.
( ) Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.
| Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.
... Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.
'Literal string' Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case.
{ key:value } Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value.
<datatype1,datatype2> Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.
cql_statement; End CQL statement. A semicolon ( ; ) terminates all CQL statements.
[ -- ] Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.
' <schema> ... </schema> ' Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.
@xml_entity='xml_entity_type' Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files.

Definition

The short form and long form parameters are comma-separated.

Connection options

-h, --host hostname
The hostname or IP address of a remote node or nodes. When omitted, the default is the local machine.
-p, --port jmx_port
The JMX port number.
-pw, --password jmxpassword
The JMX password for authenticating with secure JMX. If a password is not provided, you are prompted to enter one.
-pwf, --password-file jmx_password_filepath
The filepath to the file that stores JMX authentication credentials.
-u, --username jmx_username
The user name for authenticating with secure JMX.

Command arguments

--
Separates an option from an argument that could be mistaken for an option.
-c, --connections-per-host num_connections
Maximum number of connections per host for streaming. Overrides value of streaming_connections_per_host in cassandra.yaml.
-dc src_dc_names, --dcs src_dc_names
Comma-separated list of datacenters from which to stream. Datacenter names are case sensitive. For example, dc-a,dc-b. To include a rack name, separate datacenter and rack name with a colon (:). For example, dc-a:rack1,dc-a:rack2.
-ks, --keyspace keyspace_name, ...
Comma-separated list of one or more keyspaces. List only the keyspaces to include in the rebuild.
Tip: Do not include keyspaces that are not replicated across datacenters (for example, dsefs keyspaces, and keyspaces with local strategy).
-m, --mode mode
Rebuild mode:
  • normal - conventional behavior, streams only ranges that are not already locally available
  • refetch - resets locally available ranges, streams all ranges but leaves current data untouched
  • reset - resets the locally available ranges, removes all locally present data (like a TRUNCATE), streams all ranges
  • reset-no-snapshot - (like reset) resets the locally available ranges, removes all locally present data (like a TRUNCATE), streams all ranges but prevents a snapshot even if auto_snapshot is enabled
When not specified, the default is normal.
-s, --sources source_ip_address
Comma-separated list of IP addresses from which to stream.
src-dc-name
Name of the datacenter from which to select sources for streaming. When not set, the default is to pick any datacenter.
-ts, --tokens (start_token_1,end_token_1], (start_token_2,end_token_2], ...
Comma-separated list of token ranges, in this format: (start_token_1,end_token_1],(start_token_2,end_token_2],(start_token_n,end_token_n]
-x, --exclude-sources exclude_source_IPs
Comma-separated list of IP addresses to exclude from streaming.
-xdc, --exclude-dcs exclude_dc_names
Comma-separated list of datacenters to exclude from streaming. For example, dc-a,dc-b. To include a rack name in the list, separate datacenter and rack name with a colon (:). For example, dc-a:rack1,dc-a:rack2.
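
The -ts value uses parentheses and square brackets, which most shells treat specially, so the argument needs to be quoted. As a sketch only, the hypothetical helper below (token_ranges is not part of nodetool) assembles the string from start/end pairs in the format described above; the token values are placeholders.

```shell
# token_ranges START END [START END ...]
# Hypothetical helper: prints the ranges as (start,end],(start,end],...
token_ranges() {
  out=""
  while [ "$#" -ge 2 ]; do
    # Append the next range, adding a comma only between ranges.
    out="${out:+$out,}($1,$2]"
    shift 2
  done
  printf '%s\n' "$out"
}

# Usage (placeholder tokens; quote the result so the shell
# does not interpret the parentheses):
# nodetool rebuild -ts "$(token_ranges 0 100 200 300)" -- DC1
```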

Examples

Rebuild the local node by streaming data from the DC2 datacenter

nodetool rebuild DC2
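
A few more invocations that combine the options described above can be sketched as follows. The keyspace names, datacenter and rack names, and IP address are placeholders, and nt is a hypothetical wrapper that only prints each command unless RUN=1 is set, so the sketch can be tried without a cluster.

```shell
# Hypothetical dry-run wrapper: prints the command unless RUN=1 is set.
nt() {
  if [ "${RUN:-0}" = "1" ]; then
    nodetool "$@"
  else
    echo "nodetool $*"
  fi
}

# Rebuild only two keyspaces (placeholder names), streaming from DC1.
nt rebuild -ks ks_orders,ks_users -- DC1

# Stream all ranges again without removing local data,
# taking sources from two racks of dc-a.
nt rebuild -m refetch -dc dc-a:rack1,dc-a:rack2

# Stream from DC1, skip one source node, and allow up to
# 4 streaming connections per host.
nt rebuild -c 4 -x 10.0.0.5 -- DC1
```

Setting RUN=1 in the environment makes the wrapper call nodetool instead of printing.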