nodetool repair

The repair command repairs one or more nodes in a cluster and provides options for restricting the repair to a set of nodes; see Repairing nodes. Performing an anti-entropy node repair on a regular basis is important, especially in an environment that deletes data frequently.

Ensure that all involved replicas are up and accessible before running a repair. If repair encounters a down replica, an error occurs and the process halts. Re-run repair after bringing all replicas back online.
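The check described above can be scripted. The following is a minimal sketch, not an official procedure. It assumes that each node line in the nodetool status output starts with a two-letter code whose first letter is U (up) or D (down). The function names count_down_nodes and repair_if_all_up are hypothetical. Note that nodetool status reports on the whole cluster, so this check is stricter than "all involved replicas".

```shell
#!/bin/sh
# Count nodes that `nodetool status` reports as down.
# Assumption: node lines start with a two-letter code; the first letter is
# U (up) or D (down), the second is the state (N, L, J, or M).
count_down_nodes() {
    # Reads `nodetool status` output on stdin and prints the number of down
    # nodes. grep -c exits non-zero when the count is 0, hence `|| true`.
    grep -c '^D[NLJM] ' || true
}

# Hypothetical wrapper: start the repair only when no node is down.
# Any arguments are passed through to `nodetool repair`.
repair_if_all_up() {
    down=$(nodetool status | count_down_nodes)
    if [ "$down" -gt 0 ]; then
        echo "Skipping repair: $down node(s) down" >&2
        return 1
    fi
    nodetool repair "$@"
}
```

For example, repair_if_all_up -pr would run a primary-range repair only when every node is reported as up.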

Control how the repair runs:

  • Number of nodes performing a repair:

    • Parallel runs repair on all nodes with the same replica data at the same time. (Default behavior in DataStax Enterprise (DSE) 5.0 and later.)

    • Sequential (-seq, --sequential) runs repair on one node after another. (Default behavior in DSE 4.8 and earlier.)

    • Datacenter parallel (-dcpar, --dc-parallel) combines sequential and parallel by simultaneously running a sequential repair in all datacenters; a single node in each datacenter runs repair, one after another until the repair is complete.

  • Amount of data that is repaired:

    • Full repair (default) compares all replicas of the data stored on the node where the command runs and updates each replica to the newest version. Does not mark the data as repaired or unrepaired. Default for DSE 5.1.3 and later. To switch to incremental repairs, see Migrating to incremental repairs.

    • Full repair with partitioner range (-pr, --partitioner-range) repairs only the primary replicas of the data stored on the node where the command runs. Recommended for routine maintenance.

    • Incremental repair (-inc) splits the data into repaired and unrepaired SSTables and repairs only the unrepaired data. Marks the data as repaired or unrepaired. Default behavior in DSE 5.1.0-5.1.2.

      Due to CASSANDRA-9143, DataStax recommends upgrading to DSE 5.1.3 (or later) and switching to full repairs; see Migrating to full repairs.

DSE changed the default behavior for nodetool repair as follows:

  • DSE 5.1.3 and later run full repair by default. To perform an incremental repair on a node running DSE 5.1.3 or later, specify:

    nodetool repair -inc

  • DSE 5.1.0-5.1.2 run incremental repair by default. To perform a full repair on a node running DSE 5.1.0-5.1.2, specify:

    nodetool repair -full
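When one script has to work across versions, the version-dependent default above can be encoded in a small helper. This is only a sketch: full_repair_command is a hypothetical name, the version string is assumed to be in plain numeric major.minor.patch form, and releases earlier than 5.1.0 are rejected because this page does not state their full or incremental default.

```shell
#!/bin/sh
# Print the nodetool command that results in a FULL repair for a DSE version.
# Assumption: the version is numeric "major.minor.patch" (for example 5.1.2).
# The body runs in a subshell, so the IFS change does not leak to the caller.
full_repair_command() (
    version=$1
    IFS=.
    set -- $version          # split the version on dots into $1 $2 $3
    major=${1:-0}
    minor=${2:-0}
    patch=${3:-0}

    if [ "$major" -eq 5 ] && [ "$minor" -eq 1 ] && [ "$patch" -le 2 ]; then
        # DSE 5.1.0-5.1.2: incremental is the default, so request -full.
        echo "nodetool repair -full"
    elif [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 1 ]; }; then
        # DSE 5.1.3 and later: full repair is already the default.
        echo "nodetool repair"
    else
        echo "No full/incremental default documented here for DSE $version" >&2
        return 1
    fi
)
```

The function only prints the command; running it is left to the caller.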

Before using the node repair tool, be sure to have an understanding of how node repair works.

Synopsis

nodetool [connection_options] repair
     [(-dc specific_dc | --in-dc specific_dc)...]
     [(-dcpar | --dc-parallel)]
     [(-et end_token | --end-token end_token)]
     [(-full | --full)]
     [(-hosts specific_host | --in-hosts specific_host)...]
     [-inc]
     [(-j job_threads | --job-threads job_threads)]
     [(-local | --in-local-dc)]
     [(-pr | --partitioner-range)]
     [(-pl | --pull)]
     [(-seq | --sequential)]
     [(-st start_token | --start-token start_token)]
     [(-tr | --trace)]
     [--]
     [keyspace tables...]

Tarball and Installer No-Services path:

<installation_location>/resources/cassandra/bin

Connection options

Connection options specify how to connect and authenticate for all nodetool commands:

Short   Long              Description
-h      --host            Hostname or IP address.
-p      --port            Port number.
-pwf    --password-file   Password file path.
-pw     --password        Password.
-u      --username        Username.
--                        Separates command parameters from a list of options.

  • If a username and password for RMI authentication are set explicitly in the cassandra-env.sh file for the host, then you must specify credentials.

  • The repair and rebuild commands can affect multiple nodes in the cluster.

  • Most nodetool commands operate on a single node in the cluster if -h is not used to identify one or more other nodes. If the node from which you issue the command is the intended target, you do not need the -h option to identify the target; otherwise, for remote invocation, identify the target node, or nodes, using -h.

Example:

nodetool -u username -pw password describering demo_keyspace

Repair options

Repair-specific options. See Manual repair: Anti-entropy repair for guidance on setting some of the following options.

-dc dc_name, --in-dc dc_name

Repair nodes in the named datacenter (dc_name). Datacenter names are case sensitive.

-dcpar, --dc-parallel

Runs a datacenter parallel repair, which combines sequential and parallel by simultaneously running a sequential repair in all datacenters; a single node in each datacenter runs repair, one after another until the repair is complete.

-et end_token, --end-token end_token

Specify the token (end_token) at which the repair range ends. Repairs the range that starts at the start token (see -st) and ends with this token. Use -hosts to specify neighbor nodes.

-full, --full

Runs a full repair, which compares all replicas of the data stored on the node where the command runs and updates each replica to the newest version. Does not mark the data as repaired or unrepaired. Default for DSE 5.1.3 and later. To switch to incremental repairs, see Migrating to incremental repairs.

This option is only available in DSE 5.1.0-5.1.2, which run incremental repairs by default. DataStax recommends upgrading to DSE 5.1.3 or later.

-hosts specific_host, --in-hosts specific_host

Repair specific hosts.

-inc

(Not recommended.) Runs an incremental repair, which persists already repaired data and calculates only the Merkle trees for SSTables that have not been repaired. Requires repairs to be run frequently (daily). Before running an incremental repair for the first time, perform the migration steps. Never run an incremental repair to restore a node or after bringing a downed node back online.

This parameter is only available in DSE 5.1.3 and later. DataStax recommends migrating to full repairs; see Changing repair strategies.

-j job_threads, --job-threads job_threads

Number of threads (job_threads) to run repair jobs. Usually the number of tables to repair concurrently. Be aware that increasing this setting puts more load on repairing nodes. (Default: 1, maximum: 4)

-local, --in-local-dc

Repair only nodes in the same datacenter as the node where the command runs.

-pr, --partitioner-range

Repair only the primary partition ranges of the node. To avoid re-repairing each range RF times, DataStax recommends using this option during routine maintenance (nodetool repair -pr) or using the OpsCenter Repair Service.

Not recommended with incremental repair, because incremental repair marks data as repaired during each step and does not re-repair the same data multiple times.
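Because -pr repairs only the primary ranges of the node it runs on, covering the whole cluster means running it on every node. The sketch below shows one way to do that, one node at a time, using the -h connection option. The function name rolling_pr_repair and the NODETOOL variable are hypothetical. By default the loop only prints the commands, so it can be reviewed as a dry run.

```shell
#!/bin/sh
# Run a primary-range repair on each listed node, one node after another.
# NODETOOL defaults to a dry run that prints each command; set
# NODETOOL=nodetool in the environment to execute the repairs for real.
NODETOOL=${NODETOOL:-echo nodetool}

rolling_pr_repair() {
    for host in "$@"; do
        # -h selects the target node; -pr limits the repair to that node's
        # primary ranges. Stop at the first failure so it can be investigated.
        $NODETOOL -h "$host" repair -pr || return 1
    done
}
```

For example, rolling_pr_repair 10.2.2.20 10.2.2.21 prints (or runs) one repair command per address, in the order given.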

-pl, --pull

Performs a one-way repair where data is streamed from a remote node to this node.

-seq, --sequential

Runs a sequential repair, which runs repair on one node after another. (Default behavior in DSE 4.8 and earlier.)

-st start_token, --start-token start_token

Specify the token (start_token) at which the repair range starts.

-tr, --trace

Trace the repair. Traces are logged to system_traces.events.

keyspace_name table_list

Name of the keyspace, followed by a space-separated list of tables.

--

Separates an option from an argument that could be mistaken for an option.

Example

All nodetool repair arguments are optional.

To do a sequential repair of all keyspaces on the current node:

nodetool repair -seq

To do a partitioner range repair of the bad partition on the current node using the good partitions on 10.2.2.20 or 10.2.2.21:

nodetool repair -pr -hosts 10.2.2.20 -hosts 10.2.2.21

For a start-point-to-end-point repair of all nodes between two nodes on the ring:

nodetool repair -st -9223372036854775808 -et -3074457345618258603
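A token range can also be repaired in smaller pieces by issuing several -st/-et repairs back to back. The following sketch prints one command per subrange. The name subrange_repair_commands is hypothetical, and the sketch assumes that the start token is lower than the end token (the range does not wrap around the ring), that the number of parts is at least 1 and no larger than the width of the range, and that the shell uses 64-bit signed arithmetic.

```shell
#!/bin/sh
# Split the token range (start, end] into `parts` contiguous subranges and
# print one `nodetool repair -st ... -et ...` command per subrange.
# Assumptions: start < end, 1 <= parts <= (end - start), 64-bit arithmetic.
subrange_repair_commands() {
    start=$1
    end=$2
    parts=$3
    step=$(( (end - start) / parts ))   # integer width of each subrange
    lo=$start
    i=1
    while [ "$i" -le "$parts" ]; do
        if [ "$i" -eq "$parts" ]; then
            hi=$end                     # last subrange absorbs the remainder
        else
            hi=$(( lo + step ))
        fi
        echo "nodetool repair -st $lo -et $hi"
        lo=$hi                          # next subrange starts where this ended
        i=$(( i + 1 ))
    done
}
```

For example, subrange_repair_commands 0 100 4 prints four commands covering 0 to 25, 25 to 50, 50 to 75, and 75 to 100. The commands are only printed, so they can be reviewed before being run.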

To restrict the repair to the local datacenter, use the -dc option followed by the name of the datacenter. Issue the command from a node in the datacenter you want to repair; issuing it from a node in a different datacenter returns an error. Do not use -pr with this option to repair only a local datacenter.

nodetool repair -dc DC1

Results in output:

[2014-07-24 21:59:55,326] Nothing to repair for keyspace 'system'
[2014-07-24 21:59:55,617] Starting repair command #2, repairing 490 ranges
  for keyspace system_traces (seq=true, full=true)
[2014-07-24 22:23:14,299] Repair session 323b9490-137e-11e4-88e3-c972e09793ca
  for range (820981369067266915,822627736366088177] finished
[2014-07-24 22:23:14,320] Repair session 38496a61-137e-11e4-88e3-c972e09793ca
  for range (2506042417712465541,2515941262699962473] finished
. . .

An inspection of system.log shows repair taking place only on IP addresses in DC1:

. . .
INFO  [AntiEntropyStage:1] 2014-07-24 22:23:10,708 RepairSession.java:171
  - [repair #16499ef0-1381-11e4-88e3-c972e09793ca] Received merkle tree
  for sessions from /192.168.2.101
INFO  [RepairJobTask:1] 2014-07-24 22:23:10,740 RepairJob.java:145
  - [repair #16499ef0-1381-11e4-88e3-c972e09793ca] requesting merkle trees
  for events (to [/192.168.2.103, /192.168.2.101])
. . .
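The same inspection can be done mechanically by listing the distinct addresses that appear in repair log entries. This is a rough sketch: repair_peers is a hypothetical name, and it assumes that each log entry sits on a single line in system.log (the excerpt above is wrapped for display), that repair entries contain the text "repair #", and that peer addresses are written as a slash followed by an IPv4 address.

```shell
#!/bin/sh
# Print the distinct IPv4 addresses that appear as "/a.b.c.d" in repair log
# entries read from stdin, one address per line, sorted.
repair_peers() {
    grep 'repair #' |                                  # keep repair entries only
        grep -oE '/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' |   # extract "/a.b.c.d"
        tr -d '/' |                                    # drop the leading slash
        sort -u                                        # de-duplicate
}
```

For example, repair_peers < system.log lists the addresses involved, which can then be compared with the nodes that belong to DC1.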
