nodetool repair

Repairs one or more tables.

Synopsis

nodetool [(-h <host> | --host <host>)] [(-p <port> | --port <port>)]
     [(-pw <password> | --password <password>)]
     [(-pwf <passwordFilePath> | --password-file <passwordFilePath>)]
     [(-u <username> | --username <username>)] repair
     [(-dc <specific_dc> | --in-dc <specific_dc>)...]
     [(-dcpar | --dc-parallel)] [(-et <end_token> | --end-token <end_token>)]
     [(-hosts <specific_host> | --in-hosts <specific_host>)...]
     [(-inc | --incremental)] [(-local | --in-local-dc)]
     [(-par | --parallel)] [(-pr | --partitioner-range)]
     [(-st <start_token> | --start-token <start_token>)] [--] [<keyspace>
     <cfnames>...]
Table 1. Options
Short  Long             Description
-h     --host           Hostname or IP address
-p     --port           Port number
-pwf   --password-file  Password file path
-pw    --password       Password
-u     --username       User name
       --               Separates an option from an argument that could be mistaken for an option.
Other options are:
  • -dc, or --in-dc, followed by dc_name restricts repair to nodes in the named datacenter, which must be the local datacenter.
  • -dcpar, or --dc-parallel repairs datacenters in parallel.
  • -et or --end-token repairs a subset of the node's data ending with this token. (Also specify --start-token.)
  • -hosts host_name or --in-hosts host_name repairs specific hosts.
  • -inc or --incremental performs an incremental repair.
  • -local or --in-local-dc repairs nodes only in the local datacenter.
  • -par or --parallel performs repairs in parallel.
  • -pr or --partitioner-range repairs only the first range returned by the partitioner.
  • -st or --start-token repairs a subset of the node's data starting with this token. (Also specify --end-token.)
  • keyspace is the keyspace name. The default is all keyspaces.
  • table (cfnames in the synopsis) is a table name or a space-delimited list of table names. If no tables are listed, the tool operates on all tables.

Description

Performing an anti-entropy node repair on a regular basis is important, especially in an environment that deletes data frequently. The repair command repairs one or more nodes in a cluster, and provides options for restricting repair to a set of nodes. Anti-entropy node repair performs the following tasks:
  • Ensures that all data on a replica is consistent.
  • Repairs inconsistencies on a node that has been down.

By default, Cassandra 2.1 does a full, sequential repair.

Using options

Use options to do these other types of repair:
  • Use the -hosts option to list the good nodes to use for repairing the bad nodes. Use -h to name the bad nodes.

  • Use the -inc option for an incremental repair. An incremental repair persists already repaired SSTables and calculates the Merkle trees only for unrepaired SSTables. If you run repairs frequently, this repair process is more performant than the other types of repair even as datasets grow. Before doing an incremental repair for the first time, perform the Incremental repair migration steps.

  • Use the -par option for a parallel repair. Unlike sequential repair, parallel repair constructs the Merkle trees for all nodes at the same time. Therefore, no snapshots are required (or generated). Use a parallel repair to complete the repair quickly or when you have operational downtime that allows the resources to be completely consumed during the repair.

  • Use the -pr option to perform non-incremental partitioner range repairs across an entire cluster. Do not use this option for incremental repairs.
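The option rules above can be sketched as a small wrapper. This is a minimal, hypothetical helper (build_repair_cmd is not part of nodetool) that assembles a repair command line and refuses the two combinations this page warns against: -pr with -inc, and -pr with -dc. It only prints the command and never runs nodetool. The keyspace name my_keyspace is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper, not part of nodetool: build a `nodetool repair`
# command line and reject option combinations this page advises against.
# It prints the command instead of running it.
build_repair_cmd() {
    inc=0; pr=0; dc=0
    for opt in "$@"; do
        case "$opt" in
            -inc|--incremental)      inc=1 ;;
            -pr|--partitioner-range) pr=1 ;;
            -dc|--in-dc)             dc=1 ;;
        esac
    done
    if [ "$pr" -eq 1 ] && [ "$inc" -eq 1 ]; then
        echo "error: do not combine -pr with -inc" >&2
        return 1
    fi
    if [ "$pr" -eq 1 ] && [ "$dc" -eq 1 ]; then
        echo "error: do not combine -pr with -dc" >&2
        return 1
    fi
    if [ "$#" -eq 0 ]; then
        echo "nodetool repair"
    else
        echo "nodetool repair $*"
    fi
}

# Prints the command for an incremental, parallel repair of a placeholder keyspace.
build_repair_cmd -par -inc my_keyspace
```

Printing rather than executing keeps the sketch safe to try on any machine; the printed line can be reviewed and then run by hand.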

Examples

All nodetool repair arguments are optional. The following examples show several types of repair:

# A full repair on a specific keyspace
$ nodetool repair <keyspace_name>

# An incremental, parallel repair of all keyspaces on the current node
$ nodetool repair -par -inc

# A partitioner range repair of the bad partition on the current node using the good partitions on 10.2.2.20 or 10.2.2.21
$ nodetool repair -pr -hosts 10.2.2.20,10.2.2.21

# A start-point-to-end-point repair of all nodes between two nodes on the ring
$ nodetool repair -st -9223372036854775808 -et -3074457345618258603
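A start-token/end-token repair can also be run in smaller pieces. The following is a minimal sketch, not an official tool: the hypothetical function subrange_repair_cmds splits a token range into n contiguous subranges and prints one -st/-et command per piece. It assumes start < end, 1 <= n <= (end - start), and that the shell's integer arithmetic can hold the tokens and their difference. The demonstration call uses small numbers rather than real tokens.

```shell
#!/bin/sh
# Hypothetical helper: split the token range from start to end into n
# contiguous subranges and print one repair command per subrange.
# Assumes start < end, 1 <= n <= (end - start), and that (end - start)
# fits in the shell's signed integer type.
subrange_repair_cmds() {
    start=$1; end=$2; n=$3
    step=$(( (end - start) / n ))
    i=0
    lo=$start
    while [ "$i" -lt "$n" ]; do
        i=$(( i + 1 ))
        if [ "$i" -eq "$n" ]; then
            hi=$end                 # last piece absorbs the rounding remainder
        else
            hi=$(( lo + step ))
        fi
        echo "nodetool repair -st $lo -et $hi"
        lo=$hi
    done
}

# Demonstration with small numbers: prints four commands covering 0 to 100.
subrange_repair_cmds 0 100 4
```

Each printed command ends where the next one starts, so the pieces together cover the original range without gaps.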

To restrict the repair to the local datacenter, use the -dc option followed by the name of the datacenter. Issue the command from a node in the datacenter you specify in the command. If you issue the command from a different datacenter, Cassandra returns an error. Do not use -dc and -pr together to repair only a local datacenter.

$ nodetool repair -dc DC1
[2014-07-24 21:59:55,326] Nothing to repair for keyspace 'system'
[2014-07-24 21:59:55,617] Starting repair command #2, repairing 490 ranges 
  for keyspace system_traces (seq=true, full=true)
[2014-07-24 22:23:14,299] Repair session 323b9490-137e-11e4-88e3-c972e09793ca 
  for range (820981369067266915,822627736366088177] finished
[2014-07-24 22:23:14,320] Repair session 38496a61-137e-11e4-88e3-c972e09793ca 
  for range (2506042417712465541,2515941262699962473] finished
. . .

An inspection of the system.log shows repair taking place only on IP addresses in DC1.

. . .
INFO  [AntiEntropyStage:1] 2014-07-24 22:23:10,708 RepairSession.java:171 
  - [repair #16499ef0-1381-11e4-88e3-c972e09793ca] Received merkle tree 
  for sessions from /192.168.2.101
INFO  [RepairJobTask:1] 2014-07-24 22:23:10,740 RepairJob.java:145 
  - [repair #16499ef0-1381-11e4-88e3-c972e09793ca] requesting merkle trees 
  for events (to [/192.168.2.103, /192.168.2.101])
. . .
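
The log inspection described above can be partly automated. The following is a minimal sketch under stated assumptions: the hypothetical function repair_peers reads system.log text on standard input and lists the distinct IP addresses that appear in Merkle tree messages. It assumes each log entry occupies a single line in the file (the excerpt above is wrapped for display) and that peers appear as /a.b.c.d, as in the excerpt. The log path in the usage comment is an assumption that depends on the installation.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct peer IP addresses named in
# Merkle tree log messages, one per line, sorted.
# Assumes one log entry per line and peers written as /a.b.c.d.
repair_peers() {
    grep -i 'merkle tree' \
        | grep -oE '/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' \
        | tr -d '/' \
        | sort -u
}

# Usage (the path is an assumption; adjust it for your installation):
#   repair_peers < /var/log/cassandra/system.log
```

Comparing the resulting list with the addresses of the nodes in DC1 gives a quick check that no node outside that datacenter took part in the repair.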