nodetool repair

Repairs tables on one or more nodes in a cluster when all involved replicas are up and accessible.

All nodetool repair command options are optional. When optional command arguments are not specified, the defaults are:

  • Full repair runs on all keyspaces and all tables.

  • Repair runs in parallel on all nodes with the same replica data at the same time.

  • The number of job threads is 1.

  • No tracing.

  • No validation.

Tables with NodeSync enabled will be skipped for repair operations run against all or specific keyspaces. For individual tables, running the repair command will be rejected when NodeSync is enabled.

If repair encounters a down replica, an error occurs and the repair process halts. After bringing all replicas online, re-run nodetool repair.
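
Because repair halts on a down replica, a pre-check before starting can save a failed run. The following is a minimal sketch, not part of nodetool: `count_down_nodes` is a hypothetical helper, and it assumes that each node line in `nodetool status` output begins with a two-letter state code in which a leading `D` means the node is down.

```shell
# Hypothetical helper: count nodes reported as down in `nodetool status` output.
# Assumes node lines start with a two-letter state such as UN (up/normal) or
# DN (down/normal). Reads the status text from stdin and prints the count.
count_down_nodes() {
  grep -c '^D[NLJM]' || true   # grep exits non-zero when the count is 0
}

# Usage sketch (commented out because it needs a running cluster):
#   if [ "$(nodetool status | count_down_nodes)" -eq 0 ]; then
#     nodetool repair
#   fi
```

The helper only inspects text, so it can be tried against saved status output before being wired into a repair script.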

Synopsis

nodetool [<connection_options>] repair
[-dcpar | -seq]
[-full | -inc]
[-hosts <ip_address>[,<ip_address>, ...]]
[-local | -dc <datacenter_name>[,<datacenter_name>,...]]
[-pl] [-pr] [-prv]
[-pull -hosts <local_ip_address> [<remote_ip_address>]]
[-j <job_threads>]
[-st <start_token> -et <end_token>]
[-tr] [--]
[<keyspace_name> <table_name> [<table_name> ...]]

Syntax conventions and descriptions

UPPERCASE

Literal keyword.

Lowercase

Not literal.

<Italics>

Variable value. Replace with a valid option or user-defined value.

[ ]

Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.

( )

Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.

|

Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.

...

Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.

'<Literal string>'

Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case.

{ <key>:<value> }

Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value.

<<datatype1>,<datatype2>>

Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.

cql_statement;

End CQL statement. A semicolon ( ; ) terminates all CQL statements.

[ -- ]

Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.

' <<schema> ... </schema> >'

Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.

@<xml_entity>='<xml_entity_type>'

Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files.

Definition

The short- and long-form options are comma-separated.

Connection options

-h, --host hostname

The hostname or IP address of a remote node or nodes. When omitted, the default is the local machine.

-p, --port jmx_port

The JMX port number.

-pw, --password jmxpassword

The JMX password for authenticating with secure JMX. If a password is not provided, you are prompted to enter one.

-pwf, --password-file jmx_password_filepath

The filepath to the file that stores JMX authentication credentials.

-u, --username jmx_username

The username for authenticating with secure JMX.
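
As a sketch of how the connection options combine with the repair command, the helper below prints a command line that places the connection options before `repair`, following the synopsis. The function name `remote_repair_cmd`, the host address, the username, and the password file path are all placeholders chosen for illustration; the port 7199 is assumed here as the usual JMX port.

```shell
# Hypothetical helper: print a nodetool repair command that targets a remote node.
# Arguments: host, JMX port, JMX username, password file, then any repair options.
# All values used below are placeholders, not values taken from this page.
remote_repair_cmd() {
  host=$1; port=$2; user=$3; pwfile=$4
  shift 4
  printf 'nodetool -h %s -p %s -u %s -pwf %s repair' "$host" "$port" "$user" "$pwfile"
  for arg in "$@"; do printf ' %s' "$arg"; done
  printf '\n'
}

# Dry run: prints the command instead of executing it.
remote_repair_cmd 10.0.0.5 7199 jmx_user /etc/dse/jmx.password -pr
```

Printing the command rather than running it makes it easy to review the option order before executing it against a cluster.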

Command arguments

--

Separates an option from an argument that could be mistaken for an option.

-dc datacenter_name, --in-dc datacenter_name

Comma-separated list of datacenters to limit repairs to. Datacenter names are case sensitive. Decreases network traffic while repairing more nodes than the local option. When this option is not specified, repair is run cluster-wide on all nodes that contain replicas.
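
The comma-separated form of the datacenter list can be sketched as follows. `dc_repair_cmd` is a hypothetical helper and `DC2` is a made-up datacenter name; the names are passed through unchanged because they are case sensitive.

```shell
# Hypothetical helper: join datacenter names with commas for the -dc option
# and print the resulting command. Names are not altered (case sensitive).
dc_repair_cmd() {
  list=$1
  shift
  for dc in "$@"; do list="$list,$dc"; done
  printf 'nodetool repair -dc %s\n' "$list"
}

# Dry run: prints the command instead of executing it.
dc_repair_cmd DC1 DC2
```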

-dcpar, --dc-parallel

Run repairs on all nodes with the same replica data at the same time, recommended for repairs across datacenters. A single node in each datacenter runs repair, one after another until the repair is complete. This option combines sequential and parallel repair by simultaneously running a sequential repair in all datacenters. Use with the -local option only when the datacenter nodes have all the data for all ranges.

-et, --end-token end_token

The token at which the range ends. Requires start token (-st).

-force, --force

Filter out down endpoints.

-full, --full

Issue a full repair of all data ranges on the node where the command is issued and stream data to all nodes that have replicas for any of the token ranges held by the node where the command is run. A full repair keeps unrepaired and repaired data together. A full repair of all SSTables on a node takes a lot of time and is resource-intensive. Full repair is the default repair strategy.

-hosts, --in-hosts host_name

Repair specific hosts.

-inc, --inc

Issue an incremental repair.

Incremental repair splits the data into repaired and unrepaired SSTables and marks the repaired data with a RepairedAt timestamp. Incremental repair consumes less time and resources because it skips SSTables that are already marked as repaired. However, because unrepaired and repaired data is split, all subsequent repairs must also be run in incremental mode.

-j, --job-threads num_threads

Number of threads to run repair jobs. Usually this means number of tables to repair concurrently. Default: 1. Max: 4.

Increasing job threads puts more load on repairing nodes.
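
A script that accepts a thread count from a caller may want to keep it inside the documented range of 1 to 4. The sketch below assumes the input is an integer; `clamp_job_threads` is a hypothetical helper, not a nodetool feature.

```shell
# Hypothetical helper: keep a requested -j value within the documented 1..4 range.
# Assumes the argument is an integer.
clamp_job_threads() {
  n=$1
  if [ "$n" -lt 1 ]; then n=1; fi
  if [ "$n" -gt 4 ]; then n=4; fi
  printf '%s\n' "$n"
}

# Dry run: a request for 8 threads is reduced to the maximum of 4.
printf 'nodetool repair -j %s\n' "$(clamp_job_threads 8)"
```
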

keyspace_name <table_name> [<table_name> ...]

The keyspace to repair followed by one or many tables. If not specified, all tables will be repaired.
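
To illustrate how the keyspace and table arguments follow the `--` separator, the sketch below prints a repair command for a keyspace and an optional list of tables. `table_repair_cmd` is a hypothetical helper, and `cyclist_name` is a made-up table name used only for illustration.

```shell
# Hypothetical helper: print a repair command for one keyspace and optional tables.
# The "--" keeps keyspace and table names from being read as options.
table_repair_cmd() {
  printf 'nodetool repair --'
  for name in "$@"; do printf ' %s' "$name"; done
  printf '\n'
}

# Dry run: prints the command instead of executing it.
table_repair_cmd cycling cyclist_name
```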

-local, --in-local-dc

Repair only against nodes in the same datacenter.

-pl, --pull

Runs a one-way repair directly from another node that has a replica in the same token range. This option minimizes performance impact when cross-datacenter repairs are required.

-pr, --partitioner-range

Repair only the first range returned by the partitioner.

-prv, --preview

Determines the ranges and the amount of data that would be streamed, but does not perform the repair.

-seq, --sequential

Perform a sequential repair operation. If not specified, then run parallel repair operations.

Parallel repair operations cause a significant impact on performance.

-st, --start-token start_token

The token at which the range starts. Requires end token (-et).
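
Because -st and -et must be given together, one way to use them is to divide a token range into consecutive subranges and repair each in turn. The following is a rough sketch under stated assumptions: `subrange_repair_cmds` is a hypothetical helper, the start token is less than the end token, and the difference between them fits in a signed 64-bit integer. It only prints commands; it does not run them.

```shell
# Hypothetical helper: split the token range (start, end] into n consecutive
# subranges and print one repair command per subrange.
# Assumes start < end and that end - start fits in a signed 64-bit integer.
subrange_repair_cmds() {
  start=$1; end=$2; n=$3
  step=$(( (end - start) / n ))
  i=0
  lo=$start
  while [ "$i" -lt "$n" ]; do
    i=$(( i + 1 ))
    if [ "$i" -eq "$n" ]; then
      hi=$end                      # last subrange absorbs the rounding remainder
    else
      hi=$(( start + i * step ))
    fi
    printf 'nodetool repair -st %s -et %s\n' "$lo" "$hi"
    lo=$hi
  done
}

# Dry run with small illustrative tokens: prints three commands.
subrange_repair_cmds 0 900 3
```

Each subrange ends where the next one starts, so together the printed commands cover the original range exactly once.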

table_name

One or more table names, separated by a space.

-tr, --trace

Trace the repair. Traces are logged to system_traces.events.

-vd, --validate

Checks that repaired data is in sync between nodes.

Out-of-sync repaired data indicates that a full repair should be run.

Examples

Run partitioner range repair

nodetool repair -pr

Repair the token range between a start token and an end token on the ring

nodetool repair -st -9223372036854775808 -et -3074457345618258603

Restrict repair to the DC1 datacenter

nodetool repair -dc DC1

Results

Output when a repair runs:

[2019-12-18 18:40:16,449] Starting repair command #1 (d59dbd60-21c5-11ea-8e0e-cf718f0bee2a), repairing keyspace system_traces with repair options (parallelism: sequential, primary range: false, incremental: false, job threads: 1, ColumnFamilies: {}, dataCenters: {}, hosts: {}, previewKind: NONE, # of ranges: 2, pull repair: false, force repair: false)
[2019-12-18 18:40:16,560] Repair session d5a49b30-21c5-11ea-8e0e-cf718f0bee2a for range [(-3074457345618258603,3074457345618258602]] finished (progress: 16%)
[2019-12-18 18:40:16,601] Repair session d5a69700-21c5-11ea-8e0e-cf718f0bee2a for range [(3074457345618258602,-9223372036854775808]] finished (progress: 33%)
[2019-12-18 18:40:16,603] Repair completed successfully
[2019-12-18 18:40:16,607] Repair command #1 finished in 0 seconds
...
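
A script can look for the completion message from the sample output above to decide whether a run finished. This is a minimal sketch: `repair_succeeded` is a hypothetical helper, and it assumes the message text matches the sample shown on this page, which may differ between versions.

```shell
# Hypothetical helper: succeed only if repair output read from stdin contains
# the completion message shown in the sample output.
repair_succeeded() {
  grep -q 'Repair completed successfully'
}

# Usage sketch (commented out because it needs a running cluster):
#   if nodetool repair -pr | repair_succeeded; then
#     echo "repair finished"
#   fi
```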

Output for keyspaces that don’t require repair:

...
[2019-12-18 18:40:16,855] Replication factor is 1. No repair is needed for keyspace 'cycling'

The system.log shows repair runs only on IP addresses in DC1:

. . .
INFO  [AntiEntropyStage:1] 2014-07-24 22:23:10,708 RepairSession.java:171 - [repair #16499ef0-1381-11e4-88e3-c972e09793ca] Received merkle tree for sessions from /192.168.2.101
INFO  [RepairJobTask:1] 2014-07-24 22:23:10,740 RepairJob.java:145 - [repair #16499ef0-1381-11e4-88e3-c972e09793ca] requesting merkle trees for events (to [/192.168.2.103, /192.168.2.101])
. . .
