Fast repair

Repair subranges of data in a cluster instead of running a nodetool repair operation on entire ranges.

Repairing subranges of data in a cluster is faster than running a nodetool repair operation on entire ranges because all the data replicated during a repair has to be re-indexed. Repairing one subrange at a time replicates less data per operation, so less data has to be re-indexed.

To repair a subrange

Perform these steps as a rolling repair of the cluster, one node at a time.

  1. Run the dsetool list_subranges command, passing the keyspace, the table, the approximate number of rows per subrange, the start token of the node's partition range, and the end token of the node's partition range.
    dsetool list_subranges my_keyspace my_table 10000 113427455640312821154458202477256070485 0
    The output lists the subranges.
    Start Token                             End Token                               Estimated Size
    113427455640312821154458202477256070485 132425442795624521227151664615147681247 11264
    132425442795624521227151664615147681247 151409576048389227347257997936583470460 11136
    151409576048389227347257997936583470460 0                                       11264
  2. Run the nodetool repair command once for each subrange in the output of the previous step, passing the subrange's start token with -st and its end token with -et.
    nodetool repair my_keyspace my_table -st 113427455640312821154458202477256070485 \
      -et 132425442795624521227151664615147681247
    nodetool repair my_keyspace my_table -st 132425442795624521227151664615147681247 \
      -et 151409576048389227347257997936583470460
    nodetool repair my_keyspace my_table -st 151409576048389227347257997936583470460 \
      -et 0

    Each command runs the anti-entropy node repair from its start token to its end token. Together, the commands cover the partition range from start to end.
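
Copying tokens by hand from one command to the other is error-prone, so the two steps can be joined with a small shell helper. The following is a minimal sketch, not part of DSE. It assumes the list_subranges output has the three-column layout shown above, and subrange_repair_cmds is a hypothetical function name. It only prints the nodetool repair commands so they can be reviewed before anything runs.

```shell
#!/bin/sh
# Sketch: turn `dsetool list_subranges` output into `nodetool repair` commands.
# Reads the subrange table on stdin and prints one command per subrange.
# subrange_repair_cmds is a hypothetical helper name, not a DSE tool.
subrange_repair_cmds() {
    keyspace=$1
    table=$2
    # Keep only rows whose first two fields are integer tokens. This skips
    # the "Start Token / End Token / Estimated Size" header line. The
    # optional minus sign is an assumption, to allow for signed tokens.
    awk -v ks="$keyspace" -v tbl="$table" '
        $1 ~ /^-?[0-9]+$/ && $2 ~ /^-?[0-9]+$/ {
            printf "nodetool repair %s %s -st %s -et %s\n", ks, tbl, $1, $2
        }'
}
```

To use it, pipe the output of step 1 into the function, for example `dsetool list_subranges my_keyspace my_table 10000 113427455640312821154458202477256070485 0 | subrange_repair_cmds my_keyspace my_table`. Check the printed commands, then run them on the node one at a time.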