Tuning Repair Service for multi-datacenter environments

When running the Repair Service on a multi-datacenter cluster, consider two factors: the total number of repair tasks and over-streaming.

Reduce the number of repair tasks

A single repair task involves at least six network requests between any two peers, so reducing the total number of repair tasks can drastically reduce network overhead and the time to complete a full Repair Service cycle. The number of repair tasks depends on how many partitions are targeted for each subrange: the more partitions per subrange, the larger each subrange and the fewer subranges in total. The tokenranges_partitions property sets the targeted partition count.
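As a rough illustration of this relationship, the following sketch estimates how many subranges a table would be split into for different targets. The function name, the assumption that partitions are spread evenly across subranges, and the partition count are all hypothetical; the Repair Service's actual subrange calculation may differ.

```python
def estimate_subranges(total_partitions: int, tokenranges_partitions: int) -> int:
    """Rough estimate of the subrange count if each subrange holds about
    `tokenranges_partitions` partitions (assumes an even distribution)."""
    if total_partitions < 0 or tokenranges_partitions <= 0:
        raise ValueError("partition counts must be positive")
    # Integer ceiling division; at least one subrange is always needed.
    return max(1, (total_partitions + tokenranges_partitions - 1) // tokenranges_partitions)


# Hypothetical table holding 500 million partitions.
TOTAL_PARTITIONS = 500_000_000

for target in (262_144, 524_288, 1_048_576):
    subranges = estimate_subranges(TOTAL_PARTITIONS, target)
    print(f"tokenranges_partitions={target:>9}: ~{subranges} subranges")
```

Under these assumptions, each doubling of the target roughly halves the estimated number of subranges.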

Avoid over-streaming

Over-streaming occurs when a repaired subrange contains more partitions than a Merkle tree at its maximum depth can represent. This happens when tokenranges_partitions is set too high.
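The condition can be expressed as simple arithmetic. The sketch below assumes that a Merkle tree of depth d can distinguish at most 2^d partitions, and it takes the maximum depth of 20 from the system.log message quoted in the guidelines. The function names are hypothetical and this is not the Repair Service's own implementation.

```python
MAX_MERKLE_TREE_DEPTH = 20  # maximum depth reported in the system.log message


def required_depth(partitions: int) -> int:
    """Smallest depth d such that 2**d >= partitions."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    # For n >= 1, (n - 1).bit_length() equals ceil(log2(n)).
    return (partitions - 1).bit_length()


def would_over_stream(partitions: int) -> bool:
    """True if the subrange needs a deeper tree than the maximum allows."""
    return required_depth(partitions) > MAX_MERKLE_TREE_DEPTH


print(required_depth(1_048_576), would_over_stream(1_048_576))
print(required_depth(1_048_577), would_over_stream(1_048_577))
```

With these assumptions, a subrange of exactly 1048576 partitions fits within the maximum depth, while any larger subrange does not.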

Guidelines for tuning

  • Never set tokenranges_partitions higher than the default of 1048576, which corresponds to the maximum Merkle tree depth of 20 (2^20 = 1048576).

  • Test the tuning on the cluster prior to production. Look for the total number of repair tasks, average repair task time, and impact on cluster performance.

  • If individual repair tasks take longer than 20-30 minutes and a full Repair Service cycle completes within gc_grace_seconds, halve tokenranges_partitions and re-test.

  • To check for over-streaming, ensure the following line does not exist in system.log:

    Range X with Y partitions require a merkle tree with depth Z but the maximum allowed depth for this range is 20.

    X, Y, and Z are variables.
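The over-streaming check can be automated with a short script. The following is a minimal sketch that searches log lines for the message shown above; it assumes that Y and Z are logged as plain integers, and the function names are hypothetical.

```python
import re

# Matches the over-streaming message; X is captured loosely because its
# exact format (a token range) is not specified here.
OVER_STREAMING_PATTERN = re.compile(
    r"Range (?P<range>.+?) with (?P<partitions>\d+) partitions "
    r"require a merkle tree with depth (?P<depth>\d+) "
    r"but the maximum allowed depth for this range is (?P<max_depth>\d+)"
)


def find_over_streaming(lines):
    """Yield (line_number, partitions, depth) for each matching log line."""
    for number, line in enumerate(lines, start=1):
        match = OVER_STREAMING_PATTERN.search(line)
        if match:
            yield number, int(match["partitions"]), int(match["depth"])


def scan_log(path):
    """Return all over-streaming matches found in the log file at `path`."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(find_over_streaming(handle))
```

For example, calling scan_log with the path to a node's system.log returns an empty list when no over-streaming message is present. The log location depends on the installation, so pass the path that applies to your nodes.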


© 2024 DataStax | Privacy policy | Terms of use
