Node repair

Node repair is the process that the database uses to make sure data in every replica is consistent with other replicas. Several different types of repair exist.

Over time, data in a replica can become inconsistent with other replicas due to the distributed nature of the database. Node repair corrects these inconsistencies so that all nodes have the same, most up-to-date data. Node repair is an important part of regular maintenance for every DataStax Enterprise (DSE) cluster. Use database settings or the DSE tools to configure each type of repair.

DataStax Enterprise provides the following repair processes. Each entry summarizes what the repair type does and when it applies.

  • NodeSync: Continuous background repair

    The NodeSync service has low overhead, provides consistent performance, and virtually eliminates the manual effort of running repair operations in a DataStax Enterprise cluster. When the nodesync table property is enabled, the service continuously verifies data consistency across replicas in the background. NodeSync is enabled on a per-table basis.

  • Hinted Handoff

    If a node becomes unable to receive a particular write, the write’s coordinator node preserves the data to be written as a set of hints. When the node comes back online, the coordinator hands off hints so that the node can catch up with the required writes.

  • Read Repair

    During the read path, a query assembles data from several nodes. The coordinator node for the read compares the data from each replica node. If any replica node has outdated data, the coordinator node sends it the most recent version. The scope of this type of repair depends on the read's consistency level, which is evaluated against the keyspace's replication factor. During a read, the database collects only enough replica data to satisfy the consistency level, and performs read repair only on the nodes that participate in that read operation.

  • Anti-Entropy Repair

    DataStax Enterprise provides the nodetool repair command to ensure data consistency across replicas; it compares the data across all replicas and then updates the data to the most recent version. Use nodetool repair as part of your regular maintenance routine.

    DataStax recommends stopping repair operations during topology changes; the Repair Service does this automatically. Repairs that run during a topology change are likely to fail when the change involves moving token ranges.
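
The hinted handoff and read repair behaviors described in the list above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not how DataStax Enterprise is implemented: the Replica and Coordinator classes, their method names, and the use of plain integer timestamps with last-write-wins resolution are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory replica: key -> (timestamp, value)."""
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)

    def apply(self, key, ts, value):
        # Last-write-wins: keep the version with the newest timestamp.
        if key not in self.data or self.data[key][0] < ts:
            self.data[key] = (ts, value)


class Coordinator:
    """Hypothetical coordinator that stores hints and performs read repair."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = []  # entries of (replica name, key, timestamp, value)

    def write(self, key, ts, value):
        for r in self.replicas:
            if r.up:
                r.apply(key, ts, value)
            else:
                # Hinted handoff: keep the write for the unavailable replica.
                self.hints.append((r.name, key, ts, value))

    def replay_hints(self):
        by_name = {r.name: r for r in self.replicas}
        remaining = []
        for name, key, ts, value in self.hints:
            target = by_name[name]
            if target.up:
                target.apply(key, ts, value)
            else:
                remaining.append((name, key, ts, value))
        self.hints = remaining

    def read(self, key, participants):
        # Only the replicas that take part in this read are compared.
        versions = [r.data[key] for r in participants if key in r.data]
        if not versions:
            return None
        newest = max(versions, key=lambda v: v[0])
        # Read repair: push the newest version to stale participants.
        for r in participants:
            r.apply(key, *newest)
        return newest[1]


a, b, c = Replica("a"), Replica("b"), Replica("c")
coord = Coordinator([a, b, c])

# Hinted handoff: b is down during a write and catches up afterwards.
coord.write("k", 1, "old")
b.up = False
coord.write("k", 2, "new")
b.up = True
coord.replay_hints()

# Read repair: only a and b take part in the read, so c stays stale.
a.apply("j", 5, "x")
b.apply("j", 3, "w")
c.apply("j", 3, "w")
result = coord.read("j", participants=[a, b])
```

In this model, replica c keeps its older version of key "j" after the read because it did not participate. Closing gaps like that one is the role of NodeSync or anti-entropy repair.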

