Read Repair: repair during read path

Describes read repair, repair during read path.

Read repair is an important component of keeping data consistent in a Cassandra cluster, because every time a read request occurs, it provides an opportunity for consistency improvement. As a background process, read repair generally puts little strain on the cluster.

In read repair, Cassandra sends a digest request to each replica not directly involved in the read. Cassandra compares all replicas and writes the most recent version to any replica node that does not have it. If the query's consistency level is above ONE, Cassandra performs this process on all replica nodes in the foreground before the data is returned to the client. Read repair repairs any node queried by the read. This means that for a consistency level of ONE, no data is repaired because no comparison takes place. For QUORUM, only the nodes that the query touches are repaired, not all nodes.

The compaction strategy DateTieredCompactionStrategy precludes using read repair, because of the way timestamps are checked for DTCS compaction. In this case, you must set read_repair_chance to zero. For other compaction strategies, read repair should be enabled with a read_repair_chance value of 0.2 being typical.