Configuring data consistency

Consistency levels in Cassandra can be configured to manage availability versus data accuracy.

Consistency levels in Cassandra can be configured to manage availability versus data accuracy. You can configure consistency on a cluster, datacenter, or individual I/O operation basis. Consistency among participating nodes can be set globally and also controlled on a per-operation basis (for example insert or update) using Cassandra’s drivers and client libraries.

Write consistency levels

This table describes the write consistency levels in strongest-to-weakest order.

Write Consistency Levels
Level Description Usage
ALL A write must be written to the commit log and memtable on all replica nodes in the cluster for that partition. Provides the highest consistency and the lowest availability of any other level.
EACH_QUORUM Strong consistency. A write must be written to the commit log and memtable on a quorum of replica nodes in all datacenter. Used in multiple datacenter clusters to strictly maintain consistency at the same level in each datacenter. For example, choose this level if you want a read to fail when a datacenter is down and the QUORUM cannot be reached on that datacenter.
QUORUM A write must be written to the commit log and memtable on a quorum of replica nodes. Provides strong consistency if you can tolerate some level of failure.
LOCAL_QUORUM Strong consistency. A write must be written to the commit log and memtable on a quorum of replica nodes in the same datacenter as the coordinator. Avoids latency of inter-datacenter communication. Used in multiple datacenter clusters with a rack-aware replica placement strategy, such as NetworkTopologyStrategy, and a properly configured snitch. Use to maintain consistency locally (within the single datacenter). Can be used with SimpleStrategy.
ONE A write must be written to the commit log and memtable of at least one replica node. Satisfies the needs of most users because consistency requirements are not stringent.
TWO A write must be written to the commit log and memtable of at least two replica nodes. Similar to ONE.
THREE A write must be written to the commit log and memtable of at least three replica nodes. Similar to TWO.
LOCAL_ONE A write must be sent to, and successfully acknowledged by, at least one replica node in the local datacenter. In a multiple datacenter clusters, a consistency level of ONE is often desirable, but cross-DC traffic is not. LOCAL_ONE accomplishes this. For security and quality reasons, you can use this consistency level in an offline datacenter to prevent automatic connection to online nodes in other datacenters if an offline node goes down.
ANY A write must be written to at least one node. If all replica nodes for the given partition key are down, the write can still succeed after a hinted handoff has been written. If all replica nodes are down at write time, an ANY write is not readable until the replica nodes for that partition have recovered. Provides low latency and a guarantee that a write never fails. Delivers the lowest consistency and highest availability.
SERIAL Achieves linearizable consistency for lightweight transactions by preventing unconditional updates. You cannot configure this level as a normal consistency level, configured at the driver level using the consistency level field. You configure this level using the serial consistency field as part of the native protocol operation. See failure scenarios.
LOCAL_SERIAL Same as SERIAL but confined to the datacenter. A write must be written conditionally to the commit log and memtable on a quorum of replica nodes in the same datacenter. Same as SERIAL. Used for disaster recovery. See failure scenarios.

SERIAL and LOCAL_SERIAL write failure scenarios

If one of three nodes is down, the Paxos commit fails under the following conditions:
  • CQL query-configured consistency level of ALL
  • Driver-configured serial consistency level of SERIAL
  • Replication factor of 3

A WriteTimeout with a WriteType of CAS occurs and further reads do not see the write. If the node goes down in the middle of the operation instead of before the operation started, the write is committed, the value is written to the live nodes, and a WriteTimeout with a WriteType of SIMPLE occurs.

Under the same conditions, if two of the nodes are down at the beginning of the operation, the Paxos commit fails and nothing is committed. If the two nodes go down after the Paxos proposal is accepted, the write is committed to the remaining live nodes and written there, but a WriteTimeout with WriteType SIMPLE is returned.

Read consistency levels

This table describes read consistency levels in strongest-to-weakest order.

Read Consistency Levels
Level Description Usage
ALL Returns the record after all replicas have responded. The read operation will fail if a replica does not respond. Provides the highest consistency of all levels and the lowest availability of all levels.
EACH_QUORUM Not supported for reads. Not supported for reads.
QUORUM Returns the record after a quorum of replicas has responded from any datacenter. Ensures strong consistency if you can tolerate some level of failure.
LOCAL_QUORUM Returns the record after a quorum of replicas in the current datacenter as the coordinator node has reported. Avoids latency of inter-datacenter communication. Used in multiple datacenter clusters with a rack-aware replica placement strategy ( NetworkTopologyStrategy) and a properly configured snitch. Fails when using SimpleStrategy.
ONE Returns a response from the closest replica, as determined by the snitch. By default, a read repair runs in the background to make the other replicas consistent. Provides the highest availability of all the levels if you can tolerate a comparatively high probability of stale data being read. The replicas contacted for reads may not always have the most recent write.
TWO Returns the most recent data from two of the closest replicas. Similar to ONE.
THREE Returns the most recent data from three of the closest replicas. Similar to TWO.
LOCAL_ONE Returns a response from the closest replica in the local datacenter. Same usage as described in the table about write consistency levels.
SERIAL Allows reading the current (and possibly uncommitted) state of data without proposing a new addition or update. If a SERIAL read finds an uncommitted transaction in progress, it will commit the transaction as part of the read. Similar to QUORUM. To read the latest value of a column after a user has invoked a lightweight transaction to write to the column, use SERIAL. Cassandra then checks the inflight lightweight transaction for updates and, if found, returns the latest data.
LOCAL_SERIAL Same as SERIAL, but confined to the datacenter. Similar to LOCAL_QUORUM. Used to achieve linearizable consistency for lightweight transactions.

About the QUORUM levels

The QUORUM level writes to the number of nodes that make up a quorum. A quorum is calculated, and then rounded down to a whole number, as follows:

quorum = (sum_of_replication_factors / 2) + 1

The sum of all the replication_factor settings for each datacenter is the sum_of_replication_factors.

sum_of_replication_factors = datacenter1_RF + datacenter2_RF + . . . + datacentern_RF
Examples:
  • Using a replication factor of 3, a quorum is 2 nodes. The cluster can tolerate 1 replica down.
  • Using a replication factor of 6, a quorum is 4. The cluster can tolerate 2 replicas down.
  • In a two datacenter cluster where each datacenter has a replication factor of 3, a quorum is 4 nodes. The cluster can tolerate 2 replica nodes down.
  • In a five datacenter cluster where two datacenters have a replication factor of 3 and three datacenters have a replication factor of 2, a quorum is 7 nodes.

The more datacenters, the higher number of replica nodes need to respond for a successful operation.

If consistency is a top priority, you can ensure that a read always reflects the most recent write by using the following formula:

(nodes_written + nodes_read) > replication_factor 

For example, if your application is using the QUORUM consistency level for both write and read operations and you are using a replication factor of 3, then this ensures that 2 nodes are always written and 2 nodes are always read. The combination of nodes written and read (4) being greater than the replication factor (3) ensures strong read consistency.

Similar to QUORUM, the LOCAL_QUORUM level is calculated based on the replication factor of the same datacenter as the coordinator node. That is, even if the cluster has more than one datacenter, the quorum is calculated only with local replica nodes.

In EACH_QUORUM, every datacenter in the cluster must reach a quorum based on that datacenter's replication factor in order for the read or write request to succeed. That is, for every datacenter in the cluster a quorum of replica nodes must respond to the coordinator node in order for the read or write request to succeed.

Configuring client consistency levels

You can use a new cqlsh command, CONSISTENCY, to set the consistency level for queries from the current cqlsh session. The WITH CONSISTENCY clause has been removed from CQL commands. You set the consistency level programmatically (at the driver level). For example, call QueryBuilder.insertInto with a setConsistencyLevel argument. The consistency level defaults to ONE for all write and read operations.