Configuring data consistency
Consistency levels in Cassandra can be configured to manage availability versus data accuracy. You can configure consistency on a cluster, datacenter, or individual I/O operation basis. Consistency among participating nodes can be set globally, and it can also be controlled per operation (for example, an insert or update) using Cassandra's drivers and client libraries.
Write consistency levels
This table describes the write consistency levels in strongest-to-weakest order.
Level | Description | Usage |
---|---|---|
ALL | A write must be written to the commit log and memtable on all replica nodes in the cluster for that partition. | Provides the highest consistency and the lowest availability of all levels. |
EACH_QUORUM | Strong consistency. A write must be written to the commit log and memtable on a quorum of replica nodes in each datacenter. | Used in multiple datacenter clusters to strictly maintain consistency at the same level in each datacenter. For example, choose this level if you want a write to fail when a datacenter is down and a quorum cannot be reached in that datacenter. |
QUORUM | A write must be written to the commit log and memtable on a quorum of replica nodes. | Provides strong consistency if you can tolerate some level of failure. |
LOCAL_QUORUM | Strong consistency. A write must be written to the commit log and memtable on a quorum of replica nodes in the same datacenter as the coordinator. Avoids latency of inter-datacenter communication. | Used in multiple datacenter clusters with a rack-aware replica placement strategy, such as NetworkTopologyStrategy, and a properly configured snitch. Use it to maintain consistency locally (within a single datacenter). Can be used with SimpleStrategy. |
ONE | A write must be written to the commit log and memtable of at least one replica node. | Satisfies the needs of most users because consistency requirements are not stringent. |
TWO | A write must be written to the commit log and memtable of at least two replica nodes. | Similar to ONE. |
THREE | A write must be written to the commit log and memtable of at least three replica nodes. | Similar to TWO. |
LOCAL_ONE | A write must be sent to, and successfully acknowledged by, at least one replica node in the local datacenter. | In multiple datacenter clusters, a consistency level of ONE is often desirable, but cross-DC traffic is not. LOCAL_ONE accomplishes this. For security and quality reasons, you can use this consistency level in an offline datacenter to prevent automatic connection to online nodes in other datacenters if an offline node goes down. |
ANY | A write must be written to at least one node. If all replica nodes for the given partition key are down, the write can still succeed after a hinted handoff has been written. If all replica nodes are down at write time, an ANY write is not readable until the replica nodes for that partition have recovered. | Provides low latency and a guarantee that a write never fails. Delivers the lowest consistency and highest availability. |
SERIAL | Achieves linearizable consistency for lightweight transactions by preventing unconditional updates. | You cannot set this level through the normal consistency level field at the driver level. Instead, set it using the serial consistency field as part of the native protocol operation. See failure scenarios. |
LOCAL_SERIAL | Same as SERIAL but confined to the datacenter. A write must be written conditionally to the commit log and memtable on a quorum of replica nodes in the same datacenter. | Same as SERIAL. Used for disaster recovery. See failure scenarios. |
SERIAL and LOCAL_SERIAL write failure scenarios
If one of three nodes is down, the Paxos commit fails under the following conditions:
- CQL query-configured consistency level of ALL
- Driver-configured serial consistency level of SERIAL
- Replication factor of 3
A WriteTimeout with a WriteType of CAS occurs, and further reads do not see the write. If the node goes down in the middle of the operation instead of before the operation started, the write is committed, the value is written to the live nodes, and a WriteTimeout with a WriteType of SIMPLE occurs.
Under the same conditions, if two of the nodes are down at the beginning of the operation, the Paxos commit fails and nothing is committed. If the two nodes go down after the Paxos proposal is accepted, the write is committed to the remaining live node, but a WriteTimeout with a WriteType of SIMPLE is returned.
Read consistency levels
This table describes read consistency levels in strongest-to-weakest order.
Level | Description | Usage |
---|---|---|
ALL | Returns the record after all replicas have responded. The read operation fails if a replica does not respond. | Provides the highest consistency and the lowest availability of all levels. |
EACH_QUORUM | Not supported for reads. | Not supported for reads. |
QUORUM | Returns the record after a quorum of replicas has responded from any datacenter. | Ensures strong consistency if you can tolerate some level of failure. |
LOCAL_QUORUM | Returns the record after a quorum of replicas in the same datacenter as the coordinator node has responded. Avoids latency of inter-datacenter communication. | Used in multiple datacenter clusters with a rack-aware replica placement strategy (NetworkTopologyStrategy) and a properly configured snitch. Fails when using SimpleStrategy. |
ONE | Returns a response from the closest replica, as determined by the snitch. By default, a read repair runs in the background to make the other replicas consistent. | Provides the highest availability of all the levels if you can tolerate a comparatively high probability of stale data being read. The replicas contacted for reads may not always have the most recent write. |
TWO | Returns the most recent data from two of the closest replicas. | Similar to ONE. |
THREE | Returns the most recent data from three of the closest replicas. | Similar to TWO. |
LOCAL_ONE | Returns a response from the closest replica in the local datacenter. | Same usage as described in the table about write consistency levels. |
SERIAL | Allows reading the current (and possibly uncommitted) state of data without proposing a new addition or update. If a SERIAL read finds an uncommitted transaction in progress, it commits the transaction as part of the read. Similar to QUORUM. | To read the latest value of a column after a user has invoked a lightweight transaction to write to the column, use SERIAL. Cassandra then checks the in-flight lightweight transaction for updates and, if found, returns the latest data. |
LOCAL_SERIAL | Same as SERIAL, but confined to the datacenter. Similar to LOCAL_QUORUM. | Used to achieve linearizable consistency for lightweight transactions. |
About the QUORUM levels
The QUORUM level writes to the number of nodes that make up a quorum. A quorum is calculated as follows, with the result rounded down to a whole number:
quorum = (sum_of_replication_factors / 2) + 1
The sum of all the replication_factor settings for each datacenter is the sum_of_replication_factors.
sum_of_replication_factors = datacenter1_RF + datacenter2_RF + ... + datacenterN_RF
- Using a replication factor of 3, a quorum is 2 nodes. The cluster can tolerate 1 replica down.
- Using a replication factor of 6, a quorum is 4. The cluster can tolerate 2 replicas down.
- In a two datacenter cluster where each datacenter has a replication factor of 3, a quorum is 4 nodes. The cluster can tolerate 2 replica nodes down.
- In a five datacenter cluster where two datacenters have a replication factor of 3 and three datacenters have a replication factor of 2, a quorum is 7 nodes.
The more datacenters a cluster has, the more replica nodes must respond for an operation to succeed.
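The quorum arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch of the formula: the function names `quorum` and `tolerated_failures` are hypothetical helpers and are not part of Cassandra or any driver.

```python
def quorum(replication_factors):
    """Quorum size for a list of per-datacenter replication factors.

    Implements quorum = (sum_of_replication_factors / 2) + 1,
    rounded down to a whole number.
    """
    total = sum(replication_factors)
    return total // 2 + 1


def tolerated_failures(replication_factors):
    """Number of replicas that can be down while a quorum is still reachable."""
    return sum(replication_factors) - quorum(replication_factors)


# The four examples from the list above (hypothetical labels).
examples = {
    "one datacenter, RF 3": [3],
    "one datacenter, RF 6": [6],
    "two datacenters, RF 3 each": [3, 3],
    "five datacenters, RF 3, 3, 2, 2, 2": [3, 3, 2, 2, 2],
}

for label, rfs in examples.items():
    print(f"{label}: quorum={quorum(rfs)}, tolerates={tolerated_failures(rfs)} down")
```

Integer division (`//`) performs the rounding down, so an odd total such as 3 gives 1 + 1 = 2 rather than 2.5.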
If consistency is a top priority, you can ensure that a read always reflects the most recent write by using the following formula:
(nodes_written + nodes_read) > replication_factor
For example, if your application uses the QUORUM consistency level for both write and read operations and you are using a replication factor of 3, then 2 nodes are always written and 2 nodes are always read. The combination of nodes written and read (4) being greater than the replication factor (3) ensures strong read consistency.
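As a rough illustration of the formula (nodes_written + nodes_read) > replication_factor, the sketch below checks a few write/read level combinations. It assumes a single datacenter and covers only the count-based levels; `replicas_for` and `reads_see_latest_write` are hypothetical names, not driver functions.

```python
def quorum(replication_factor):
    """Quorum size for one replication factor, rounded down."""
    return replication_factor // 2 + 1


def replicas_for(level, replication_factor):
    """Replicas that must respond for a level (single-datacenter assumption)."""
    counts = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return counts[level]


def reads_see_latest_write(write_level, read_level, replication_factor):
    """True when nodes_written + nodes_read > replication_factor."""
    written = replicas_for(write_level, replication_factor)
    read = replicas_for(read_level, replication_factor)
    return written + read > replication_factor


rf = 3
for write_level, read_level in [("QUORUM", "QUORUM"), ("ONE", "ONE"),
                                ("ALL", "ONE"), ("ONE", "QUORUM")]:
    ok = reads_see_latest_write(write_level, read_level, rf)
    print(f"write={write_level}, read={read_level}, RF={rf}: overlap guaranteed={ok}")
```

With a replication factor of 3, QUORUM writes paired with QUORUM reads satisfy the inequality (2 + 2 > 3), while ONE paired with ONE does not (1 + 1 is not greater than 3).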
Similar to QUORUM, the LOCAL_QUORUM level is calculated based on the replication factor of the same datacenter as the coordinator node. That is, even if the cluster has more than one datacenter, the quorum is calculated only with local replica nodes.
In EACH_QUORUM, every datacenter in the cluster must reach a quorum based on that datacenter's replication factor for the request to succeed. That is, a quorum of replica nodes in each datacenter must respond to the coordinator node.
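The difference between the three quorum levels can be sketched by counting live replicas per datacenter. This is a simplified model, not how a coordinator is implemented: it only compares counts, and the names `can_satisfy`, `rf_by_dc`, `live_by_dc`, and the datacenter labels are assumptions made for illustration.

```python
def quorum(replication_factor):
    """Quorum size for a replication factor (or a sum of them), rounded down."""
    return replication_factor // 2 + 1


def can_satisfy(level, rf_by_dc, live_by_dc, local_dc):
    """Whether enough replicas are alive to meet a quorum-based level.

    rf_by_dc   -- replication factor per datacenter
    live_by_dc -- live replicas for the partition per datacenter
    local_dc   -- datacenter of the coordinator node
    """
    if level == "QUORUM":
        # Quorum over the sum of all replication factors, from any datacenter.
        return sum(live_by_dc.values()) >= quorum(sum(rf_by_dc.values()))
    if level == "LOCAL_QUORUM":
        # Quorum computed only from the coordinator's datacenter.
        return live_by_dc[local_dc] >= quorum(rf_by_dc[local_dc])
    if level == "EACH_QUORUM":
        # Every datacenter must independently reach its own quorum.
        return all(live_by_dc[dc] >= quorum(rf) for dc, rf in rf_by_dc.items())
    raise ValueError(f"not a quorum-based level: {level}")


# Two datacenters with RF 3 each; two replicas in dc2 are down.
rf_by_dc = {"dc1": 3, "dc2": 3}
live_by_dc = {"dc1": 3, "dc2": 1}

for level in ("QUORUM", "LOCAL_QUORUM", "EACH_QUORUM"):
    result = can_satisfy(level, rf_by_dc, live_by_dc, local_dc="dc1")
    print(f"{level} from dc1: {result}")
```

In this scenario QUORUM still succeeds because 4 of 6 replicas are alive, LOCAL_QUORUM succeeds for a coordinator in dc1 but not in dc2, and EACH_QUORUM fails because dc2 cannot reach its own quorum of 2.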
Configuring client consistency levels
You can use a new cqlsh command, CONSISTENCY, to set the consistency level for queries from the current cqlsh session. The WITH CONSISTENCY clause has been removed from CQL commands. In applications, you set the consistency level programmatically (at the driver level). For example, build a statement with QueryBuilder.insertInto and call setConsistencyLevel on it. The consistency level defaults to ONE for all write and read operations.