BATCH
Write multiple DML statements.
Synopsis
Apache Cassandra™ 2.1 and later
BEGIN UNLOGGED BATCH
USING TIMESTAMP timestamp
dml_statement;
dml_statement;
...
APPLY BATCH;
dml_statement is:
- INSERT
- UPDATE
- DELETE
A semicolon that terminates CQL statements is not included in the synopsis.
Description
A BATCH statement combines multiple data modification language (DML) statements (INSERT, UPDATE, DELETE) into a single logical operation, and can set a client-supplied timestamp for all columns written by the statements in the batch. Batching multiple statements can save network exchanges between the client and the server and between the coordinator and the replicas. However, because Cassandra is distributed, performance is usually best when requests are spread across nearby nodes as much as possible. Using batches to optimize performance is therefore usually not successful, as described in the Using and misusing batches section. For information about the fastest way to load data, see "Cassandra: Batch loading without the Batch keyword."
Batches are atomic by default. In the context of a Cassandra batch operation, atomic means that if any part of the batch succeeds, all of it will. To achieve atomicity, Cassandra first writes the serialized batch to the batchlog system table, which stores it as blob data. When the rows in the batch have been successfully written and persisted (or hinted), the batchlog data is removed. Atomicity carries a performance penalty. To avoid this penalty, prevent Cassandra from writing to the batchlog by using the UNLOGGED option: BEGIN UNLOGGED BATCH
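The difference between the default (logged) form and the UNLOGGED form can be sketched as follows. This is a minimal illustration, assuming the purchases table used in the examples on this page; the rows for 'user2', 'taxi', and 'dinner' are hypothetical values chosen to make the batch span two partitions.

```cql
-- Default (logged) batch: the serialized batch is written to the batchlog
-- first, so both partitions are eventually updated if any part succeeds.
BEGIN BATCH
  INSERT INTO purchases (user, expense_id, amount, description, paid)
    VALUES ('user1', 3, 15, 'taxi', false);
  INSERT INTO purchases (user, expense_id, amount, description, paid)
    VALUES ('user2', 1, 40, 'dinner', false);
APPLY BATCH;

-- UNLOGGED batch: the batchlog write is skipped, which avoids the
-- performance penalty but gives up the all-or-nothing guarantee
-- across the two partitions.
BEGIN UNLOGGED BATCH
  INSERT INTO purchases (user, expense_id, amount, description, paid)
    VALUES ('user1', 3, 15, 'taxi', false);
  INSERT INTO purchases (user, expense_id, amount, description, paid)
    VALUES ('user2', 1, 40, 'dinner', false);
APPLY BATCH;
```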
Although an atomic batch guarantees that if any part of the batch succeeds, all of it will, no other transactional enforcement is done at the batch level. For example, there is no batch isolation: clients can read the first updated rows from the batch while other rows are still being updated on the server. However, transactional row updates within a single partition key are isolated: clients cannot read a partial update.
Statement order does not matter within a batch; Cassandra applies all statements using the same timestamp. To achieve a particular order, give the individual statements client-supplied timestamps.
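As a sketch of ordering writes inside one batch, the following statements carry their own timestamps so that the UPDATE takes precedence over the INSERT for the paid column regardless of where each appears in the batch. It assumes the purchases table used in the examples on this page, that the row has no existing cells with later timestamps, and uses hypothetical timestamp values in microseconds.

```cql
BEGIN BATCH
  -- Earlier timestamp: the initial state of the expense row.
  INSERT INTO purchases (user, expense_id, amount, description, paid)
    VALUES ('user1', 6, 20, 'parking', false)
    USING TIMESTAMP 1481124356754405;
  -- Later timestamp: this value of paid supersedes the inserted one.
  UPDATE purchases USING TIMESTAMP 1481124356754406
    SET paid = true
    WHERE user = 'user1' AND expense_id = 6;
APPLY BATCH;
```

Per-statement timestamps like these are only accepted when the batch itself has no USING TIMESTAMP clause.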
Using a timestamp
BATCH supports setting a client-supplied timestamp, an integer, in the USING clause, with one exception: if the batch contains a compare-and-set (CAS) statement, that is, a conditional statement such as the following, do not attempt to use a timestamp:
INSERT INTO users (id, lastname) VALUES (999, 'Sparrow') IF NOT EXISTS
The timestamp applies to all statements in the batch. If not specified, the current time of the insertion (in microseconds) is used. Individual DML statements inside a BATCH can specify their own timestamps, but only if no timestamp is specified in the USING clause of the batch.
For example, specify a timestamp in an INSERT statement.
BEGIN BATCH
INSERT INTO purchases (user, balance) VALUES ('user1', -8) USING TIMESTAMP 19998889022757000;
INSERT INTO purchases (user, expense_id, amount, description, paid)
VALUES ('user1', 1, 8, 'burrito', false);
APPLY BATCH;
SELECT balance, WRITETIME(balance) FROM PURCHASES;
 balance | writetime_balance
---------+-------------------
      -8 | 19998889022757000
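For comparison, a timestamp can also be set once for the whole batch in the USING clause. The following is a minimal sketch, assuming the same purchases table; the timestamp and the inserted values are hypothetical. Every column written by the batch then reports the same write time.

```cql
BEGIN BATCH USING TIMESTAMP 1481124356754405
  INSERT INTO purchases (user, expense_id, amount, description, paid)
    VALUES ('user1', 4, 12, 'lunch', false);
  INSERT INTO purchases (user, expense_id, amount, description, paid)
    VALUES ('user1', 5, 30, 'museum', false);
APPLY BATCH;

-- Inspect the write time recorded for the amount column of each row.
SELECT expense_id, WRITETIME(amount) FROM purchases WHERE user = 'user1';
```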
Batching conditional updates
A batch that contains conditional updates can include only updates made to the same partition, because the underlying Paxos implementation works at the granularity of the partition. You can group updates that have conditions with those that do not, but when a single statement in a batch uses a condition, the entire batch is committed using a single Paxos proposal, as if all of the conditions contained in the batch apply. The following example shows batching of conditional updates; the first statement of each batch uses the IF conditional clause.
BEGIN BATCH
INSERT INTO purchases (user, balance) VALUES ('user1', -8) IF NOT EXISTS;
INSERT INTO purchases (user, expense_id, amount, description, paid)
VALUES ('user1', 1, 8, 'burrito', false);
APPLY BATCH;
BEGIN BATCH
UPDATE purchases SET balance = -208 WHERE user='user1' IF balance = -8;
INSERT INTO purchases (user, expense_id, amount, description, paid)
VALUES ('user1', 2, 200, 'hotel room', false);
APPLY BATCH;
A continuation of this example shows how to use a static column with conditional updates in a batch.
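The statements in these batches write balance without supplying an expense_id, which suggests that balance is a static column shared by all rows of a user. A table definition along the following lines would accept them; it is a sketch inferred from the statements on this page, and the column types are assumptions.

```cql
CREATE TABLE purchases (
  user text,
  balance int static,      -- one value per partition (per user)
  expense_id int,
  amount int,
  description text,
  paid boolean,
  PRIMARY KEY (user, expense_id)
);
```

With this layout, every statement for 'user1' targets the same partition, which is what allows the conditional and unconditional statements to be committed together.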
Batching counter updates
In Cassandra 2.1 and later, batches of counter updates should use UNLOGGED because, unlike other writes in Cassandra, a counter update is not an idempotent operation.
Cassandra 2.1 Example
BEGIN UNLOGGED BATCH
UPDATE UserActionCounts SET total = total + 2 WHERE keyalias = 523;
UPDATE AdminActionCounts SET total = total + 2 WHERE keyalias = 701;
APPLY BATCH;
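The counter tables themselves are not defined on this page. As a hedged sketch, definitions such as the following would accept the updates above; the key type is an assumption, and in a counter table every column outside the primary key must be a counter.

```cql
CREATE TABLE UserActionCounts (
  keyalias int PRIMARY KEY,
  total counter
);

CREATE TABLE AdminActionCounts (
  keyalias int PRIMARY KEY,
  total counter
);

-- Read back a counter after the batch has been applied.
SELECT total FROM UserActionCounts WHERE keyalias = 523;
```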