BATCH
Applies multiple data modification language (DML) statements with atomicity and/or in isolation.
Combines multiple data modification language (DML) statements (such as INSERT, UPDATE, and DELETE) to achieve atomicity and isolation when targeting a single partition, or only atomicity when targeting multiple partitions.
For multiple partition batches, logging ensures that all DML statements are applied. Either all or none of the batch operations will succeed, ensuring atomicity. Batch isolation occurs only if the batch operation is writing to a single partition.
Optionally, a batch can apply a client-supplied timestamp. Before implementing or executing a batch see Batching inserts and updates.
Batches are not isolated among client programs. Other client programs can read the first modified rows from the batch while the other remaining statements in the batch are in progress. There is no batch rollback functionality, which means that a batch cannot be undone.
Synopsis
BEGIN [ ( UNLOGGED | COUNTER ) ] BATCH [ USING TIMESTAMP [ epoch_microseconds ] ] dml_statement [ USING TIMESTAMP [ epoch_microseconds ] ] ; [ dml_statement [ USING TIMESTAMP [ epoch_microseconds ] ] [ ; ... ] ] APPLY BATCH ;
Syntax conventions | Description |
---|---|
UPPERCASE | Literal keyword. |
Lowercase | Not literal. |
Italics |
Variable value. Replace with a user-defined value. |
[] |
Optional. Square brackets ( [] ) surround
optional command arguments. Do not type the square brackets. |
( ) |
Group. Parentheses ( ( ) ) identify a group to
choose from. Do not type the parentheses. |
| |
Or. A vertical bar ( | ) separates alternative
elements. Type any one of the elements. Do not type the vertical
bar. |
... |
Repeatable. An ellipsis ( ... ) indicates that
you can repeat the syntax element as often as required. |
'Literal string' |
Single quotation ( ' ) marks must surround
literal strings in CQL statements. Use single quotation marks to
preserve upper case. |
{ key : value
} |
Map collection. Braces ( { } ) enclose map
collections or key value pairs. A colon separates the key and the
value. |
<datatype1,datatype2> |
Set, list, map, or tuple. Angle brackets ( <
> ) enclose data types in a set, list, map, or tuple.
Separate the data types with a comma. |
cql_statement; |
End CQL statement. A semicolon ( ; ) terminates
all CQL statements. |
[--] |
Separate the command line options from the command arguments with
two hyphens ( -- ). This syntax is useful when
arguments might be mistaken for command line options. |
' <schema> ... </schema>
' |
Search CQL only: Single quotation marks ( ' )
surround an entire XML schema declaration. |
@xml_entity='xml_entity_type' |
Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrConfig files. |
A batch can contain these dml_statements:
- UNLOGGED | COUNTER
-
If
UNLOGGED
is not specified, the batch is logged. If multiple partitions are involved, batches are logged by default. A logged batch ensures that all or none of the batch operations succeed (atomicity). First the serialized batch is written to the batchlog system table which consumes the serialized batch as blob data. After a successful write, the rows are persisted (or hinted) and the batchlog data is removed. Logging incurs a performance penalty, the batchlog is written to two other nodes. Options for thresholds, warning about or failure due to batch size, are available.UNLOGGED
runs the batch without logging penalties. Unlogged batching issues a warning when too many operations or too many partitions are involved. Single partition batch operations are unlogged by default, and are the only unlogged batch operations recommended.Although a logged batch enforces atomicity (that is, it guarantees if all DML statements in the batch succeed or none do), there is no other transactional enforcement at the batch level. For example, there is no batch isolation unless the batch operation is writing to a single partition. In multiple partition batch operations, clients are able to read the first updated rows from the batch, while other rows are still being updated on the server. In single partition batch operations, clients cannot read a partial update from any row until the batch is completed.
Use the
COUNTER
option for batched counter updates. Unlike other updates, counter updates are not idempotent. - USING TIMESTAMPS
-
Sets the write time for transactions executed in a BATCH.Restriction:
USING TIMESTAMP
does not support LWT (lightweight transactions), such as DML statements that have anIF NOT EXISTS
clause.By default, the database applies the same timestamp to all data modified by the batch; therefore statement order does not matter within a batch, thus a batch statement is not very useful for writing data that must be timestamped in a particular order. Use client-supplied timestamps to achieve a particular order.
User-defined timestamp
Specify the epoch time in microseconds after
USING TIMESTAMP [epoch_microseconds]
. When the time is not specified, the database inserts the current time.To use the same timestamp for all DMLs, insert on first line of batch:BEGIN BATCH USING TIMESTAMP [ epoch_microseconds ] DML_statement1 ; DML_statement2 ; DML_statement3 ; APPLY BATCH ;
For individual transactions, insert at the end of a DML:BEGIN BATCH DML_statement1 ; DML_statement2 USING TIMESTAMP [ epoch_microseconds ] ; DML_statement3 ; APPLY BATCH ;
Examples
This section uses the cyclist_expenses and popular_count tables.
Applying a client supplied timestamp to all DMLs
BEGIN BATCH USING TIMESTAMP 1481124356754405 INSERT INTO cycling.cyclist_expenses ( cyclist_name, expense_id, amount, description, paid ) VALUES ( 'Vera ADRIAN', 2, 13.44, 'Lunch', true ); INSERT INTO cycling.cyclist_expenses ( cyclist_name, expense_id, amount, description, paid ) VALUES ( 'Vera ADRIAN', 3, 25.00, 'Dinner', true ); APPLY BATCH;
cqlsh
:EXPAND ON
SELECT cyclist_name, expense_id, amount, WRITETIME(amount), description, WRITETIME(description), paid, WRITETIME(paid) FROM cycling.cyclist_expenses WHERE cyclist_name = 'Vera ADRIAN';
@ Row 1
------------------------+------------------
cyclist_name | Vera ADRIAN
expense_id | 2
amount | 13.44
writetime(amount) | 1481124356754405
description | Lunch
writetime(description) | 1481124356754405
paid | True
writetime(paid) | 1481124356754405
@ Row 2
------------------------+------------------
cyclist_name | Vera ADRIAN
expense_id | 3
amount | 25
writetime(amount) | 1481124356754405
description | Dinner
writetime(description) | 1481124356754405
paid | True
writetime(paid) | 1481124356754405
(2 rows)
If any DML statement in the batch uses compare-and-set (CAS) logic, an error is returned.
For example, the following batch with the CAS IF NOT EXISTS
option
returns an error:
BEGIN BATCH USING TIMESTAMP 1481124356754405 INSERT INTO cycling.cyclist_expenses ( cyclist_name, expense_id, amount, description, paid ) VALUES ( 'Vera ADRIAN', 2, 13.44, 'Lunch', true ); INSERT INTO cycling.cyclist_expenses ( cyclist_name, expense_id, amount, description, paid ) VALUES ( 'Vera ADRIAN', 3, 25.00, 'Dinner', false ) IF NOT EXISTS; APPLY BATCH;
InvalidRequest: Error from server: code=2200 [Invalid query]
message="Cannot provide custom timestamp for conditional BATCH"
Batching conditional updates
Batch conditional updates introduced as lightweight transactions. However, a batch containing conditional updates can operate only within a single partition, because the underlying Paxos implementation only works at partition-level granularity. If one statement in a batch is a conditional update, the conditional logic must return true, or the entire batch fails. If the batch contains two or more conditional updates, all the conditions must return true, or the entire batch fails.
IF NOT EXISTS
conditional
clause.BEGIN BATCH INSERT INTO cycling.cyclist_expenses ( cyclist_name, expense_id ) VALUES ( 'Joe WALLS', 1 ) IF NOT EXISTS; INSERT INTO cycling.cyclist_expenses ( cyclist_name, expense_id, amount, description, paid ) VALUES ( 'Joe WALLS', 1, 8, 'burrito', false ); APPLY BATCH;Conditional batches cannot provide custom timestamps.
UPDATE
and
DELETE
statements within a conditional batch cannot use
IN
conditions to filter rows.A continuation of this example shows how to use a static column with conditional updates in batch.
Batching counter updates
COUNTER
option because, unlike other
writes in DataStax Enterprise, a counter update is not an idempotent operation.
BEGIN COUNTER BATCH UPDATE cycling.popular_count SET popularity = popularity + 1 WHERE id = 6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47; UPDATE cycling.popular_count SET popularity = popularity + 125 WHERE id = 6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47; UPDATE cycling.popular_count SET popularity = popularity - 64 WHERE id = 6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47; APPLY BATCH;
Counter batches cannot include non-counter columns in the DML statements, just as a non-counter batch cannot include counter columns. Counter batch statements cannot provide custom timestamps.