• Glossary
  • Support
  • Downloads
  • DataStax Home
Get Live Help
Expand All
Collapse All

CQL for Astra DB Serverless

  • Overview
    • Cassandra structure
    • About CQL
    • CQL quick reference
    • Known issues
  • Getting started
    • Vector Search Quickstart
    • CQL Quickstart
    • SAI Quickstart
  • Planning
    • Data modeling
      • Data modeling concepts
      • Data modeling analysis
  • Connecting
    • cqlsh (startup options)
      • Accessing data using CQL
        • Starting the standalone cqlsh tool
        • Starting cqlsh
    • Developing
      • Introduction to CQL
      • Managing keyspaces
        • Keyspace concepts
        • Check a keyspace
      • Managing tables
        • Table concepts
        • Data types
        • Create a table
        • Check a table
        • Alter a table
        • Truncate a table
        • Drop a table
      • Managing columns
        • Managing columns
        • Create a table column
        • Check column existence
        • Alter a table column
        • Drop a column
        • User-defined type (UDT) overview
          • Create a user-defined type (UDT)
          • Check for user-defined type (UDT) existence
          • Alter a user-defined type (UDT)
          • Drop a user-defined type (UDT)
      • Indexing Tables
        • Indexing concepts
        • Storage-Attached Indexing (SAI) Overview
          • SAI concepts
          • SAI Quickstart
          • SAI FAQ
          • Create SAI index
          • Alter SAI index
          • Drop SAI index
          • Querying with SAI
          • SAI write path and read path
        • Secondary indexes (2i) overview
          • Secondary index (2i) concepts
          • When to use an index
          • Building and maintaining secondary indexes (2i)
          • Create a secondary index (2i)
            • Create a 2i on a collection column
          • Check secondary index (2i) existence
          • Alter a secondary index (2i)
          • Drop a secondary index (2i)
      • Inserting and updating data
        • Insert simple data
        • Set column
        • List fields
        • Map fields
        • Tuple
        • User-defined type (UDT) column
        • Insert static data
        • UUID and timeuuid column
        • Date column
        • Time column
        • Timestamp column
        • Duration column
        • Update counter
        • Lightweight transactions (LWTs)
        • Inserting JSON formatted values
        • Batching inserts and updates
          • Batching inserts, updates, and deletes
          • Good use of BATCH statement
          • Misuse of BATCH statement
        • Importing data from a CSV file
        • Setting the time-to-live (TTL) for value
          • Inserting, updating, and deleting data
      • Deleting data
        • Deleting values from a column or entire row
        • Expiring data with time-to-live
        • Expiring data with TTL example
      • Querying and retrieving data
        • Using a SELECT statement
        • Restricting queries using WHERE clauses
          • Partition keys
          • Clustering columns
          • Retrieval using the IN keyword
          • Sorting and limiting results
        • Returning data from collection columns
        • Aggregating results
        • Using user-defined functions (UDF)
        • Applying user-defined aggregate (UDA) functions
          • Testing performance impact using tracing
        • Formatting query results as JSON
        • Retrieving the write timestamp
        • Using tracing
          • Tracing performance
        • DSE Search index querying
        • Search index filtering best practices
          • Search index syntax
          • search-index:si-query-syntax.adoc#siIndexedColumns
          • search-index:si-query-syntax.adoc#basicSolrSyntax
          • search-index:si-query-syntax.adoc#queriesJSON
          • search-index:si-query-syntax.adoc#siEscapeCharacters
        • Native CQL search queries
        • Identifying the partition key
        • Filtering on terms
          • Filtering on words, phrases, or substrings
          • Fuzzy search
        • UDT query examples
        • Querying a CQL collection set
        • Using date ranges in solr_query
        • Geospatial queries for Point and LineString
        • Spatial queries with polygons require JTS
        • Using dynamic fields
        • Limiting results and paging
        • Restricted query routing
    • Securing overview
      • Security quickstart
      • Using cqlsh with authentication
      • System tables for authentication and authorization
  • Cycling Examples
    • Setting up the cycling keyspace
    • User-defined types (UDTs)
      • basic_info
      • basic_info_withTTL
      • fullname
      • race
    • User-defined functions (UDFs) and user-defined aggregates (UDAs)
      • flog
      • left
      • average_state
      • average_final
      • Cycling user-defined aggregate team_average
    • Tables
      • birthday_list
      • calendar
      • comments
      • country_flag
      • cyclist_alt_stats
      • cyclist_base
      • cyclist_career_teams
      • cyclist_category
      • cyclist_emails
      • cyclist_expenses
      • cyclist_id
      • cyclist_name
      • cyclist_points
      • cyclist_races
      • cyclist_sponsors
      • cyclist_stats
      • cyclist_teams
      • events
      • popular_count
      • race_sponsors
      • race_starts
      • race_times
      • race_winners
      • rank_by_year_and_name
      • team_average
      • upcoming_calendar
    • Materialized views (MVs)
      • cyclist_by_age-mv
    • Search indexes
      • Search index examples
      • comments search index
    • Access control
      • Cycling internal
    • Cycling queries
      • Aggregating cycling data
    • Reference
      • CQL quick reference
      • Understanding the CQL command syntax
      • Data types
        • Creating a blob column
        • Date, time, and timestamp format
      • Operators
      • CQL native functions
      • CQL native aggregates
      • System and virtual tables
        • System tables
          • Keyspace tables
          • Querying system tables
          • Functions, aggregates, and user types
        • Virtual tables
          • Virtual keyspaces and tables
          • Virtual tables for SAI indexes and SSTables
      • cqlsh (startup options)
        • Configuring a cqlshrc file
        • Starting cqlsh on a search node
      • CQL shell (cqlsh) reference
        • CAPTURE
        • CLEAR
        • CONSISTENCY
        • COPY TO
        • COPY FROM
        • DESCRIBE AGGREGATE
        • DESCRIBE CLUSTER
        • DESCRIBE SCHEMA
        • DESCRIBE FUNCTION
        • DESCRIBE INDEX
        • DESCRIBE KEYSPACE
        • DESCRIBE MATERIALIZED VIEW
        • DESCRIBE SEARCH INDEX
        • DESCRIBE TABLE
        • DESCRIBE TYPE
        • EXECUTE AS
        • EXPAND
        • EXIT
        • LOGIN
        • PAGING
        • SERIAL CONSISTENCY
        • SHOW
        • SOURCE
        • TIMING
        • TRACING
      • CQL commands
        • ALTER TABLE
        • ALTER TYPE
        • ALTER USER (Deprecated)
        • BATCH
        • CREATE AGGREGATE
        • CREATE CUSTOM INDEX
        • CREATE INDEX
        • CREATE TABLE
        • CREATE TYPE
        • CREATE USER (Deprecated)
        • DELETE
        • DROP INDEX
        • DROP TABLE
        • DROP TYPE
        • DROP USER (Deprecated)
        • GRANT
        • INSERT
        • RESTRICT
        • REVOKE
        • SELECT
        • TRUNCATE
        • UNRESTRICT
        • UNRESTRICT ROWS
        • UPDATE
        • USE
  • CQL for Astra DB Serverless
  • Reference
  • CQL commands
  • BATCH
Edit this Page

BATCH

Atomicity and Batches

Combines multiple data modification language (DML) statements (such as INSERT, UPDATE, and DELETE) to achieve atomicity and isolation when targeting a single partition, or only atomicity when targeting multiple partitions.

See Batching inserts, updates, and deletes.

A batch applies all DML statements within a single partition before the data is available, ensuring atomicity and isolation. A well-constructed batch targeting a single partition can reduce client-server traffic and more efficiently update a table with a single row mutation.

If there are two different tables in the same keyspace and the two tables have the same partition key, this scenario is considered a single partition batch. There will be a single mutation for each table. This happens because the two tables could have different columns, even though the keyspace and partition are the same. Batches allow a caller to bundle multiple operations into a single batch request. All the operations are performed by the same coordinator. The best use of a batch request is for a single partition in multiple tables in the same keyspace. Also, batches provide a guarantee that mutations will be applied in a particular order.

For multiple partition batches, logging ensures that all DML statements are applied. Either all or none of the batch operations will succeed, ensuring atomicity. Batch isolation occurs only if the batch operation is writing to a single partition.

Only use a multiple partition batch when there is no other viable option, such as asynchronous statements. Multiple partition batches may decrease throughput and increase latency.

Optionally, a batch can apply a client-supplied timestamp. Before implementing or executing a batch see Batching inserts and updates.

Batches are not isolated among client programs. Other client programs can read the first modified rows from the batch while the other remaining statements in the batch are in progress. There is no batch rollback functionality, which means that a batch cannot be undone.

Synopsis

BEGIN [ ( UNLOGGED | LOGGED ) ] BATCH
  [ USING TIMESTAMP [ <epoch_microseconds> ] ]
  <dml_statement> [ USING TIMESTAMP [ <epoch_microseconds> ] ] ;
  [ <dml_statement> [ USING TIMESTAMP [ <epoch_microseconds> ] ] [ ; ... ] ] ;
  APPLY BATCH ;

A batch can contain these <dml_statements>:

  • INSERT

  • UPDATE

  • DELETE

UNLOGGED | COUNTER

If UNLOGGED is not specified, the batch is logged. If multiple partitions are involved, batches are logged by default. A logged batch ensures that all or none of the batch operations succeed (atomicity). First the serialized batch is written to the batchlog system table which consumes the serialized batch as blob data. After a successful write, the rows are persisted (or hinted) and the batchlog data is removed. Logging incurs a performance penalty, the batchlog is written to two other nodes. Options for thresholds, warning about or failure due to batch size, are available.

UNLOGGED runs the batch without logging penalties. Unlogged batching issues a warning when too many operations or too many partitions are involved. Single partition batch operations are unlogged by default, and are the only unlogged batch operations recommended.

Although a logged batch enforces atomicity (that is, it guarantees if all DML statements in the batch succeed or none do), there is no other transactional enforcement at the batch level. For example, there is no batch isolation unless the batch operation is writing to a single partition. In multiple partition batch operations, clients are able to read the first updated rows from the batch, while other rows are still being updated on the server. In single partition batch operations, clients cannot read a partial update from any row until the batch is completed.

Use the COUNTER option for batched counter updates. Unlike other updates, counter updates are not idempotent.

USING TIMESTAMPS

Sets the write time for transactions executed in a BATCH.

Restriction: USING TIMESTAMP does not support LWT (lightweight transactions), such as DML statements that have an IF NOT EXISTS clause.

By default, the database applies the same timestamp to all data modified by the batch; therefore statement order does not matter within a batch, thus a batch statement is not very useful for writing data that must be timestamped in a particular order. Use client-supplied timestamps to achieve a particular order.

User-defined timestamp

Specify the epoch time in microseconds after USING TIMESTAMP [ <epoch_microseconds> ]. When the time is not specified, the database inserts the current time.

To use the same timestamp for all DMLs, insert on first line of batch:
BEGIN BATCH USING TIMESTAMP [ <epoch_microseconds> ]
  DML_statement1 ;
  DML_statement2 ;
  DML_statement3 ;
APPLY BATCH ;

For individual transactions, insert at the end of a DML:

BEGIN BATCH
  <DML_statement1> ;
  <DML_statement2> USING TIMESTAMP [ <epoch_microseconds> ] ;
  <DML_statement3> ;
APPLY BATCH ;

Examples

This section uses the cyclist_expenses and popular_count tables.

Applying a client supplied timestamp to all DMLs

Insert meals paid for Vera Adrian using the user-defined date when inserting the records:

BEGIN BATCH

  INSERT INTO cycling.cyclist_expenses (
    cyclist_name, balance
  ) VALUES (
    'Vera ADRIAN', 0
  ) IF NOT EXISTS;

  INSERT INTO cycling.cyclist_expenses (
    cyclist_name, expense_id, amount, description, paid
  ) VALUES (
    'Vera ADRIAN', 1, 7.95, 'Breakfast', false
  );

APPLY BATCH;
BEGIN BATCH

  INSERT INTO cycling.cyclist_expenses (
    cyclist_name, expense_id, amount, description, paid
  ) VALUES (
    'Vera ADRIAN', 2, 13.44, 'Lunch', true
  );

  INSERT INTO cycling.cyclist_expenses (
    cyclist_name, expense_id, amount, description, paid
  ) VALUES (
    'Vera ADRIAN', 3, 25.00, 'Dinner', false
  );

  UPDATE cycling.cyclist_expenses
  SET balance = -32.95
  WHERE cyclist_name = 'Vera ADRIAN'
  IF balance = -7.95;

APPLY BATCH;
BEGIN BATCH

  UPDATE cycling.cyclist_expenses
  SET balance = 0
  WHERE cyclist_name = 'Vera ADRIAN'
  IF balance = -32.95;

  UPDATE cycling.cyclist_expenses
  SET paid = true
  WHERE cyclist_name = 'Vera ADRIAN'
  AND expense_id = 1 IF paid = false;

  UPDATE cycling.cyclist_expenses
  SET paid = true
  WHERE cyclist_name = 'Vera ADRIAN'
  AND expense_id = 3
  IF paid = false;

APPLY BATCH;
BEGIN BATCH

  INSERT INTO cycling.cyclist_expenses (
    cyclist_name, expense_id, amount, description, paid
  ) VALUES (
    'John SMITH', 3, 15.00, 'Lunch', false
  );

  INSERT INTO cycling.cyclist_name (
    id, lastname, firstname
  ) VALUES (
    6ab09bec-e68e-48d9-a5f8-97e6fb4c9b12, 'SMITH', 'John'
  );

APPLY BATCH;
BEGIN BATCH USING TIMESTAMP 1481124356754405

  INSERT INTO cycling.cyclist_expenses (
    cyclist_name, expense_id, amount, description, paid
  ) VALUES (
    'Vera ADRIAN', 2, 13.44, 'Lunch', true
  );

  INSERT INTO cycling.cyclist_expenses (
    cyclist_name, expense_id, amount, description, paid
  ) VALUES (
    'Vera ADRIAN', 3, 25.00, 'Dinner', true
  );

APPLY BATCH;

Combining two statements for the same partition results in a single table mutation.

View the records vertically in cqlsh:

EXPAND ON

Verify that the timestamps are all the same:

SELECT
  cyclist_name, expense_id,
  amount, WRITETIME(amount),
  description, WRITETIME(description),
  paid, WRITETIME(paid)
FROM cycling.cyclist_expenses
WHERE cyclist_name = 'Vera ADRIAN';

Both records were entered with the same timestamp.

@ Row 1
------------------------+------------------
 cyclist_name           | Vera ADRIAN
 expense_id             | 2
 amount                 | 13.44
 writetime(amount)      | 1481124356754405
 description            | Lunch
 writetime(description) | 1481124356754405
 paid                   | True
 writetime(paid)        | 1481124356754405

@ Row 2
------------------------+------------------
 cyclist_name           | Vera ADRIAN
 expense_id             | 3
 amount                 | 25
 writetime(amount)      | 1481124356754405
 description            | Dinner
 writetime(description) | 1481124356754405
 paid                   | True
 writetime(paid)        | 1481124356754405

(2 rows)

If any DML statement in the batch uses compare-and-set (CAS) logic, an error is returned. For example, the following batch with the CAS IF NOT EXISTS option returns an error:

BEGIN BATCH USING TIMESTAMP 1481124356754405

  INSERT INTO cycling.cyclist_expenses (
    cyclist_name, expense_id, amount, description, paid
  ) VALUES (
    'Vera ADRIAN', 2, 13.44, 'Lunch', true
  );

  INSERT INTO cycling.cyclist_expenses (
    cyclist_name, expense_id, amount, description, paid
  ) VALUES (
    'Vera ADRIAN', 3, 25.00, 'Dinner', false
  ) IF NOT EXISTS;

APPLY BATCH;
InvalidRequest: Error from server: code=2200 [Invalid query]
message="Cannot provide custom timestamp for conditional BATCH"

Batching conditional updates

Batch conditional updates introduced as lightweight transactions. However, a batch containing conditional updates can operate only within a single partition, because the underlying Paxos implementation only works at partition-level granularity. If one statement in a batch is a conditional update, the conditional logic must return true, or the entire batch fails. If the batch contains two or more conditional updates, all the conditions must return true, or the entire batch fails.

The following example shows batching of conditional updates. The first statement uses the IF NOT EXISTS conditional clause.

BEGIN BATCH

  INSERT INTO cycling.cyclist_expenses (
    cyclist_name, expense_id
  ) VALUES (
    'Joe WALLS', 1
  )
  IF NOT EXISTS;

  INSERT INTO cycling.cyclist_expenses (
    cyclist_name, expense_id, amount, description, paid
  ) VALUES (
    'Joe WALLS', 1, 8, 'burrito', false
  );

APPLY BATCH;

Conditional batches cannot provide custom timestamps. UPDATE and DELETE statements within a conditional batch cannot use IN conditions to filter rows.

A continuation of this example shows how to use a static column with conditional updates in batch.

Batching counter updates

A batch of counters should use the COUNTER option because, unlike other writes in DataStax Enterprise, a counter update is not an idempotent operation.

BEGIN COUNTER BATCH

  UPDATE cycling.popular_count
  SET popularity = popularity + 1
  WHERE id = 6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47;

  UPDATE cycling.popular_count
  SET popularity = popularity + 125
  WHERE id = 6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47;

  UPDATE cycling.popular_count
  SET popularity = popularity - 64
  WHERE id = 6ab09bec-e68e-48d9-a5f8-97e6fb4c9b47;

APPLY BATCH;

Counter batches cannot include non-counter columns in the DML statements, just as a non-counter batch cannot include counter columns. Counter batch statements cannot provide custom timestamps.

ALTER USER (Deprecated) CREATE AGGREGATE

General Inquiries: +1 (650) 389-6000 info@datastax.com

© DataStax | Privacy policy | Terms of use

DataStax, Titan, and TitanDB are registered trademarks of DataStax, Inc. and its subsidiaries in the United States and/or other countries.

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries.

Kubernetes is the registered trademark of the Linux Foundation.

landing_page landingpage