Work with tables

This Astra DB Serverless feature is currently in public preview. Development is ongoing, and the features and functionality are subject to change. This feature, and your use of it, is subject to the DataStax Preview Terms.

The Data API tables commands are available through HTTP and the clients.

If you use a client, tables commands are available only in client versions 2.0-preview or later. For more information, see Data API client upgrade guide.

You can use the Data API to create, read, write, and drop tables containing structured data in Serverless (Vector) databases.

Tables are best for use cases where your data is structured and adheres to a pre-defined schema, known as a fixed schema. The database enforces schema compliance when you add or update data.

Whereas collections automatically index all fields, tables don’t have comprehensive automatic indexing. To optimize query performance, you must manually create and manage indexes for your table data. For more information about the role of indexes in tables, see Work with table indexes.

If your data doesn’t have a consistent, enforceable structure or you don’t want to manually manage indexes, consider using dynamic schema collections instead.

In Astra DB, tables reside inside keyspaces in databases. For information about managing databases and keyspaces programmatically, see Databases reference.

With the Data API clients, you use the Database class to manage tables themselves. Then, you use the Table class to work with table data.

For information about working with existing CQL tables through the Data API, including the CQL commands and types that the Data API supports, see Migrate from CQL.

Primary keys

The primary key is the unique identifier for rows in a table.

When you create a table, you define a primary key schema consisting of one or more columns. Columns defined in the primary key are automatically indexed and available for querying. For more information about the role of indexes in tables, see Work with table indexes.

You cannot use map, list, or set columns in primary keys.

Due to a known issue with filtering on blob columns, DataStax does not recommend using blob columns in primary keys.

There are three types of primary keys that you can define. The type of key you use depends on your data model and the types of queries you plan to run. The format of the primaryKey in your table definition depends on the primary key type.

  • Single-column primary key

  • Composite primary key

  • Compound primary key

A single-column primary key is a primary key consisting of one column.

This option is best for use cases where you usually retrieve rows by a single value. For example, you could use this strategy for a small customer database where every customer is uniquely identified by their email address, and you always look up customers by their email address.

To define a single-column primary key, set primaryKey to the column name as a string:

"primaryKey": "COLUMN_NAME"

For client-specific representations and more examples, see Create a table.
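To illustrate, the following sketch builds a complete createTable request body for the customer-database example above. The table and column names are hypothetical, and the payload structure is based on the primaryKey fragment shown here; see Create a table for the authoritative command format.

```python
import json

# Hypothetical createTable request body for a customer table where
# each row is uniquely identified by the customer's email address.
create_table_body = {
    "createTable": {
        "name": "customers",
        "definition": {
            "columns": {
                "email": {"type": "text"},
                "name": {"type": "text"},
                "country": {"type": "text"},
            },
            # Single-column primary key: the column name as a string.
            "primaryKey": "email",
        },
    }
}

print(json.dumps(create_table_body, indent=2))
```

Because the primary key is a single column, every row must have a unique, non-null email value, and lookups by email are automatically indexed.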

A composite primary key is a primary key consisting of multiple columns. The rows are uniquely identified by the combination of the values from each column.

This strategy can make queries more efficient by creating partitions (groups) of rows based on the primary key columns. For example, if your primary key includes country and city, the database groups rows that share the same combination of country and city values, making it more efficient to retrieve rows for a specific country and city.

This is a moderately complex strategy that allows for more nuanced queries and more complex unique identifiers. It can be useful if your rows are uniquely defined by values from multiple columns or your data falls into natural groupings, such as location or time. For example, you could use this strategy for scenarios such as the following:

  • A manufacturing database that uniquely identifies products by the production date, factory location, and SKU

  • A global customer database that groups customers by country or locality, in addition to an identifier, such as customer ID or email address

For composite primary keys, avoid columns with low cardinality (low diversity of values). For example, a customer database with an overabundance of customers from a single country might not benefit from partitioning by country. Instead, you could use locality identifiers, such as states or postal codes, to break large customer segments into smaller groups for more efficient queries.

To define a composite primary key, set primaryKey to an object with a partitionBy array that lists the names of the columns to use for partitioning:

"primaryKey": {
  "partitionBy": [
    "COLUMN_NAME", "COLUMN_NAME"
  ]
}

For client-specific representations and more examples, see Create a table.
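As a sketch of the global customer example above, the following builds a createTable request body that partitions rows by country and customer ID. The table and column names are hypothetical, and the payload structure follows the primaryKey fragment shown here; see Create a table for the authoritative command format.

```python
import json

# Hypothetical createTable body for a global customer table.
# Rows are uniquely identified by the combination of country
# and customer_id (composite primary key).
create_table_body = {
    "createTable": {
        "name": "customers_by_region",
        "definition": {
            "columns": {
                "country": {"type": "text"},
                "customer_id": {"type": "text"},
                "name": {"type": "text"},
            },
            # Composite primary key: all listed columns partition the data.
            "primaryKey": {
                "partitionBy": ["country", "customer_id"],
            },
        },
    }
}

print(json.dumps(create_table_body, indent=2))
```

With this definition, two rows can share the same country as long as their customer_id values differ, because uniqueness applies to the combination of both columns.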

A compound primary key is a primary key consisting of partition (grouping) columns and clustering (sorting) columns. The rows are uniquely identified by the combination of the values from each column.

This is the most complex partitioning strategy, but it can provide the most flexibility and efficiency for querying data, if it is appropriate for your data model. This strategy can be useful for scenarios where you need to perform range queries or sort time-series data.

For example, assume you have a retail database where each row represents an order, and the orders are partitioned by customer ID and clustered by purchase date. In this case, the database implicitly groups each customer’s order together, and then sorts each customer’s orders by purchase date. This can make it more efficient to retrieve a customer’s most recent orders when they contact customer service or when they check the status of their orders in their account.

You can have multiple partition columns and multiple clustering columns. When clustering on multiple columns, the order in which you declare them matters. For example, if you cluster by date and then name, the data is first sorted by date, and rows with the same date are then sorted by name.

In compound primary keys, depending on your data model, avoid clustering columns with very high cardinality (high diversity of values). For example, a purchase number column might not be ideal for clustering because it is unlikely to contain duplicates. Instead, choose clustering columns with moderate cardinality, such as purchase date, while also avoiding columns with extremely low cardinality, such as booleans.

To define a compound primary key, set the primaryKey value to an object containing a partitionBy array and a partitionSort object:

  • partitionBy is an array that contains the names of one or more columns to use for partitioning.

  • partitionSort is an object that contains key-value pairs defining clustering columns and the desired sort behavior. -1 indicates descending sorting and 1 indicates ascending sorting.

    If partitionSort has multiple columns, sorting occurs in the order the columns are defined in the partitionSort object.

partitionBy and partitionSort must use different columns.

"primaryKey": {
  "partitionBy": [
    "PARTITION_COLUMN_NAME",
    "PARTITION_COLUMN_NAME"
  ],
  "partitionSort": {
    "SORT_COLUMN_NAME": -1,
    "SORT_COLUMN_NAME": 1
  }
}

For client-specific representations and more examples, see Create a table.
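The retail example above can be sketched as the following createTable request body, which partitions orders by customer and sorts each customer's orders by purchase date, newest first. The table and column names are hypothetical, and the payload structure follows the primaryKey fragment shown here; see Create a table for the authoritative command format.

```python
import json

# Hypothetical createTable body for an orders table.
# Orders are grouped (partitioned) by customer_id, then sorted
# within each partition by purchase_date descending (-1), with
# order_id ascending (1) as a tiebreaker for same-day orders.
create_table_body = {
    "createTable": {
        "name": "orders",
        "definition": {
            "columns": {
                "customer_id": {"type": "text"},
                "purchase_date": {"type": "timestamp"},
                "order_id": {"type": "text"},
                "total": {"type": "decimal"},
            },
            "primaryKey": {
                "partitionBy": ["customer_id"],
                "partitionSort": {"purchase_date": -1, "order_id": 1},
            },
        },
    }
}

print(json.dumps(create_table_body, indent=2))
```

Because purchase_date is a descending clustering column, a query for one customer's partition can return their most recent orders first without an additional sort.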
