Intro to Astra DB APIs

The Data API and DevOps API are the official ways to programmatically interact with Astra DB Serverless.

Throughout the Astra DB Serverless documentation, you can find instructions that use the DevOps API, the Data API clients (Python, TypeScript, and Java), as well as HTTP, Astra CLI, and the Astra Portal. The examples provided depend on the task described.

In addition to the method and command examples provided in the Astra DB Serverless documentation, you can also find links to generated reference documentation for the clients and the APIs.

Prerequisites

To use the Astra DB APIs, you need the following:

Export environment variables

As you become familiar with the Astra DB APIs, you will notice common values that are frequently reused, such as application tokens, database API endpoints, database IDs, keyspace names, and collection names. To learn how to get application tokens and database API endpoints, see Manage application tokens.

Store these values in environment variables to facilitate reuse in your scripts, simplify token rotation, and increase security by keeping sensitive information separate from your core application code. Additionally, DataStax recommends that you follow industry best practices when storing sensitive values, such as tokens.

You can use any method you prefer to set environment variables. The following code samples are for example purposes only.

  • Linux or macOS

    export ASTRA_DB_API_ENDPOINT=API_ENDPOINT
    export ASTRA_DB_APPLICATION_TOKEN=TOKEN

  • Windows

    set ASTRA_DB_API_ENDPOINT=API_ENDPOINT
    set ASTRA_DB_APPLICATION_TOKEN=TOKEN

For information about storing and using environment variables in scripts, see the documentation for your preferred language or operating system.
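As one illustration in Python, the following sketch reads the two variables exported above and fails early with a clear message if either one is missing. The `load_astra_settings` helper and the `AstraSettings` type are hypothetical names introduced for this example; they are not part of any Astra DB client.

```python
import os
from typing import Mapping, NamedTuple, Optional


class AstraSettings(NamedTuple):
    """Connection values read from the environment (hypothetical helper type)."""
    api_endpoint: str
    application_token: str


def load_astra_settings(env: Optional[Mapping[str, str]] = None) -> AstraSettings:
    """Return the endpoint and token, raising if either is unset or empty."""
    env = os.environ if env is None else env
    names = ("ASTRA_DB_API_ENDPOINT", "ASTRA_DB_APPLICATION_TOKEN")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return AstraSettings(env[names[0]], env[names[1]])


# Example with an explicit mapping. Call load_astra_settings() with no
# arguments to read the real process environment instead.
settings = load_astra_settings(
    {"ASTRA_DB_API_ENDPOINT": "API_ENDPOINT", "ASTRA_DB_APPLICATION_TOKEN": "TOKEN"}
)
print(settings.api_endpoint)
```

Passing the mapping as a parameter keeps the helper easy to test and avoids printing or hard-coding the token anywhere in application code.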

Optional curl arguments

The Data API and DevOps API curl examples throughout the Astra DB Serverless documentation include some optional arguments, such as -s, -sS, and | jq. You can omit or modify these arguments as needed.

Filepaths assume *nix

Throughout the Astra DB Serverless documentation, filepaths assume a *nix environment.

If you use Microsoft Windows, you might need to adjust the filepaths given in the examples.

DevOps API

Use the DevOps API to perform lifecycle actions on organizations and databases in Astra DB Serverless.

For more information, see Get started with the Astra DevOps API.

Data API

Use the Data API to perform actions on Astra DB Serverless (Vector) databases and the collections and data (documents) within those collections.

The Data API is a schema-less, document-based, modern API that provides easy and intuitive access to structured vector data in your Serverless (Vector) databases. It leverages the scalability, performance, and real-time indexing capabilities of Apache Cassandra® to support GenAI application development. To get the most relevant results possible, you can execute vector search queries, apply complex document filtering, or both.

The Data API is an entry point for Astra DB Serverless to integrate with the GenAI ecosystem, which includes tools like LangChain, LlamaIndex, and a variety of embedding providers.

Use the Data API and clients to create applications that interact with your Serverless (Vector) databases, including a variety of query and update operators to filter documents and sort response data, as well as vector search commands that return similarity scores.
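To make the combination of filtering and vector search more concrete, here is a minimal Python sketch that only builds the JSON body of a `find` command; it does not send anything. The overall command shape and the `includeSimilarity` option name are assumptions based on the Data API's JSON command format, and the `genre` and `year` properties, the query vector, and the `build_vector_find` helper are invented for illustration. Check the Data API reference before relying on any of them.

```python
import json
from typing import Any, Dict, List


def build_vector_find(query_vector: List[float], limit: int = 5) -> Dict[str, Any]:
    """Build a find command that filters documents and sorts by vector similarity."""
    return {
        "find": {
            # Filter operators: both conditions must hold for a document to match.
            "filter": {"$and": [{"genre": "sci-fi"}, {"year": {"$exists": True}}]},
            # Sorting on $vector requests a vector search against query_vector.
            "sort": {"$vector": query_vector},
            # Assumed option name: ask for a similarity score with each result.
            "options": {"limit": limit, "includeSimilarity": True},
        }
    }


command = build_vector_find([0.12, 0.52, 0.32])
print(json.dumps(command, indent=2))
```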

For more information, see the following:

Data API clients

You can interact with the Data API directly through HTTP or use one of the Astra DB Data API clients for Python, TypeScript, or Java:

Language    Client         Version         Dependency           Documentation
Python      astrapy        Latest release  Python 3.8 or later  Get started with the Data API
TypeScript  astra-db-ts    Latest release  Node.js 18 or later  Get started with the Data API
Java        astra-db-java  Latest release  Java 11 or later     Get started with the Data API

When you create apps using the Data API Python, TypeScript, and Java clients, your main entry point is to instantiate a DataAPIClient object. It’s the conceptual start of the overall coding hierarchy:

Conceptually separate from the coding hierarchy are the administration objects you use for database administration:

Clients can spawn specific objects for use in subsequent interactions.

For more information, see the following:

Naming conventions

Astra DB has the following naming conventions for databases, keyspaces, collections, tables, and vectorize API keys:

  • Must start and end with a letter or number.

  • Can contain uppercase letters A-Z, lowercase letters a-z, numbers 0-9, and underscores (_). Some components allow additional special characters.

  • Must contain at least two characters.

  • Can’t exceed the maximum character limit for the entity type:

    • Keyspaces: 48 characters.

    • Databases, collections, tables, and vectorize API keys: 50 characters.

Astra DB APIs use the term keyspace to refer to both namespaces and keyspaces.
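The entity naming rules above can be checked locally before you send a request. The following Python sketch is one way to do that; `is_valid_name` and the entity type labels are hypothetical names for this example. It covers only the base character set, so it rejects the additional special characters that some components allow.

```python
import re

# Letters, digits, and underscores; the first and last character must be a
# letter or digit. The pattern cannot match fewer than two characters.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9_]*[A-Za-z0-9]")

# Maximum lengths by entity type, taken from the list above.
_MAX_LENGTH = {
    "keyspace": 48,
    "database": 50,
    "collection": 50,
    "table": 50,
    "vectorize_api_key": 50,
}


def is_valid_name(name: str, entity_type: str) -> bool:
    """Return True if name satisfies the base naming rules for entity_type."""
    if len(name) > _MAX_LENGTH[entity_type]:
        return False
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_name("my_collection", "collection"))  # True
print(is_valid_name("_drafts", "collection"))        # False: starts with an underscore
```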

The Data API has the following naming conventions for document properties:

  • Must start and end with a letter or an underscore (_).

  • Can contain uppercase letters A-Z, lowercase letters a-z, numbers 0-9, and underscores (_).

  • Must contain at least one character.

  • Can’t exceed 48 characters.

  • Can’t be exactly _id, which is reserved and interpreted as a document’s identity property.

The dollar sign ($) is reserved for system-defined operator and property names, such as $exists, $and, $or, and $vector.
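A similar local check is possible for document property names. This Python sketch applies the rules above exactly as written, which means a name ending in a digit, such as `item2`, is treated as invalid; confirm that reading against the Data API reference if it matters for your data. The `is_valid_property_name` helper is a hypothetical name for this example.

```python
import re

# First and last character: a letter or underscore. Digits may appear only in
# between. A single letter or underscore also matches.
_PROPERTY_PATTERN = re.compile(r"[A-Za-z_](?:[A-Za-z0-9_]*[A-Za-z_])?")
_MAX_PROPERTY_LENGTH = 48


def is_valid_property_name(name: str) -> bool:
    """Return True if name follows the property naming rules listed above."""
    if name == "_id":
        return False  # reserved for the document's identity property
    if len(name) > _MAX_PROPERTY_LENGTH:
        return False
    # Names containing "$" never match, which keeps reserved operator and
    # system property names such as $vector out of user-defined properties.
    return _PROPERTY_PATTERN.fullmatch(name) is not None


print(is_valid_property_name("title"))    # True
print(is_valid_property_name("$vector"))  # False
```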

Data types

The Data API supports the following data types:

  • String

  • Number

  • Object (JSON object)

  • Array

  • Boolean

  • Vector (through $vector)

  • Date (through $date)

  • Null

  • UUID (through $uuid)

  • ObjectId (through $objectId)

If you’re using a Data API client, consult the client reference for details on working with dates, UUIDs, and ObjectIds.
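For direct HTTP use, the sketch below builds one JSON document that touches each of the types listed above. The wrapper encodings are assumptions not stated on this page: `$date` is taken to hold milliseconds since the Unix epoch, and `$objectId` a 24-character hexadecimal string. The helper names and the sample values are also invented for illustration.

```python
import json
import uuid
from datetime import datetime, timedelta, timezone
from typing import Any, Dict

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def as_date(moment: datetime) -> Dict[str, int]:
    """Wrap a timezone-aware datetime as $date (assumed: epoch milliseconds)."""
    return {"$date": (moment - _EPOCH) // timedelta(milliseconds=1)}


def as_uuid(value: uuid.UUID) -> Dict[str, str]:
    """Wrap a UUID as $uuid using its canonical string form."""
    return {"$uuid": str(value)}


def as_object_id(hex_string: str) -> Dict[str, str]:
    """Wrap an ObjectId as $objectId (assumed: 24 hexadecimal characters)."""
    if len(hex_string) != 24 or any(c not in "0123456789abcdefABCDEF" for c in hex_string):
        raise ValueError("expected 24 hexadecimal characters")
    return {"$objectId": hex_string}


document: Dict[str, Any] = {
    "title": "Example book",                  # String
    "pages": 312,                             # Number
    "author": {"name": "A. Writer"},          # Object
    "tags": ["fiction", "classic"],           # Array
    "in_print": True,                         # Boolean
    "subtitle": None,                         # Null
    "$vector": [0.12, 0.52, 0.32],            # Vector
    "published": as_date(datetime(2024, 6, 24, tzinfo=timezone.utc)),           # Date
    "edition_id": as_uuid(uuid.UUID("12345678-1234-5678-1234-567812345678")),   # UUID
    "legacy_id": as_object_id("6672e1cbd7fabb4e5493916f"),                      # ObjectId
}

print(json.dumps(document, indent=2))
```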

Limits

The Data API includes guardrails to ensure best practices, foster availability, and promote optimal configurations for your Astra DB Serverless databases.

Entity Limit Notes

Number of collections per database

5 or 10

Serverless (Vector) databases created after June 24, 2024 can have up to 10 collections. Databases created before this date can have up to 5 collections. The collection limit is based on Storage Attached Indexing (SAI). For more information, see The indexing option.

Page size

20

For certain operations, a page may contain up to 20 documents. After that per-page maximum is reached, you can load any additional documents on the next page:

  • For clients, you must iterate over a cursor.

  • For HTTP, you must use the nextPageState ID returned by paginated Data API responses.
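The HTTP case can be sketched in Python as a loop that keeps requesting pages until no nextPageState is returned. The `send` parameter stands in for your own HTTP code and is not a real client API. Passing the previous state back as `options.pageState`, and reading results from `data.documents`, are assumptions about the request and response shape; `make_fake_send` is an in-memory stand-in used only to exercise the loop.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Stand-in for your own HTTP code: takes one command, returns the parsed JSON response.
SendCommand = Callable[[Dict[str, Any]], Dict[str, Any]]


def find_all(send: SendCommand, filter_: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every matching document, requesting pages until nextPageState is absent."""
    page_state: Optional[str] = None
    while True:
        # Assumed request shape: the previous nextPageState goes in options.pageState.
        options = {"pageState": page_state} if page_state else {}
        response = send({"find": {"filter": filter_, "options": options}})
        data = response.get("data", {})
        yield from data.get("documents", [])
        page_state = data.get("nextPageState")
        if not page_state:
            return


def make_fake_send(total: int, page_size: int = 20) -> SendCommand:
    """In-memory stand-in that serves `total` documents, `page_size` per page."""
    def send(command: Dict[str, Any]) -> Dict[str, Any]:
        start = int(command["find"]["options"].get("pageState", 0))
        end = min(start + page_size, total)
        documents = [{"_id": i} for i in range(start, end)]
        next_state = str(end) if end < total else None
        return {"data": {"documents": documents, "nextPageState": next_state}}
    return send


documents = list(find_all(make_fake_send(total=45), {}))
print(len(documents))  # 45 documents collected across three pages
```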

Some operations, such as deleteMany and vector ANN search, don’t return a cursor or nextPageState. For vector ANN search, the response is a single page of up to 1000 documents, unless you set a lower limit. For deleteMany, clients automatically issue multiple HTTP requests until all matching documents are deleted. HTTP requests delete 20 documents per request without returning a nextPageState. Repeat the HTTP request until the response indicates that fewer than 20 documents were deleted.
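The repeat-until-done pattern for deleteMany over HTTP might look like the following Python sketch. As before, `send` stands in for your own HTTP code. Reading the count from `status.deletedCount` is an assumption about the response shape, and `FakeCollection` exists only to exercise the loop without a database.

```python
from typing import Any, Callable, Dict

# Stand-in for your own HTTP code: takes one command, returns the parsed JSON response.
SendCommand = Callable[[Dict[str, Any]], Dict[str, Any]]

DELETE_BATCH_SIZE = 20  # documents deleted per HTTP request, per the note above


def delete_all(send: SendCommand, filter_: Dict[str, Any]) -> int:
    """Repeat deleteMany until a response reports fewer than 20 deletions."""
    total_deleted = 0
    while True:
        response = send({"deleteMany": {"filter": filter_}})
        # Assumed response shape: the count is reported under status.deletedCount.
        deleted = response.get("status", {}).get("deletedCount", 0)
        total_deleted += deleted
        if deleted < DELETE_BATCH_SIZE:
            return total_deleted


class FakeCollection:
    """In-memory stand-in that deletes at most 20 documents per request."""

    def __init__(self, count: int) -> None:
        self.remaining = count
        self.requests = 0

    def send(self, command: Dict[str, Any]) -> Dict[str, Any]:
        self.requests += 1
        deleted = min(self.remaining, DELETE_BATCH_SIZE)
        self.remaining -= deleted
        return {"status": {"deletedCount": deleted}}


fake = FakeCollection(45)
print(delete_all(fake.send, {"status": "archived"}), fake.requests)  # prints: 45 3
```

When the number of matching documents is an exact multiple of 20, the loop issues one extra request that deletes nothing, which is how it learns that the work is finished.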

Sort page size

100

Page size, in documents, for sort operations. This limit is separate from the standard page size because sort operations need more rows per page.

Maximum property name length

100

Maximum of 100 characters in a document property name.

Maximum path length

1,000

Maximum of 1,000 characters in a path name; total for all segments, including any dots (.) between properties in a path.

Maximum indexed string property size in bytes

8,000

Maximum of 8,000 bytes (UTF-8 encoded) for string length in an indexed property. The Data API uses UTF-8 encoding regardless of the original encoding in the request.
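Because this limit counts UTF-8 bytes rather than characters, a string can exceed it with far fewer than 8,000 characters. A small Python sketch of a pre-flight check follows; `fits_indexed_string_limit` is a hypothetical helper name.

```python
MAX_INDEXED_STRING_BYTES = 8000


def fits_indexed_string_limit(value: str) -> bool:
    """Return True if value is at most 8,000 bytes when encoded as UTF-8."""
    return len(value.encode("utf-8")) <= MAX_INDEXED_STRING_BYTES


ascii_text = "a" * 8000          # 8,000 characters, 1 byte each
accented_text = "\u00e9" * 4001  # 4,001 characters, 2 bytes each

print(fits_indexed_string_limit(ascii_text))     # True
print(fits_indexed_string_limit(accented_text))  # False
```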

Maximum number property length

100

Maximum of 100 characters for the length of a number type property.

Maximum elements per array

1,000

Maximum number of elements in an array. This limit applies to indexed properties only.

Maximum dimensions in vector-enabled collection

4,096

Maximum number of dimensions you can define for a vector-enabled collection.

Maximum number of properties per JSON object

1,000

Maximum number of properties for a JSON object. This limit applies to indexed properties only.

A given JSON object may have nested objects, also known as sub-documents. This maximum total count of 1,000 refers to all the indexed properties in the main document, plus a count of 1 for each sub-document (if any).

Maximum number of properties per JSON document

2,000

Maximum number of properties allowed in a single JSON document is 2,000.

This limit includes intermediate properties as well as leaf properties. For example, the following document has three properties that apply to this limit: root, root.branch, and root.branch.leaf.

{
  "root": {
    "branch": {
      "leaf": 42
    }
  }
}
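The counting rule can be sketched in Python as a recursive walk that adds one for every property at every nesting level. How the Data API counts objects nested inside arrays is not stated here, so descending into array elements without counting the elements themselves is an assumption of this sketch, and `count_properties` is a hypothetical helper name.

```python
from typing import Any


def count_properties(value: Any) -> int:
    """Count intermediate and leaf properties in a JSON-like value."""
    if isinstance(value, dict):
        # Each key counts once, plus whatever is nested beneath it.
        return sum(1 + count_properties(child) for child in value.values())
    if isinstance(value, list):
        # Assumption: array elements are not properties, but objects inside them are walked.
        return sum(count_properties(item) for item in value)
    return 0


document = {"root": {"branch": {"leaf": 42}}}
print(count_properties(document))  # 3: root, root.branch, root.branch.leaf
```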

Maximum document size in characters

4 million

Maximum size of each document in a collection is 4 million characters.

Maximum inserted batch size in characters

20 million

Maximum size of an entire batch of documents submitted via an insertMany or updateMany command is 20 million characters.

Maximum number of documents deleted per transaction

20

Maximum number of documents that can be deleted in each transaction.

Maximum number of documents updated per transaction

20

Maximum number of documents that can be updated in each transaction.

Maximum number of documents inserted per transaction

100

Maximum number of documents that can be inserted in each transaction when using insertMany.
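When you have more than 100 documents to insert over HTTP, one approach is to split them into several insertMany commands. The Python sketch below builds the command bodies only, using a hypothetical `insert_many_batches` helper. It does not check the separate 20 million character batch limit, so very large documents may need smaller batches.

```python
from typing import Any, Dict, Iterator, List

MAX_DOCUMENTS_PER_INSERT = 100


def insert_many_batches(
    documents: List[Dict[str, Any]],
    batch_size: int = MAX_DOCUMENTS_PER_INSERT,
) -> Iterator[Dict[str, Any]]:
    """Yield insertMany commands that each carry at most batch_size documents."""
    if not 1 <= batch_size <= MAX_DOCUMENTS_PER_INSERT:
        raise ValueError("batch_size must be between 1 and 100")
    for start in range(0, len(documents), batch_size):
        yield {"insertMany": {"documents": documents[start:start + batch_size]}}


docs = [{"n": i} for i in range(250)]
sizes = [len(cmd["insertMany"]["documents"]) for cmd in insert_many_batches(docs)]
print(sizes)  # [100, 100, 50]
```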

Maximum size _id values array via $in

100

Maximum size of an _id values array that can be sent via the $in operator.

Maximum number of documents returned with each vector search

1,000

Maximum number of documents returned with each vector search.

Exceeded limit returns 200 OK with error

If your request is valid but the command exceeds a limit, the Data API responds with HTTP 200 OK and an error message.

It is also possible to receive a response containing both data and errors. Always inspect the response for error messages.

For example, if you exceed the per-transaction limit of 100 documents in an insertMany command, the Data API response contains the following message:

{
  "errors": [
    {
      "message": "Request invalid: field 'command.documents' value \"[...]\" not valid. Problem: amount of documents to insert is over the max limit (101 vs 100).",
      "errorCode": "COMMAND_FIELD_INVALID"
    }
  ]
}
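Because the HTTP status alone does not signal these failures, code that calls the Data API over HTTP needs to look inside the response body. The following Python sketch shows one way to do that with a hypothetical `extract_errors` helper, applied to the example response above. It returns the errors instead of raising so that callers can still use any data that arrived alongside them.

```python
from typing import Any, Dict, List, Tuple


def extract_errors(response: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (errorCode, message) pairs from a parsed Data API response."""
    return [
        (error.get("errorCode", ""), error.get("message", ""))
        for error in response.get("errors") or []
    ]


# The example response shown above, as a parsed JSON object.
response = {
    "errors": [
        {
            "message": "Request invalid: field 'command.documents' value \"[...]\" "
                       "not valid. Problem: amount of documents to insert is over "
                       "the max limit (101 vs 100).",
            "errorCode": "COMMAND_FIELD_INVALID",
        }
    ]
}

for code, message in extract_errors(response):
    print(code + ": " + message)
```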


© 2024 DataStax | Privacy policy | Terms of use
