API reference overview

The Data API is one way to programmatically interact with Hyper-Converged Database. DataStax provides clients to make it easier to use the Data API. See the extensive examples of methods and commands.

The other method of interacting with Hyper-Converged Database is through the CQL API. CQL-compatible drivers can be used to interact with Hyper-Converged Database databases. See the CQL API reference and the individual driver documentation under the Drivers navigation item.

The Data API and its AI-enabled, multi-region functionality are designed for Hyper-Converged Database databases. The Data API and client apps offer high-demand features such as vector searches with AI projections that return similarity scores. The Data API eliminates the need for complex data modeling and lets you start coding applications quickly. This capability is critical in the rapidly evolving Generative AI (GenAI) field. The Data API leverages the scalability, performance, and real-time indexing capabilities of Hyper-Converged Database and Apache Cassandra®.

Also, the DataStax RAGStack implementation is based on the Data API. RAGStack is a curated stack of the best open-source software for easing implementation of the Retrieval-Augmented Generation (RAG) pattern in production-ready applications using Hyper-Converged Database or Apache Cassandra as a vector store.

When you create apps using the Python, TypeScript, or Java Data API clients, your main entry point is a DataAPIClient object that you instantiate. It sits at the top of the overall coding hierarchy.

The client can spawn specific objects for use in various types of subsequent interactions.

Within each reference topic, task-based sections present per-language examples in adjacent tabs. Depending on a given task’s context, the tabs may include Python, TypeScript, Java, cURL, and CLI.

This overview also covers the Data API naming conventions, data types, limits, and operators.

Prerequisites

Before you use the API, complete the following step:

  • Create an HCD namespace/keyspace and a vector-enabled table.

Data API

Use the Data API to perform actions on namespaces, collections, and documents in Hyper-Converged Database.

See also:

Python client

AstraPy is the official Python client for Hyper-Converged Database. It requires Python 3.8 or higher.

Check out the project on GitHub.

For detailed examples, refer to the Python tab in the following topics:

TypeScript client

astra-db-ts is the official TypeScript client for Hyper-Converged Database. It requires Node.js v16.20.2 or higher. Download and install Node.js.

Check out the project on GitHub.

For detailed examples, refer to the TypeScript tab in the following topics:

Java client

astra-db-java is the official Java client for Hyper-Converged Database. It requires Java 11 or higher.

Check out the project on GitHub.

For detailed examples, refer to the Java tab in the following topics:

Naming conventions

Property names must start and end with a letter or an underscore, and may only contain the following characters:

  • a-z

  • A-Z

  • 0-9

  • _ (underscore)

Names must be between 1 and 48 characters.

The _id property is reserved and interpreted as a document’s identity property.

The dollar sign $ is reserved for system-defined operator and property names. For example, $exists, $and, $or, and $vector.
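The naming rules above can be checked on the client side before a request is sent. The following is a minimal sketch that reads the rules literally (including the requirement that a name ends with a letter or underscore). The function names `is_valid_name` and `is_reserved_name` are hypothetical helpers, not part of any DataStax client.

```python
import re

# Letters, digits, and underscores only. The first and last characters must be
# a letter or an underscore (a one-character name is both first and last).
_NAME_PATTERN = re.compile(r"[A-Za-z_](?:[A-Za-z0-9_]*[A-Za-z_])?")
MAX_NAME_LENGTH = 48  # "Names must be between 1 and 48 characters."


def is_valid_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rules listed above."""
    if not 1 <= len(name) <= MAX_NAME_LENGTH:
        return False
    return _NAME_PATTERN.fullmatch(name) is not None


def is_reserved_name(name: str) -> bool:
    """Return True for `_id` and for any `$`-prefixed system name."""
    return name == "_id" or name.startswith("$")


print(is_valid_name("customer_name"))  # True
print(is_valid_name("first-name"))     # False: hyphen is not allowed
print(is_reserved_name("$vector"))     # True
```

Reserved names are reported separately from invalid ones because `_id` is a legal name that simply carries a special meaning.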

Data types

Supported data types in Data API:

  • String

  • Number

  • Object (JSON object)

  • Array

  • Boolean

  • Vector (via $vector)

  • Date (via $date)

  • Null

  • UUID (via $uuid)

  • ObjectId (via $objectId)

If you’re using a client, consult the appropriate client reference to learn how to work with dates, UUIDs, and ObjectIds.
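When you build raw JSON payloads yourself, values such as dates and UUIDs have no plain-JSON form and are wrapped in the `$date` and `$uuid` markers listed above. The sketch below shows one way to do that wrapping in Python. It assumes that `$date` carries milliseconds since the Unix epoch and that naive datetimes should be treated as UTC; `encode_value` is a hypothetical helper, and the official clients handle this conversion for you.

```python
import datetime
import uuid

_EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def encode_value(value):
    """Recursively wrap dates and UUIDs in their Data API type markers."""
    if isinstance(value, datetime.datetime):
        if value.tzinfo is None:
            # Assumption: naive datetimes are interpreted as UTC.
            value = value.replace(tzinfo=datetime.timezone.utc)
        # Assumption: $date holds integer milliseconds since the Unix epoch.
        millis = (value - _EPOCH) // datetime.timedelta(milliseconds=1)
        return {"$date": millis}
    if isinstance(value, uuid.UUID):
        return {"$uuid": str(value)}
    if isinstance(value, dict):
        return {key: encode_value(item) for key, item in value.items()}
    if isinstance(value, list):
        return [encode_value(item) for item in value]
    return value  # strings, numbers, booleans, and None pass through


document = encode_value({
    "_id": uuid.UUID("123e4567-e89b-12d3-a456-426614174000"),
    "title": "Example",
    "published": datetime.datetime(2024, 1, 1, tzinfo=datetime.timezone.utc),
    "$vector": [0.1, 0.2, 0.3],
})
print(document["published"])  # {'$date': 1704067200000}
```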

Limits

The Data API includes guardrails to ensure best practices, foster availability, and promote optimal configurations for your Hyper-Converged Database databases.

Entity Limit Notes

Number of collections per namespace

5

Up to five collections in an HCD namespace.

Page size

20

A page may contain up to 20 documents. After that per-page maximum is reached, you can load additional documents on the next page by using the nextPageState ID returned in the command’s response.

Sort page size

100

Document page size for sorting; implemented as separate from page size because sort operations need more rows per page.

Maximum property name

100

Maximum of 100 characters in a property name.

Maximum path length

1,000

Maximum of 1,000 characters in a path name; total for all segments, including any dots (.) between properties in a path.

String property maximum bytes

8,000

Maximum of 8,000 UTF-8 bytes for string length in an indexed property.

Number property maximum characters

100

Maximum of 100 characters for number length in a property.

Maximum elements per array

1,000

Maximum number of elements in an array. This limit applies to indexed properties only. This limit is ignored for non-indexed properties.

Maximum dimensions in vector-enabled collection

4,096

Maximum number of dimensions you can define for a vector-enabled collection.

Maximum number of properties per JSON object

1,000

Maximum number of properties for a JSON object. This limit applies to indexed properties only. This limit is ignored for non-indexed properties.

A given JSON object may have nested objects, also known as sub-documents. This maximum total count of 1,000 refers to all the indexed properties in the main document, plus a count of 1 for each sub-document (if any).

Maximum number of properties per JSON document

2,000

Maximum number of properties allowed in a single JSON document is 2,000. This limit includes intermediate properties as well as leaf properties. For example, given this document:

{
  "root": {
    "branch": {
      "leaf": 42
    }
  }
}

For the purposes of the limit, the document has three properties: root, root.branch, and root.branch.leaf.

Maximum document size in characters

4 million

Maximum size of each document in a collection is 4 million characters.

Maximum inserted batch size in characters

20 million

Maximum size of an entire batch of documents submitted via an insertMany or updateMany command is 20 million characters.

Maximum number of documents deleted per transaction

20

Maximum number of documents that can be deleted in each transaction.

Maximum number of documents updated per transaction

20

Maximum number of documents that can be updated in each transaction.

Maximum number of documents inserted per transaction

20

Maximum number of documents that can be inserted in each transaction when using insertMany.

Maximum size of _id values array via $in

100

Maximum size of an _id values array that can be sent via the $in operator.

Maximum number of documents returned with each vector search

1,000

Maximum number of documents returned with each vector search.
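The per-document property count described in the table (intermediate properties count as well as leaf properties) can be estimated locally before you send a document. The following is a minimal sketch: `property_paths` is a hypothetical helper, and its treatment of arrays (elements are not counted as properties, but objects nested inside arrays contribute their own) is an assumption rather than documented server behavior.

```python
MAX_PROPERTIES_PER_DOCUMENT = 2000
MAX_PATH_LENGTH = 1000


def property_paths(value, prefix=""):
    """Return the dotted path of every property, intermediate and leaf alike."""
    paths = []
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.append(path)
            paths.extend(property_paths(child, path))
    elif isinstance(value, list):
        # Assumption: array elements themselves are not counted as properties.
        for item in value:
            paths.extend(property_paths(item, prefix))
    return paths


def within_limits(document: dict) -> bool:
    """Check the property-count and path-length limits from the table above."""
    paths = property_paths(document)
    return (len(paths) <= MAX_PROPERTIES_PER_DOCUMENT
            and all(len(path) <= MAX_PATH_LENGTH for path in paths))


document = {"root": {"branch": {"leaf": 42}}}
print(property_paths(document))  # ['root', 'root.branch', 'root.branch.leaf']
```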

If your code exceeds a limit, the Data API still responds with an HTTP 200 OK status, but the returned JSON differs from the success case. Inspect the returned JSON for any error messages. For example, if you exceed the per-transaction limit of 20 documents in an insertMany command, the Data API responds with this message:

[{"message": "Request invalid, the property postCommand.command.documents not valid: amount of documents to insert is over the max limit (21 vs 20)."}]

A successful response contains a status object such as:

{"status": {"insertedIds": [ ... ] } }
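Because a limit violation still returns HTTP 200, client code typically needs two things: batches that stay within the 20-document insertMany limit, and a check of the response body rather than the status code. The sketch below illustrates both with plain dictionaries. The helper names are hypothetical, and the assumption that error objects arrive in a list under a top-level `errors` key is not shown in the excerpt above.

```python
from typing import Iterator, List

MAX_DOCUMENTS_PER_INSERT = 20  # per-transaction insertMany limit


def insert_many_payloads(documents: List[dict],
                         batch_size: int = MAX_DOCUMENTS_PER_INSERT) -> Iterator[dict]:
    """Yield one insertMany command body per batch of at most `batch_size` documents."""
    for start in range(0, len(documents), batch_size):
        yield {"insertMany": {"documents": documents[start:start + batch_size]}}


def error_messages(response: dict) -> List[str]:
    """Collect error messages from a response body; an empty list means none.

    Assumption: errors are a list of {"message": ...} objects under "errors".
    """
    return [error.get("message", "") for error in response.get("errors", [])]


documents = [{"n": i} for i in range(45)]
sizes = [len(p["insertMany"]["documents"]) for p in insert_many_payloads(documents)]
print(sizes)  # [20, 20, 5]
```

Each yielded payload can then be posted separately, with `error_messages` applied to every response before the next batch is sent.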

Operators

Data API provides a range of query operators that you can use in filters, and update operators that you can use in update clauses.

For examples in Data API request payloads, see the cURL examples in Documents reference. Also see the Data API non-Astra vector collection in Postman.
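To make the semantics of the query operators in this section concrete, here is a small in-memory evaluator for a subset of them ($and, $or, the range and comparison operators, and $exists). It is only an illustration of how a filter reads, not how the server evaluates filters. In particular, treating a missing property as failing every comparison is an assumption of this sketch.

```python
import operator

_COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda actual, expected: actual in expected,
    "$nin": lambda actual, expected: actual not in expected,
}


def _is_operator_clause(condition) -> bool:
    """True for a non-empty dict whose keys are all $-prefixed operators."""
    return (isinstance(condition, dict) and bool(condition)
            and all(name.startswith("$") for name in condition))


def matches(document: dict, filter_: dict) -> bool:
    """Evaluate a small subset of filter operators against one document."""
    for key, condition in filter_.items():
        if key == "$and":
            if not all(matches(document, clause) for clause in condition):
                return False
        elif key == "$or":
            if not any(matches(document, clause) for clause in condition):
                return False
        elif _is_operator_clause(condition):
            for name, expected in condition.items():
                if name == "$exists":
                    if (key in document) != expected:
                        return False
                elif key not in document:
                    return False  # assumption: missing property never matches
                elif not _COMPARATORS[name](document[key], expected):
                    return False
        elif document.get(key) != condition:
            return False  # bare value: implicit $eq
    return True


doc = {"name": "Ada", "age": 36, "city": "London"}
query = {"$and": [
    {"age": {"$gte": 30, "$lt": 40}},
    {"$or": [{"city": "Paris"}, {"city": {"$in": ["London", "Rome"]}}]},
]}
print(matches(doc, query))  # True
```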

Operator type Name Purpose

Logical query

$and

Joins query clauses with a logical AND, returning the documents that match the conditions of both clauses.

$or

Joins query clauses with a logical OR, returning the documents that match the conditions of either clause.

$not

Returns documents that do not match the conditions of the filter clause.

Range query

$gt

Matches documents where the given property is greater than the specified value.

$gte

Matches documents where the given property is greater than or equal to the specified value.

$lt

Matches documents where the given property is less than the specified value.

$lte

Matches documents where the given property is less than or equal to the specified value.

Comparison query

$eq

Matches documents where the value of a property equals the specified value. This is the default when you do not specify an operator.

$ne

Matches documents where the value of a property does not equal the specified value.

$in

Matches any of the values specified in the array.

$nin

Matches any of the values that are NOT IN the array.

Element query

$exists

Matches documents that have the specified property.

Array query

$all

Matches arrays that contain all elements in the specified array.

$size

Selects documents where the array has the specified number of elements.

Property update

$currentDate

Used in an update operation. In the following example, the createdAt property is updated to use the current date:

{
  "findOneAndUpdate": {
    "filter": {"_id": "doc1"},
    "update": {
      "$currentDate": {
        "createdAt": true
      }
    }
  }
}

$inc

Increments the value of the property by the specified amount.

$min

Updates the property only if the specified value is less than the existing property value.

$max

Updates the property only if the specified value is greater than the existing property value.

$mul

Multiplies the value of a property in the document by the specified amount. Example:

{
    "findOneAndUpdate": {
        "filter": {
            "_id": "upsert-id"
        },
        "update": {
            "$currentDate": {
                "field": true
            },
            "$mul": {
                "min_col": 5.2
            }
        },
        "options": {
            "returnDocument": "after"
        }
    }
}

$rename

Renames the specified property in each matching document.

$set

Sets the value of a property in each matching document.

$setOnInsert

Sets the value of a property in the document only if an upsert is performed. Example:

{
    "findOneAndUpdate": {
        "filter": {
            "_id": "upsert-id"
        },
        "update": {
            "$currentDate": {
                "field": true
            },
            "$setOnInsert": {
                "customer.name": "James B."
            }
        },
        "options": {
            "returnDocument": "after"
        }
    }
}

$unset

Removes the specified property from each matching document.

Array update

$addToSet

Adds elements to the array only if they do not already exist in the array.

$pop

Removes the first or last item of the array, depending on the value of the operator (-1 to remove the first item; 1 to remove the last item).

$push

Appends the specified item to the end of the array. If the property value is not yet an array:

  • If the property has no value, creates a one-element array containing the specified item.

  • If the property has a non-array value, creates a two-element array, with the old value as the first entry and the specified item as the second entry.

$each

An array update that modifies the $push and $addToSet operators to append multiple items for array updates.

$position

An array update that modifies the $push operator to specify the position in the array to add elements.
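The array update rules above ($push with its non-array fallbacks, the $each and $position modifiers, and $pop) can be illustrated by applying them to an in-memory dictionary. This is a local sketch of the described behavior under stated assumptions, not the server implementation. `apply_push` and `apply_pop` are hypothetical helpers, and $position is assumed to be used together with $each.

```python
def apply_push(document: dict, prop: str, spec) -> None:
    """Apply $push to one property, honoring optional $each and $position."""
    if isinstance(spec, dict) and "$each" in spec:
        items = list(spec["$each"])
        position = spec.get("$position")
    else:
        items, position = [spec], None

    if prop not in document:
        current = []                    # no value: start from an empty array
    elif isinstance(document[prop], list):
        current = list(document[prop])
    else:
        current = [document[prop]]      # non-array value becomes the first entry

    if position is None:
        current.extend(items)           # default: append at the end
    else:
        current[position:position] = items
    document[prop] = current


def apply_pop(document: dict, prop: str, direction: int) -> None:
    """Apply $pop: -1 removes the first item, 1 removes the last item."""
    items = document.get(prop)
    if isinstance(items, list) and items:
        document[prop] = items[1:] if direction == -1 else items[:-1]


doc = {"tags": ["a"], "note": "x"}
apply_push(doc, "tags", "b")                                    # ['a', 'b']
apply_push(doc, "tags", {"$each": ["c", "d"], "$position": 0})  # ['c', 'd', 'a', 'b']
apply_push(doc, "note", "y")                                    # ['x', 'y']
apply_push(doc, "scores", 1)                                    # [1]
apply_pop(doc, "tags", -1)                                      # ['d', 'a', 'b']
print(doc["tags"])  # ['d', 'a', 'b']
```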

