Data API client upgrade guide

DataStax recommends using the latest versions of the clients to access the newest features, improvements, and bug fixes.

This page describes major changes in specific client versions, including major new features, deprecations, removals, and breaking changes. This page is not a changelog and it does not provide client release notes.

For information about the latest client versions, release notes, installation and upgrade instructions, and other client documentation, see the links in the following table:

| Language | Client | Version | Dependency | Documentation |
| --- | --- | --- | --- | --- |
| Python | astrapy | Latest astrapy release on GitHub | Python 3.8 or later | Get started with the Data API |
| TypeScript | astra-db-ts | Latest astra-db-ts release on GitHub | Node.js 18 or later | Get started with the Data API |
| Java | astra-db-java | Latest astra-db-java release on Maven Central | Java 17 or later (21 recommended) | Get started with the Data API |

Version 2.0-preview

Client version 2.0-preview is a public preview release. Development is ongoing, and the features and functionality are subject to change. Astra DB Serverless, and the use of such, is subject to the DataStax Preview Terms.

DataStax released clients version 2.0-preview in December 2024 to accompany Data API version 1.0.20.

To install this preview release for the Python and TypeScript clients, you must pass the --pre flag (Python) or the @next tag (TypeScript) to your package manager. For more information and examples, see Upgrade a client.

Java 17 or later required (Java only)

The Java client now requires Java 17 or later. DataStax recommends Java 21.

New tables methods

Version 2.0-preview of the Data API clients adds support for working with tables in your Serverless (Vector) databases.

Along with this change, this preview release includes new features that make it easier to work with vector data, including support for passing binary-encoded vectors and the new DataAPIVector object. For more information, see Work with rows: Vector type.

Breaking change to create collection signature (Python and Java only)

Version 2.0-preview of the Python and Java Data API clients brings significant changes to the signature of the create collection method.

After you upgrade to version 2.0-preview or later, change your code to use the new create collection signature:

  • Python

  • TypeScript

  • Java

  • curl

In version 2.0-preview, the Python client's create_collection method no longer accepts single-keyword arguments for each feature of the collection. To maintain consistency with the other clients and with the underlying Data API payload, these individual keywords are replaced by one definition parameter. This parameter is a structure detailing all options for the collection that you want to create.

Explicitly, the definition parameter replaces the following parameters, which are no longer accepted:

  • dimension

  • metric

  • service

  • indexing

  • default_id_type

  • additional_options

The definition parameter is an object of type astrapy.info.CollectionDefinition that you can create either by regular instantiation or through a fluent interface. Alternatively, you can pass a plain dictionary, as long as its structure is compatible with the CollectionDefinition structure.

Additionally, the method signature has changed in the following ways:

  • The timeout-related max_time_ms and collection_max_time_ms parameters are removed. The former is replaced by collection_admin_timeout_ms, consistent with the new timeout options naming conventions.

  • The namespace parameter alias is removed.

  • The check_exists flag is removed.

  • The new document_type parameter conveys the type hint for the documents in the collection.

  • The new spawn_api_options parameter allows arbitrary customization of the returned Collection object, including the timeout options.

After upgrading to 2.0-preview, adjust your scripts to use the new definition parameter instead of the removed individual parameters, as shown in the following examples:

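from astrapy.constants import VectorMetric
from astrapy.info import CollectionDefinition, CollectionVectorOptions

# Before 2.0-preview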
collection = database.create_collection(
    "my_collection",
    dimension=3,
    metric=VectorMetric.DOT_PRODUCT,
    indexing={"deny": ["annotations", "logs"]},
)

# 2.0-preview and later, fluent interface
collection = database.create_collection(
    "my_collection",
    definition=(
        CollectionDefinition.builder()
        .set_vector_dimension(3)
        .set_vector_metric(VectorMetric.DOT_PRODUCT)
        .set_indexing("deny", ["annotations", "logs"])
        .build()
    ),
)

# 2.0-preview and later, CollectionDefinition object
collection = database.create_collection(
    "my_collection",
    definition=CollectionDefinition(
        vector=CollectionVectorOptions(
            dimension=3,
            metric=VectorMetric.DOT_PRODUCT,
        ),
        indexing={"deny": ["annotations", "logs"]},
    ),
)

# 2.0-preview and later, dictionary equivalent to CollectionDefinition
collection = database.create_collection(
    "my_collection",
    definition={
        "vector": {
            "dimension": 3,
            "metric": VectorMetric.DOT_PRODUCT,
        },
        "indexing": {"deny": ["annotations", "logs"]},
    },
)

For more information and examples, see the client 2.0-preview docstring for Collection and CollectionDefinition.
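When many call sites need the same migration, it can help to convert the removed keyword arguments into the dictionary form of the definition in one place. The following is a minimal sketch, not part of astrapy: the helper name legacy_kwargs_to_definition is hypothetical, it covers only the dimension, metric, and indexing keywords used in the examples above, and it assumes the metric is passed as a plain string such as "dot_product".

```python
from typing import Any, Dict, Optional


def legacy_kwargs_to_definition(
    dimension: Optional[int] = None,
    metric: Optional[str] = None,
    indexing: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a definition dictionary from pre-2.0 keyword arguments.

    Hypothetical helper: only the three keywords shown in this guide's
    examples are handled, and unset keywords are left out of the result.
    """
    definition: Dict[str, Any] = {}
    vector: Dict[str, Any] = {}
    if dimension is not None:
        vector["dimension"] = dimension
    if metric is not None:
        vector["metric"] = metric
    if vector:
        definition["vector"] = vector
    if indexing is not None:
        definition["indexing"] = indexing
    return definition


definition = legacy_kwargs_to_definition(
    dimension=3,
    metric="dot_product",  # assumed string form of the metric
    indexing={"deny": ["annotations", "logs"]},
)
```

The resulting dictionary has the same shape as the dictionary example above, so it could be passed as the definition argument of create_collection.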

There is no breaking change in the TypeScript client for the createCollection method.

In version 2.0-preview, the Java client's createCollection() method no longer accepts individual arguments, such as dimension and metric, to initialize a collection.

To maintain consistency with the other clients and with the underlying Data API payload, these individual keywords are replaced by one definition parameter. This parameter is a structure detailing all options for the collection that you want to create.

Additionally, the method signature has changed in the following ways:

  • The namespace parameter alias is removed.

  • CollectionOptions is replaced by the CollectionDefinition object, which is a structure detailing all options for the collection that you want to create. The builder pattern is replaced by a constructor with fluent accessors.

  • The new CreateCollectionOptions parameter allows arbitrary customization of the returned Collection object and the creation operation, including timeout options.

After upgrading to 2.0-preview, adjust your scripts to use the new definition parameter instead of the removed individual arguments, as shown in the following examples:

// Before 2.0-preview
Collection<Document> collectionBefore2 = database
 .createCollection("col1", CollectionOptions
   .builder()
   .vectorDimension(14)
   .vectorSimilarity(SimilarityMetric.COSINE)
   .build());

// 2.0-preview and later
Collection<Document> collection = database
 .createCollection("col1", new CollectionDefinition()
  .vectorDimension(14)
  .vectorSimilarity(SimilarityMetric.COSINE)
);

The following examples demonstrate the new method signature and ways to add parameters to specialize the collection or the options:

// No specialization, default Object Document
Collection<Document> col1 = db
  .createCollection("col1");

// No specialization, custom Document bean
Collection<Bean> col2 = db
  .createCollection("col2", Bean.class);

// Specialized collection, default Document
CollectionDefinition def = new CollectionDefinition()
  .vectorDimension(14)
  .vectorSimilarity(SimilarityMetric.COSINE);
Collection<Document> col3 = db
  .createCollection("col3", def);

// Specialized collection, custom Document bean
Collection<Bean> col4 = db
  .createCollection("col4", def, Bean.class);

// Add options for creation
CreateCollectionOptions options = new CreateCollectionOptions()
    .timeout(30000);
Collection<Bean> col5 = db
  .createCollection("col5", def, Bean.class, options);

This change does not apply to the HTTP createCollection command.

Stricter handling of timestamps and datetimes (Python only)

Version 2.0-preview of the Data API Python client introduces stricter handling of the standard-library datetime.datetime objects for writing to databases. Primarily, naive datetimes are rejected by default because they can’t inherently be mapped to a timestamp.
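The difference between naive and timezone-aware datetimes can be seen with the standard library alone. The following sketch does not call astrapy. It only shows why a naive datetime is ambiguous and how a timezone can be attached explicitly. The choice of UTC and the helper name is_aware are assumptions for illustration.

```python
from datetime import datetime, timezone

# A naive datetime has no tzinfo, so it does not identify a single instant.
naive = datetime(2024, 12, 1, 12, 0, 0)

# An aware datetime carries an explicit offset and maps to one timestamp.
aware = datetime(2024, 12, 1, 12, 0, 0, tzinfo=timezone.utc)

# If a naive value is known to represent UTC, attach the timezone explicitly.
converted = naive.replace(tzinfo=timezone.utc)


def is_aware(value: datetime) -> bool:
    """Return True if the datetime has a usable UTC offset."""
    return value.tzinfo is not None and value.utcoffset() is not None
```

A check such as is_aware can be run over documents before writing them, so that naive values are caught in your own code rather than rejected by the client.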

Replacement of client timeout settings

Version 2.0-preview of the Data API clients deprecates the individual method timeouts, such as max_time_ms, maxTimeMS, and withTimeout, in favor of new timeout options that you can use to set global timeouts and timeouts for individual operations.

The deprecated timeout settings will be removed when client version 2.0 is generally available. At that time, the clients will accept only the new timeout settings.

After you upgrade to version 2.0-preview or later, change your code to use the new timeout settings instead of the deprecated options.

  • Python

  • TypeScript

  • Java

  • curl

The Python client supports several ways to specify the timeouts associated with the various API operations. You can override the defaults set for an object (such as Collection or Database), and you can set one-off overrides for a single method call.

For a quick migration from the prior max_time_ms parameter, replace max_time_ms with timeout_ms:

# Before 2.0-preview
my_collection.insert_many(..., max_time_ms=40000)

# 2.0-preview and later
my_collection.insert_many(..., timeout_ms=40000)

For more fine-grained control, the Python client offers different "classes" of timeouts that apply to different kinds of operations. Depending on the method being called, the relevant timeouts are enforced, as defined in the timeout portion of the object’s APIOptions.

from astrapy.api_options import APIOptions, TimeoutOptions

my_slow_collection = database.get_collection(
    "reports",
    spawn_api_options=APIOptions(
        timeout_options=TimeoutOptions(
            request_timeout_ms=20000,
            general_method_timeout_ms=40000,
        ),
    ),
)
my_slow_collection.insert_many(...)

You can also specify timeouts for individual method calls by passing the appropriate timeout parameters to the method. Depending on the operation type, one or more timeout parameters can be available. For example:

my_collection.insert_one(..., request_timeout_ms=12000)
my_collection.insert_many(
    ...,
    general_method_timeout_ms=40000,
    request_timeout_ms=12000,
)

my_database_admin.create_keyspace(..., keyspace_admin_timeout_ms=30000)
# Equivalent using timeout_ms.
my_database_admin.create_keyspace(..., timeout_ms=30000)

When multiple parameters are available, timeout_ms is an alias to the broadest timeout setting.

You can find more information in each method’s parameter list.
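As a mental model for how the two timeouts in the insert_many example interact, consider a method that issues several HTTP requests: each request is limited by the request timeout, and the method as a whole is limited by the general method timeout. The following is a conceptual sketch only, not astrapy's implementation. The function name request_budget_ms and the elapsed-time bookkeeping are assumptions made for illustration.

```python
def request_budget_ms(
    elapsed_ms: int,
    request_timeout_ms: int,
    general_method_timeout_ms: int,
) -> int:
    """Return how long the next request may take, in milliseconds.

    The next request gets the per-request limit, capped by whatever is
    left of the overall method budget.
    """
    remaining_ms = general_method_timeout_ms - elapsed_ms
    if remaining_ms <= 0:
        raise TimeoutError("general method timeout exceeded")
    return min(request_timeout_ms, remaining_ms)


# With request_timeout_ms=12000 and general_method_timeout_ms=40000:
first = request_budget_ms(0, 12000, 40000)      # full per-request limit
late = request_budget_ms(35000, 12000, 40000)   # capped by the time left
```

In this model, a request that starts late in a long method call has less time available than the per-request limit alone would suggest.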

The TypeScript client supports several ways to specify the timeouts associated with the various API operations. You can override the default timeout settings for any object in the main client hierarchy (such as Collection or Db), and you can set overrides for individual methods.

For a quick migration from the prior maxTimeMS parameter, replace maxTimeMS with timeout:

// Before 2.0-preview
await collection.insertMany(..., { maxTimeMS: 40000 });

// 2.0-preview and later
await collection.insertMany(..., { timeout: 40000 });

For more fine-grained control, the TypeScript client offers different "classes" of timeouts that apply to different kinds of operations. Depending on the method being called, the relevant timeouts are enforced. For more information, see TypeScript client usage: TimeoutDescriptor.

const mySlowCollection = db.collection("reports", {
  timeoutDefaults: {
    requestTimeoutMs: 20000,
    generalMethodTimeoutMs: 40000,
  },
});

You can also specify timeouts for individual method calls by passing the appropriate timeout parameters to the method. Depending on the operation type, one or two timeout fields can be set in the timeout object. For example:

await collection.insertOne(..., {
  timeout: { requestTimeoutMs: 12000 },
});

await collection.insertMany(..., {
  timeout: {
    generalMethodTimeoutMs: 40000,
    requestTimeoutMs: 12000,
  },
});

// Both equivalent here
await db.createKeyspace(..., { keyspaceAdminTimeoutMs: 30000 });
await db.createKeyspace(..., { timeout: 30000 });

The Java client offers several ways to specify the timeouts for various operations. You can set default timeouts for an object (such as Collection or Database) as well as individual timeouts for a single method call.

Prior to version 2.0-preview, some operations offered a timeout option. However, this option was not universally available, and you couldn’t set fixed defaults.

Within each operation’s Options object, you can use the timeout method to set a timeout either as a long value in milliseconds or as a Duration object:

// Define a timeout of 5000 milliseconds for the 'findOne' operation
CollectionFindOneOptions options1 = new CollectionFindOneOptions()
  .timeout(5000L);
collect.findOne(myFilter, options1);

// Same timeout using a Duration object
CollectionFindOneOptions options2 = new CollectionFindOneOptions()
  .timeout(Duration.ofSeconds(5));

For more fine-grained control, the Java client offers different "classes" of timeouts that apply to different kinds of operations. Depending on the method called, the client enforces the associated timeout for that method, as defined in the timeout portion of the object’s APIOptions. For example:

TimeoutOptions fullFledgedTimeouts = new TimeoutOptions()
  .generalMethodTimeoutMillis(50000)
  .requestTimeoutMillis(2000);
CollectionInsertManyOptions options3 = new CollectionInsertManyOptions()
  .timeoutOptions(fullFledgedTimeouts);
collect.insertMany(..., options3);

This change does not apply to HTTP.

The ability to set a timeout applies to the clients only. Timeout settings in client scripts set the maximum time that the client waits for a response from the server. It does not set a timeout on the server-side process initiated by the underlying HTTP request.
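The distinction between giving up on a response and stopping the work can be illustrated with threads from the Python standard library. This is only an analogy, not an HTTP client: the function slow_server_task is a hypothetical stand-in for server-side processing, and the durations are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError


def slow_server_task() -> str:
    """Stand-in for server-side work that outlasts the client's patience."""
    time.sleep(0.3)
    return "done"


with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(slow_server_task)
    try:
        # The "client" stops waiting after 50 milliseconds.
        future.result(timeout=0.05)
        client_timed_out = False
    except FutureTimeoutError:
        client_timed_out = True
    # The "server" work was not cancelled and still runs to completion.
    server_result = future.result()
```

Here the waiting side times out, yet the task finishes anyway. In the same way, a write that times out on the client might still be applied by the database.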

Removals

Version 2.0-preview of the Data API clients removes the following features that were previously deprecated:

  • The term namespace, deprecated in favor of keyspace as of version 1.5, is no longer accepted.

  • The Python and TypeScript clients no longer accept id and region when connecting to a database. This usage was deprecated as of version 1.5.

  • The vector and vectorize fields are no longer accepted as alternatives for $vector and $vectorize.

  • The bulk_write/bulkWrite client method is removed. Use a loop or other standard practice to execute multiple sequential insert operations.

  • The deleteAll client method is replaced by the deleteMany method’s built-in support for emptying a table or collection.

  • The checkExists option is removed from the createCollection method. This option only existed on the client-side. Now, if you attempt to create a collection with the same name as an existing collection, the client surfaces the resulting Data API error only if the existing collection has different settings than the requested new collection.
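For code that relied on bulk_write, the suggested loop can be sketched as follows. This is one possible shape, not a drop-in replacement: the helper run_sequentially and the StubCollection class are hypothetical, and the stub exists only so that the example runs without a database. With a real collection, each entry would name an existing method and its keyword arguments.

```python
from typing import Any, Dict, List, Tuple


def run_sequentially(
    collection: Any,
    operations: List[Tuple[str, Dict[str, Any]]],
) -> List[Any]:
    """Call each named method on the collection in order, collecting results."""
    results = []
    for method_name, kwargs in operations:
        results.append(getattr(collection, method_name)(**kwargs))
    return results


class StubCollection:
    """Hypothetical in-memory stand-in for a collection object."""

    def __init__(self) -> None:
        self.documents: List[Dict[str, Any]] = []

    def insert_one(self, document: Dict[str, Any]) -> str:
        self.documents.append(document)
        return "inserted"

    def delete_one(self, filter: Dict[str, Any]) -> str:
        # Remove only the first document whose fields match the filter.
        for index, doc in enumerate(self.documents):
            if all(doc.get(key) == value for key, value in filter.items()):
                del self.documents[index]
                return "deleted"
        return "not found"


stub = StubCollection()
outcomes = run_sequentially(
    stub,
    [
        ("insert_one", {"document": {"_id": 1, "name": "a"}}),
        ("insert_one", {"document": {"_id": 2, "name": "b"}}),
        ("delete_one", {"filter": {"_id": 1}}),
    ],
)
```

Unlike a single bulk call, a loop like this stops at the first operation that raises, so decide whether to catch errors per operation or let them propagate.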

Version 1.5

DataStax released clients version 1.5 and Data API version 1.0.16 on September 20, 2024.

Deprecation of namespace

Version 1.5 of the Data API clients deprecates namespace in favor of keyspace. In this version, you can use either keyspace or namespace, but you must use one consistently. This change also applies to the Data API itself (HTTP).

This change aligns the Data API and clients with the DevOps API, which already uses keyspace for both namespaces and keyspaces. It also better reflects the underlying Astra DB functionality, in which namespace is effectively an alternative label for keyspace.

Client version 2.0-preview removed support for namespace. After upgrading to version 2.0-preview or later, the clients accept only keyspace.

After you upgrade to version 1.5 or later, change your code to use keyspace instead of namespace. For example:

  • Python

  • TypeScript

  • Java

  • curl

# Before 1.5
database = client.get_database("API_ENDPOINT", namespace="NAMESPACE_OR_KEYSPACE_NAME")

# 1.5 and later
database = client.get_database("API_ENDPOINT", keyspace="NAMESPACE_OR_KEYSPACE_NAME")

// Before 1.5
const db = client.db('API_ENDPOINT', { namespace: 'NAMESPACE_OR_KEYSPACE_NAME' });

// 1.5 and later
const db = client.db('API_ENDPOINT', { keyspace: 'NAMESPACE_OR_KEYSPACE_NAME' });

import java.time.Duration;

// Before 1.5
Database db = client.getDatabase(String apiEndpoint, String namespace);

// 1.5 and later
Database db = client.getDatabase(String apiEndpoint, String keyspace);

// Second argument can be a DatabaseOptions to specialize even more the database object
DatabaseOptions dbOptions = new DatabaseOptions(token, options)
 .keyspace(keyspace)
 .token("anotherToken")
 .timeout(Duration.ofSeconds(10));
Database db = client.getDatabase(apiEndpoint, dbOptions);

The impact to HTTP requests is minimal. HTTP already accepted either a keyspace or namespace name in the URL path, and most commands used a keyspace parameter.

curl -sS -L -X POST "ASTRA_DB_ENDPOINT/api/json/v1/NAMESPACE_OR_KEYSPACE_NAME" \
--header "Token: APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{DATA_API_COMMAND_BODY}'

Astra DB Serverless documentation and client references use keyspace in place of namespace, with the following exceptions:

  • Some preexisting integration guides and tutorials that rely on a subcomponent, such as a sample app, that is unrelated to the Data API and has its own namespace object, class, or variable.

  • Third-party documentation over which DataStax has no influence.

Deprecation of id and region to specify a database

In version 1.5.1 and later of the Python and TypeScript clients, the API_ENDPOINT is the preferred way to use a DataAPIClient to get a database. The API_ENDPOINT inherently includes the database’s ID and region. As a result, the alternative ID and REGION syntax is deprecated.

Client version 2.0-preview removed support for this usage of ID and REGION in the Python and TypeScript clients. In version 2.0-preview and later, those clients accept only API_ENDPOINT when you use a DataAPIClient to get a database.
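If existing configuration stores only an ID and a region, the endpoint can be assembled before calling the client. The following sketch assumes that the endpoint follows the pattern https://ID-REGION.apps.astra.datastax.com and that the ID is a UUID. The helper names build_api_endpoint and split_api_endpoint are hypothetical, and the values shown are placeholders.

```python
import re
from typing import Tuple

# Assumed endpoint shape: https://<database-id>-<region>.apps.astra.datastax.com
_ENDPOINT_PATTERN = re.compile(
    r"^https://(?P<id>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
    r"-(?P<region>[a-z0-9-]+)\.apps\.astra\.datastax\.com/?$"
)


def build_api_endpoint(database_id: str, region: str) -> str:
    """Assemble an endpoint string from a database ID and a region."""
    return f"https://{database_id}-{region}.apps.astra.datastax.com"


def split_api_endpoint(api_endpoint: str) -> Tuple[str, str]:
    """Recover (database_id, region) from an endpoint of the assumed shape."""
    match = _ENDPOINT_PATTERN.match(api_endpoint)
    if match is None:
        raise ValueError(f"Unrecognized endpoint format: {api_endpoint}")
    return match.group("id"), match.group("region")


# Placeholder values, not a real database.
endpoint = build_api_endpoint("01234567-89ab-cdef-0123-456789abcdef", "us-east1")
database_id, region = split_api_endpoint(endpoint)
```

Where possible, copy the endpoint shown for your database rather than constructing it, because the pattern above is an assumption and might not hold for every environment.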

This deprecation does not apply to the following:

  • ID and REGION with AstraDBAdmin

  • The Java client

  • HTTP

After you upgrade to version 1.5.1 or later, change your astrapy and astra-db-ts code to use API_ENDPOINT instead of ID and REGION.

  • Python

  • TypeScript

  • Java

  • curl

Change your client.get_database commands to use API_ENDPOINT, instead of ID and REGION.

The following examples show multiple versions of the same command. An actual script would use only one.

# Before 1.5.1, the following are all valid:
database = client.get_database("API_ENDPOINT")
database = client.get_database("ID")
database = client["API_ENDPOINT"]
database = client["ID"]
database = client.get_database("API_ENDPOINT", keyspace="KEYSPACE_NAME")
database = client.get_database("ID", keyspace="KEYSPACE_NAME", region="REGION")

# At 1.5.1 and later, use only 'API_ENDPOINT':
database = client.get_database("API_ENDPOINT")
database = client["API_ENDPOINT"]
database = client.get_database("API_ENDPOINT", keyspace="KEYSPACE_NAME")

Change your client.db commands to use API_ENDPOINT, instead of ID and REGION.

The following examples show multiple versions of the same command. An actual script would use only one.

// Before 1.5.1, the following are all valid:
const db = client.db('API_ENDPOINT');
const db = client.db('ID', 'REGION');
const db = client.db('API_ENDPOINT', { keyspace: 'KEYSPACE_NAME' });
const db = client.db('ID', 'REGION', { keyspace: 'KEYSPACE_NAME' });

// At 1.5.1 and later, use only 'API_ENDPOINT':
const db = client.db('API_ENDPOINT');
const db = client.db('API_ENDPOINT', { keyspace: 'KEYSPACE_NAME' });

This deprecation does not apply to the Java client.

You can continue to use either API_ENDPOINT or ID and REGION:

// Syntax before 1.5
Database db = client.getDatabase(String apiEndpoint);
Database db = client.getDatabase(UUID databaseId, String region);
Database db = client.getDatabase(String apiEndpoint, String keyspace);
Database db = client.getDatabase(UUID databaseId, String region, String keyspace);

// Syntax slightly different after 1.5 (keyspace is now an option)
Database db = client.getDatabase(String apiEndpoint, DatabaseOptions options);
Database db = client.getDatabase(UUID databaseId, DatabaseOptions options);
Database db = client.getDatabase(UUID databaseId, String region, DatabaseOptions options);

This deprecation does not apply to HTTP, which already exclusively uses the API endpoint as the basis of the URL path, such as:

curl -sS -L -X POST "ASTRA_DB_ENDPOINT/api/json/v1/KEYSPACE_NAME"
