Get started with the Python driver

Astra DB Serverless databases support the Data API and clients as well as Cassandra drivers. You can use the Data API to run CQL statements on tables in Astra DB Serverless databases.

DataStax recommends Cassandra drivers for legacy applications that rely on a driver or workloads that require specific CQL functions that aren’t supported by the Data API.

Because Astra DB Serverless is based on Apache Cassandra®, you can use Cassandra drivers to connect to your Astra databases.

This quickstart explains how to install a driver, connect it to your Astra database, and then send some CQL statements to the database.

To use the Python driver, you need to choose a compatible version, install the driver and its dependencies, and then connect the driver to your Astra database. Once connected, you can write scripts that use the driver to run commands against your database.

Python driver ownership and other important changes in version 3.30

Starting with version 3.30, the Python driver is maintained by the Apache Software Foundation (ASF). Prior versions were maintained by DataStax.

As of version 3.30, there is no change to the Python driver package name. You can still install the driver with pip install cassandra-driver.

However, there are important changes in version 3.30 that you should be aware of, including use of pyproject.toml, the DRIVER_NAME in STARTUP messages, and the supported Python versions. For more information, see the Python driver upgrade guide.

Python driver compatibility

DataStax officially supports the latest 12 months of releases, and DataStax recommends using the latest driver version whenever possible. Compatibility isn’t guaranteed for earlier versions. For upgrade guides and compatibility information for earlier versions, see Unsupported drivers.

New features and bug fixes are developed on the latest minor version of the driver, and users are encouraged to stay current with those minor releases. APIs remain stable according to semantic versioning conventions, so upgrades between minor versions should be straightforward.

Unless otherwise specified, compatibility version ranges include all patch versions. For example, a range of 4.0 to 4.3 includes all versions from 4.0.0 to the last 4.3.z release.
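To illustrate how a version range covers every patch release, the following minimal sketch compares only the major and minor components of a version string. The helper names `parse_major_minor` and `in_minor_range` are hypothetical and are not part of any driver.

```python
def parse_major_minor(version: str) -> tuple[int, int]:
    """Return (major, minor) from a version string such as '4.3.7' or '4.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def in_minor_range(version: str, low: str, high: str) -> bool:
    """Return True if version is within low..high, ignoring the patch component."""
    return parse_major_minor(low) <= parse_major_minor(version) <= parse_major_minor(high)


print(in_minor_range("4.3.9", "4.0", "4.3"))  # True
print(in_minor_range("4.4.0", "4.0", "4.3"))  # False
```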

| Driver version | Astra compatibility | Comments |
| --- | --- | --- |
| 3.28 and later | Fully compatible | Starting with version 3.30, this driver is maintained by the Apache Software Foundation (ASF). |
| 3.20 to 3.27 | Partially compatible | Doesn’t support the vector type. |
| Earlier versions | Not compatible | |
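If you want to check a driver version against this table in a script, a sketch like the following could work. The function name `astra_compatibility` is hypothetical, and the thresholds are copied from the table above rather than read from the driver.

```python
def astra_compatibility(driver_version: str) -> str:
    """Classify a Python driver version using the thresholds in the table above."""
    major, minor = (int(part) for part in driver_version.split(".")[:2])
    if (major, minor) >= (3, 28):
        return "Fully compatible"
    if (major, minor) >= (3, 20):
        return "Partially compatible"  # No support for the vector type
    return "Not compatible"


print(astra_compatibility("3.29.1"))  # Fully compatible
print(astra_compatibility("3.25.0"))  # Partially compatible
```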

Prepare the environment and database

Install Python version 3.10 or later.

This quickstart uses the latest version of the Python driver. For Python support in earlier versions, see the Python driver documentation.
Install pip version 23.0 or later.
Create a database.
Download your database’s Secure Connect Bundle (SCB).

For more information, including connections to multi-region databases, see The SCB and encrypted connections for drivers.
Set the following environment variables:
- DATABASE_ID: The database ID.
- APPLICATION_TOKEN: An application token with the Database Administrator role.
  
  For more information, see Authentication methods for drivers.
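A script can fail early with a clear message when one of these variables is missing. The following is a minimal sketch that uses only the standard library. The helper name `require_env` is an assumption for illustration.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Example usage in a connection script:
# application_token = require_env("APPLICATION_TOKEN")
# database_id = require_env("DATABASE_ID")
```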

Authentication methods for drivers

You use an application token and a Secure Connect Bundle (SCB) to connect a driver to an Astra database.

The application token authenticates the driver to the database, and the token’s role determines the actions that the driver is authorized to perform on the database. When you generate a token, the token details include a clientId, secret, and token:

{
  "clientId": "CLIENT_ID",
  "secret": "CLIENT_SECRET",
  "token": "APPLICATION_TOKEN"
}

clientId and secret are legacy authentication methods that predate token.
token is a unified token that comprises everything you need for Astra token authentication.

Cassandra drivers use username and password authentication for Astra connections, typically through an authentication class or argument, such as PlainTextAuthProvider. To set the username and password for a Cassandra driver connection, you can use either the unified token or the legacy clientId and secret:

Unified token authentication (Recommended)

To authenticate with the unified application token, set the username to the literal string token, and set the password to your unified application token. For example:

("token", "APPLICATION_TOKEN")

Legacy clientId and secret authentication

For legacy applications and older driver versions that don’t use unified application tokens, you can use the clientId as the username and the secret as the password. For example:

("CLIENT_ID", "CLIENT_SECRET")

However, if you are using a legacy token created prior to the introduction of the unified token format, DataStax recommends rotating these tokens due to their age.
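The two options can be summarized in a short sketch that builds the username and password pair from the token details. The function name `driver_credentials` and the `use_legacy` flag are hypothetical. The placeholder values match the token JSON shown earlier.

```python
import json


def driver_credentials(token_details: dict, use_legacy: bool = False) -> tuple[str, str]:
    """Return the (username, password) pair for a driver connection."""
    if use_legacy:
        # Legacy method: clientId is the username, secret is the password
        return token_details["clientId"], token_details["secret"]
    # Recommended method: the literal string "token" plus the unified token
    return "token", token_details["token"]


details = json.loads(
    '{"clientId": "CLIENT_ID", "secret": "CLIENT_SECRET", "token": "APPLICATION_TOKEN"}'
)
username, password = driver_credentials(details)
print(username)  # token
```

You can pass the resulting pair to the driver's authentication class, such as `PlainTextAuthProvider(username, password)`.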

In addition to the application token, you must provide an SCB to set contact points and establish a secure connection to your database. For more information, see The SCB and encrypted connections for drivers.

The SCB and encrypted connections for drivers

In addition to an application token, you must provide an SCB to set contact points and provide certificates necessary to establish a secure mutual TLS (mTLS) connection to your database.

To establish an encrypted connection between your application and database, the driver uses the SSL certificates and trusted certificate authorities (CAs) in the SCB to verify the Astra server’s identity. Mechanically, when the driver receives the server’s SSL certificate during the SSL handshake, it checks that the certificate was signed by one of the registered CAs. If the certificate wasn’t signed by a registered CA, the driver checks that the signer was signed by one of the registered CAs. It continues through the signers until it finds one that is in the list of trusted CAs. If there are no matches, then identity verification fails and the driver connection isn’t established.
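The chain walk described above can be illustrated with a toy model. This sketch is not the driver's implementation and performs no cryptographic checks: certificates are represented by plain names, and the `issuer_of` mapping and the function name `chain_is_trusted` are assumptions for illustration only.

```python
def chain_is_trusted(cert: str, issuer_of: dict[str, str], trusted_cas: set[str]) -> bool:
    """Walk from a certificate up through its signers until a trusted CA is found."""
    seen = set()
    signer = issuer_of.get(cert)
    while signer is not None and signer not in seen:
        if signer in trusted_cas:
            return True
        seen.add(signer)  # Guards against loops, such as a self-signed root
        signer = issuer_of.get(signer)
    return False


# Hypothetical chain: server certificate -> intermediate CA -> self-signed root CA
issuer_of = {"server": "intermediate", "intermediate": "root", "root": "root"}
print(chain_is_trusted("server", issuer_of, {"root"}))     # True
print(chain_is_trusted("server", issuer_of, {"unknown"}))  # False
```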

All Astra-compatible drivers have configuration file attributes, builder methods, or constructor parameters to use the SCB. In your driver configuration, you set the path to the SCB zip file, and then the driver automatically gets the required information and files from the SCB. When using an SCB, don’t set any options that are inferred from the SCB, such as contact points and SSL encryption settings. Additionally, don’t extract the SCB zip file; it must be provided to the driver as an unextracted archive.
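Because the driver expects the unextracted archive, it can help to validate the path before you pass it to the driver. The following sketch uses only the standard library. The function name `check_scb_path` is hypothetical, and the check confirms only that the path points to a zip archive, not that the archive is a valid SCB.

```python
import os
import zipfile


def check_scb_path(scb_path: str) -> str:
    """Return the absolute path to the SCB if it looks like an unextracted zip archive."""
    if os.path.isdir(scb_path):
        raise ValueError(f"{scb_path} is a directory; provide the unextracted zip file")
    if not zipfile.is_zipfile(scb_path):
        raise ValueError(f"{scb_path} is missing or is not a zip archive")
    return os.path.abspath(scb_path)
```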

For multi-region databases, you need the region-specific SCB for each region that your application will connect to.

To connect to one region of a multi-region database, download the SCB for a region that is geographically close to your application to reduce latency.

To connect to multiple regions or databases in the same application, download the SCB for each region or database. Then, in your application’s code, create one root driver instance (session or cluster) for each region or database, using custom logic to select the appropriate SCB for each instance. For more information, see Best practices: Session and cluster handling and Connection pools and initial contact points.
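The following sketch shows one possible shape for that custom logic: a mapping from region name to SCB path, and one `Cluster` per region. The region names, file paths, and the helper names `select_scb` and `connect_all` are hypothetical. The driver is imported inside `connect_all` so that the selection logic can be used on its own.

```python
def select_scb(region: str, scb_by_region: dict[str, str]) -> str:
    """Return the SCB path configured for a region."""
    try:
        return scb_by_region[region]
    except KeyError:
        raise ValueError(f"No SCB configured for region {region!r}") from None


def connect_all(scb_by_region: dict[str, str], application_token: str) -> dict:
    """Create one Cluster and Session per region. Requires cassandra-driver."""
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    sessions = {}
    for region in scb_by_region:
        cluster = Cluster(
            cloud={"secure_connect_bundle": select_scb(region, scb_by_region)},
            auth_provider=PlainTextAuthProvider("token", application_token),
        )
        sessions[region] = cluster.connect()
    return sessions


# Hypothetical region names and paths
bundles = {
    "us-east1": "/path/to/scb-us-east1.zip",
    "europe-west1": "/path/to/scb-europe-west1.zip",
}
print(select_scb("us-east1", bundles))  # /path/to/scb-us-east1.zip
```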

DataStax recommends that you use a driver version that supports SCB authentication for simplified configuration and reduced chance of connection failures. However, if you must support a legacy application with an earlier driver, you can use cql-proxy, extract the SCB, and then manually provide the required certificates to the driver. Additionally, you must use the token’s clientId and secret for the username and password, respectively. For an example, see DataStax Ruby and PHP drivers (Maintenance).

Install the Python driver

Install the Python driver:

pip install cassandra-driver

If you install an earlier version of the driver, make sure your version is compatible with Astra and your application’s CQL statements. For example, if you need to query vector data, make sure your driver version supports the vector type.

Optional: Verify the installation:
```
pip show cassandra-driver
```
Make sure the returned Version is the latest version or the specific version that you installed.
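You can also check the installed version from Python. The following sketch uses the standard library's `importlib.metadata` module. The function name `installed_driver_version` is an assumption for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_driver_version():
    """Return the installed cassandra-driver version, or None if it isn't installed."""
    try:
        return version("cassandra-driver")
    except PackageNotFoundError:
        return None


print(installed_driver_version())
```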

Connect the Python driver

In the root of your Python project, create a connect_database.py file:
```
cd python_project
touch connect_database.py
```
Copy one of the following connection code examples into connect_database.py.

Both examples create a Cluster instance to connect to your Astra database. You typically have one Cluster for each Astra database, and only one Session for the entire application. For more information, see Best practices: Session and cluster handling and Connection pools and initial contact points.
Production configuration (recommended)
When using the Python driver in production environments or with simulated production workloads, DataStax recommends robust session configuration with profile and cluster details to help optimize driver performance.

The following example uses authentication credentials stored in environment variables, and it sets options for connection timeout, request timeout, and protocol version.

connect_database.py

import os

from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT, ProtocolVersion
from cassandra.auth import PlainTextAuthProvider

# SCB path and connection timeout
cloud_config = {
    'secure_connect_bundle': "PATH/TO/SCB.zip",
    'connect_timeout': 30
}

# Unified token authentication: the username is the literal string "token"
auth_provider = PlainTextAuthProvider("token", os.environ["APPLICATION_TOKEN"])

# Default execution profile with a request timeout
profile = ExecutionProfile(request_timeout=30)

cluster = Cluster(
    cloud=cloud_config,
    auth_provider=auth_provider,
    execution_profiles={EXEC_PROFILE_DEFAULT: profile},
    protocol_version=ProtocolVersion.V4
)
session = cluster.connect()

Replace PATH/TO/SCB.zip with the absolute path to your database’s Secure Connect Bundle (SCB) zip file (secure-connect-DATABASE_NAME.zip).
Minimal configuration
You can use a minimal session configuration for testing or lower environments where you don’t need to optimize the cluster details for production workloads.

The following example uses authentication credentials stored in environment variables and default values for all other connection options.

connect_database.py

import os

from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

# Connect with the SCB and unified token authentication, using default options
session = Cluster(
    cloud={"secure_connect_bundle": "PATH/TO/SCB.zip"},
    auth_provider=PlainTextAuthProvider("token", os.environ["APPLICATION_TOKEN"]),
).connect()

Replace PATH/TO/SCB.zip with the absolute path to your database’s Secure Connect Bundle (SCB) zip file (secure-connect-DATABASE_NAME.zip).
To test the connection, add a simple query to the script.

The following example queries the system.local table. You can replace the example SELECT statement with any CQL statement that you want to run against a keyspace and table in your database.
connect_database.py
```
row = session.execute("select release_version from system.local").one()
if row:
    print(row[0])
else:
    print("An error occurred.")
```
Save and run your Python script:
```
python ./connect_database.py
```
If you ran the example SELECT statement on the system.local table and the script ran successfully, the release_version value from the system.local table is printed to the console.

Next, you can extend or modify this script to run other commands against your database or connect to other databases. For more information, see the documentation for your version of the Python driver.

Run a vector search with the Python driver

The following example shows how you can use the Python driver to index vector data and then run a vector search:

Create a table and vector index.

The following code creates a table named vector_test with columns for an integer id, text, and a 5-dimensional vector. Then, it creates a custom index on the vector column that uses the cosine similarity function for efficient vector searches.

This example uses a keyspace named default_keyspace. Replace this value if you want to use a different keyspace.

keyspace = "default_keyspace"
v_dimension = 5

session.execute((
    "CREATE TABLE IF NOT EXISTS {keyspace}.vector_test (id INT PRIMARY KEY, "
    "text TEXT, vector VECTOR<FLOAT,{v_dimension}>);"
).format(keyspace=keyspace, v_dimension=v_dimension))

session.execute((
    "CREATE CUSTOM INDEX IF NOT EXISTS idx_vector_test "
    "ON {keyspace}.vector_test "
    "(vector) USING 'StorageAttachedIndex' WITH OPTIONS = "
    "{{'similarity_function' : 'cosine'}};"
).format(keyspace=keyspace))

Insert vector data.

The following code inserts some rows with embeddings into the vector_test table:

text_blocks = [
    (1, "Chat bot integrated sneakers that talk to you", [0.1, 0.15, 0.3, 0.12, 0.05]),
    (2, "An AI quilt to help you sleep forever", [0.45, 0.09, 0.01, 0.2, 0.11]),
    (3, "A deep learning display that controls your mood", [0.1, 0.05, 0.08, 0.3, 0.6]),
]
for row_id, text, vector in text_blocks:
    # row_id avoids shadowing the built-in id() function
    session.execute(
        f"INSERT INTO {keyspace}.vector_test (id, text, vector) VALUES (%s, %s, %s)",
        (row_id, text, vector)
    )
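The loop above sends the same INSERT statement text for every row. A prepared statement is a common alternative with Cassandra drivers because the statement is parsed once and then reused. The following sketch uses the driver's `session.prepare` and `session.execute` methods with `?` placeholders. The function name `insert_blocks` is hypothetical, and the sketch assumes that your driver version can bind a list of floats to the vector column.

```python
def insert_blocks(session, keyspace: str, blocks: list) -> int:
    """Insert (id, text, vector) tuples with a single prepared statement."""
    insert_stmt = session.prepare(
        f"INSERT INTO {keyspace}.vector_test (id, text, vector) VALUES (?, ?, ?)"
    )
    for row_id, text, vector in blocks:
        # Bind the values for each row to the prepared statement
        session.execute(insert_stmt, (row_id, text, vector))
    return len(blocks)


# Example usage with an existing session:
# insert_blocks(session, "default_keyspace", text_blocks)
```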

Perform a vector search.

The following code performs a vector search to find rows that are close to a specific vector embedding:

ann_query = (
    f"SELECT id, text, similarity_cosine(vector, [0.15, 0.1, 0.1, 0.35, 0.55]) as sim FROM {keyspace}.vector_test "
    "ORDER BY vector ANN OF [0.15, 0.1, 0.1, 0.35, 0.55] LIMIT 2"
)
for row in session.execute(ann_query):
    print(f"[{row.id}] \"{row.text}\" (sim: {row.sim:.4f})")
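To sanity check the search results, you can compute cosine similarity locally for the three sample rows. The following sketch is plain Python and doesn't use the driver. The `sim` values that the database returns might be scaled differently from the raw cosine computed here, so compare the ordering of the rows rather than the numbers.

```python
import math


def cosine(a: list, b: list) -> float:
    """Return the cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Sample embeddings from the insert step, keyed by row id
rows = {
    1: [0.1, 0.15, 0.3, 0.12, 0.05],
    2: [0.45, 0.09, 0.01, 0.2, 0.11],
    3: [0.1, 0.05, 0.08, 0.3, 0.6],
}
query = [0.15, 0.1, 0.1, 0.35, 0.55]

# Sort row ids from most similar to least similar
ranked = sorted(rows, key=lambda row_id: cosine(rows[row_id], query), reverse=True)
print(ranked[:2])  # [3, 2]
```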

Reconnect the Python driver after a migration

If you migrate your data from one Cassandra database platform to another, you must update your client applications to connect to your new databases.

At minimum, you must update the driver connection strings. Additional changes might be required if you upgraded to a new major driver version or migrated to a database platform with a different feature set. For example, if you migrate to Astra, your drivers cannot create keyspaces because CQL for Astra doesn’t support CREATE KEYSPACE.
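Before you migrate, it can be useful to scan your application's CQL statements for commands that the target platform doesn't support. The following sketch flags `CREATE KEYSPACE` statements with a regular expression. The function name `find_create_keyspace` and the sample statements are hypothetical, and the pattern is a simple heuristic rather than a CQL parser.

```python
import re

# Matches statements that begin with CREATE KEYSPACE, ignoring case and leading whitespace
CREATE_KEYSPACE = re.compile(r"^\s*CREATE\s+KEYSPACE\b", re.IGNORECASE)


def find_create_keyspace(statements: list) -> list:
    """Return the statements that attempt to create a keyspace."""
    return [stmt for stmt in statements if CREATE_KEYSPACE.match(stmt)]


sample_statements = [
    "CREATE KEYSPACE IF NOT EXISTS app WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}",
    "CREATE TABLE IF NOT EXISTS app.users (id INT PRIMARY KEY, name TEXT)",
    "  create keyspace reporting WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}",
]
print(len(find_create_keyspace(sample_statements)))  # 2
```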

For information about updating driver connections after a migration, see the DataStax migration documentation on Connecting client applications to your new target database. Although the referenced documentation is in the context of zero downtime migration, the information applies to most migrations between Cassandra-based databases where you need to update the driver connection strings for the new database.

The following steps summarize the process for updating your driver connection strings after you migrate to Astra:

Prepare the environment and database.
Install the Python driver.

The latest version is recommended, but you can use any Astra-compatible version.

In your existing Python driver code, modify the connection code to use the SCB and token authentication:

import os

from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

# Point the driver at the SCB instead of setting contact points manually
cloud_config = {
    'secure_connect_bundle': 'PATH/TO/SCB.zip'
}
# Unified token authentication: the username is the literal string "token"
auth_provider = PlainTextAuthProvider("token", os.environ["APPLICATION_TOKEN"])
cluster = Cluster(cloud=cloud_config, auth_provider=auth_provider)
session = cluster.connect()

For more information, see Connect the Python driver.

Run your Python script.