Python client reference
The DataStax Astra DB Serverless (Vector) documentation site is currently in Public Preview and is provided on an “AS IS” basis, without warranty or indemnity of any kind. For more, see the DataStax Preview Terms.
AstraPy is the official Python client for Astra DB Serverless (Vector). See common usages below, or check out the GitHub repo.
Prerequisites
- An active Astra account
- Python 3.7+
Install AstraPy
- Verify that pip is version 23.0 or higher.
  pip --version
- Upgrade pip if needed.
  python -m pip install --upgrade pip
- Install the AstraPy package.
  pip install astrapy
Create a database
- In the Astra Portal, select Databases in the main navigation.
- Click Create Database.
- In the Create Database dialog, select the Serverless (Vector) deployment type.
- In the Configuration section, enter a name for the new database in the Database name field.
  Since database names can’t be changed later, it’s best to name your database something meaningful. Database names must start and end with an alphanumeric character, and may contain only the following special characters: & + - _ ( ) < > . , @
- Select your preferred Provider and Region.
  You can select from a limited number of regions if you’re on the Free plan. Regions with a lock icon require that you upgrade to a Pay As You Go plan.
  Not all regions may be available. If you don’t see your preferred region listed, please submit a support ticket or send us a message using our live chat in the bottom right of the Astra Portal.
- Click Create Database.
  You are redirected to your new database’s Overview screen. Your database starts in Pending status before transitioning to Initializing. You’ll receive a notification once your database is initialized.
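The naming rule above can be checked locally before you open the Create Database dialog. The following is a minimal sketch, not part of AstraPy: the helper name `is_valid_database_name` is hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits and that spaces are not allowed. The Astra Portal remains the authority on which names are accepted.

```python
import re

# Assumptions (not confirmed by the Astra documentation):
# - "alphanumeric" means ASCII letters and digits
# - spaces are not allowed, since they are not in the listed special characters
# - no length limit is enforced here
_ALNUM = "A-Za-z0-9"
_SPECIAL = re.escape("&+-_()<>.,@")

# First and last characters must be alphanumeric; characters in between
# may be alphanumeric or one of the listed special characters.
_NAME_RE = re.compile(rf"[{_ALNUM}](?:[{_ALNUM}{_SPECIAL}]*[{_ALNUM}])?")


def is_valid_database_name(name: str) -> bool:
    """Return True if the name satisfies the naming rule as interpreted above."""
    return _NAME_RE.fullmatch(name) is not None


print(is_valid_database_name("vector_db-1"))  # allowed characters, alphanumeric ends
print(is_valid_database_name("db."))          # ends with a special character
```

Under these assumptions the first call passes and the second fails, because the name ends with a period.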
Initialize the client
Import libraries and connect to the database.
import os

from astrapy.db import AstraDB

# Read the credentials from environment variables
ASTRA_DB_APPLICATION_TOKEN = os.environ.get("ASTRA_DB_APPLICATION_TOKEN")
ASTRA_DB_API_ENDPOINT = os.environ.get("ASTRA_DB_API_ENDPOINT")

# Connect to the database
db = AstraDB(
    token=ASTRA_DB_APPLICATION_TOKEN,
    api_endpoint=ASTRA_DB_API_ENDPOINT,
)
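Because `os.environ.get` returns `None` when a variable is unset, a missing token or endpoint may only surface later as a connection error. The sketch below shows one way to fail early instead. The helper `load_astra_settings` is hypothetical (it is not part of AstraPy), and the example values are placeholders rather than real credentials.

```python
import os
from typing import Mapping, Tuple

# The two variable names used in the snippet above
REQUIRED_VARS = ("ASTRA_DB_APPLICATION_TOKEN", "ASTRA_DB_API_ENDPOINT")


def load_astra_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (token, api_endpoint), raising if either is missing or empty.

    Hypothetical helper for illustration; not an AstraPy function.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["ASTRA_DB_APPLICATION_TOKEN"], env["ASTRA_DB_API_ENDPOINT"]


# Demonstration with placeholder values instead of the real environment
example_env = {
    "ASTRA_DB_APPLICATION_TOKEN": "example-token",       # placeholder, not a real token
    "ASTRA_DB_API_ENDPOINT": "https://example.invalid",  # placeholder endpoint
}
token, api_endpoint = load_astra_settings(example_env)
```

In a real script you would call `load_astra_settings()` with no arguments and pass the returned values to `AstraDB`.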
Create a collection
Create an empty collection, specifying the embedding dimension (the length of each vector) and the similarity metric.
# Create collection
col = db.create_collection("vector_test", dimension=5, metric="cosine")
Load data
Insert a few documents with embeddings into the vector database.
documents = [
    {
        "_id": "1",
        "text": "ChatGPT integrated sneakers that talk to you",
        "$vector": [0.1, 0.15, 0.3, 0.12, 0.05],
    },
    {
        "_id": "2",
        "text": "An AI quilt to help you sleep forever",
        "$vector": [0.45, 0.09, 0.01, 0.2, 0.11],
    },
    {
        "_id": "3",
        "text": "A deep learning display that controls your mood",
        "$vector": [0.1, 0.05, 0.08, 0.3, 0.6],
    },
]
res = col.insert_many(documents)
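Each `$vector` is expected to have the same length as the dimension declared for the collection (5 in this example). A quick local check before inserting can catch mismatches early. This is a minimal sketch: `find_dimension_mismatches` is a hypothetical helper, not an AstraPy function, and the sample data is made up for illustration.

```python
from typing import Dict, List, Sequence


def find_dimension_mismatches(documents: Sequence[Dict], dimension: int) -> List[str]:
    """Return the _id of every document whose "$vector" is missing or has the wrong length.

    Hypothetical helper for illustration; not part of AstraPy.
    """
    bad_ids = []
    for doc in documents:
        vector = doc.get("$vector")
        if vector is None or len(vector) != dimension:
            bad_ids.append(str(doc.get("_id")))
    return bad_ids


# Illustrative sample: the second vector is deliberately too short
sample = [
    {"_id": "1", "$vector": [0.1, 0.15, 0.3, 0.12, 0.05]},
    {"_id": "2", "$vector": [0.45, 0.09, 0.01]},
]
mismatches = find_dimension_mismatches(sample, dimension=5)
print(mismatches)
```

An empty result means every document matches the collection's dimension and the batch is ready to insert.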
Perform a similarity search
Find documents that are close to a specific vector embedding.
query = [0.15, 0.1, 0.1, 0.35, 0.55]
results = col.vector_find(query, limit=2, fields={"text", "$vector"})
for document in results:
    print(document)
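To build intuition for what the search returns, the cosine ranking can be reproduced locally with the sample vectors from this page. This is an illustrative sketch in plain Python; it does not call Astra DB, and it makes no claim about how the service computes or scales its similarity scores.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# The three sample embeddings from the "Load data" step, keyed by _id
vectors = {
    "1": [0.1, 0.15, 0.3, 0.12, 0.05],
    "2": [0.45, 0.09, 0.01, 0.2, 0.11],
    "3": [0.1, 0.05, 0.08, 0.3, 0.6],
}
query = [0.15, 0.1, 0.1, 0.35, 0.55]

# Sort document ids from most to least similar to the query
ranked = sorted(
    vectors,
    key=lambda doc_id: cosine_similarity(query, vectors[doc_id]),
    reverse=True,
)
top_two = ranked[:2]  # mirrors limit=2 in the search above
print(top_two)
```

With this data, document 3 is the closest match to the query, followed by document 2 and then document 1.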