Quickstart

Beginner

15 min

Objective

Learn how to create a new database, connect to your database, load a set of vector embeddings, and perform a similarity search to find vectors that are close to the one in your query.

Create a Serverless (Vector) database

Create an Astra account or sign in to an existing Astra account.
In the Astra Portal, select Databases in the main navigation.
Select Create Database.
In the Create Database dialog, select the Serverless (Vector) deployment type.
In Configuration, enter a meaningful Database name.

You can’t change database names. Make sure the name is human-readable and meaningful. Database names must start and end with an alphanumeric character, and can contain the following special characters: & + - _ ( ) < > . , @.
Select your preferred Provider and Region.

You can select from a limited number of regions if you’re on the Free plan. Regions with a lock icon require that you upgrade to a Pay As You Go plan.
Click Create Database.

You are redirected to your new database’s Overview screen. Your database starts in Pending status before transitioning to Initializing. You’ll receive a notification once your database is initialized.
Ensure the database is in Active status, and then select Generate Token. In the Application Token dialog, click Copy to copy the token (e.g. AstraCS:WSnyFUhRxsrg…). Store the token in a secure location before closing the dialog.

Your token is automatically assigned the Database Administrator role.
Copy your database’s API endpoint, located under Database Details > API Endpoint (e.g. https://ASTRA_DB_ID-ASTRA_DB_REGION.apps.astra.datastax.com).

Assign your token and API endpoint to environment variables in your terminal.

Linux or macOS

export ASTRA_DB_API_ENDPOINT=API_ENDPOINT # Your database API endpoint
export ASTRA_DB_APPLICATION_TOKEN=TOKEN # Your database application token

Windows (Command Prompt)

rem Replace API_ENDPOINT and TOKEN with your own values.
rem Don't add a trailing comment: everything after = becomes part of the value.
set ASTRA_DB_API_ENDPOINT=API_ENDPOINT
set ASTRA_DB_APPLICATION_TOKEN=TOKEN

Google Colab

import os
os.environ["ASTRA_DB_API_ENDPOINT"] = "API_ENDPOINT" # Your database API endpoint
os.environ["ASTRA_DB_APPLICATION_TOKEN"] = "TOKEN" # Your database application token
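
Before moving on, it can help to confirm that both variables are visible to Python. The following is a minimal sketch, not part of any Astra tooling: the helper name `check_astra_env` is hypothetical, and the shape checks (an `https://` endpoint and an `AstraCS:` token prefix) are assumptions taken from the example values shown above.

```python
import os

REQUIRED_VARS = ("ASTRA_DB_API_ENDPOINT", "ASTRA_DB_APPLICATION_TOKEN")


def check_astra_env(env=None):
    """Return a list of problems found in the Astra environment variables.

    Hypothetical helper: the expected shapes are assumptions based on the
    example values in this quickstart, not an official validation rule.
    """
    env = os.environ if env is None else env
    problems = []

    # Both variables must be present and non-empty.
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")

    # Assumed shape: the endpoint is an https:// URL.
    endpoint = env.get("ASTRA_DB_API_ENDPOINT", "")
    if endpoint and not endpoint.startswith("https://"):
        problems.append("ASTRA_DB_API_ENDPOINT should start with https://")

    # Assumed shape: the token begins with the AstraCS: prefix.
    token = env.get("ASTRA_DB_APPLICATION_TOKEN", "")
    if token and not token.startswith("AstraCS:"):
        problems.append("ASTRA_DB_APPLICATION_TOKEN should start with AstraCS:")

    return problems


if __name__ == "__main__":
    issues = check_astra_env()
    print("Environment looks ready." if not issues else "\n".join(issues))
```

An empty list means both variables are set and look plausible. Leaving the literal placeholders `API_ENDPOINT` and `TOKEN` in place is reported as two problems.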

Install the client

Install the library for the language and package manager you’re using.

To install the Python client with pip:

Verify that pip is version 23.0 or higher.
```
pip --version
```
Upgrade pip if needed.
```
python -m pip install --upgrade pip
```
Install the astrapy package. You must have Python 3.8 or higher.
```
pip install astrapy
```
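
The version requirement and the install can also be checked from Python itself. This is a minimal sketch using only the standard library; the helper names `python_is_supported` and `package_is_installed` are hypothetical, and the minimum version is the one stated in the step above.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this quickstart


def python_is_supported(version_info=None):
    """Return True if the interpreter meets the quickstart's minimum version."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def package_is_installed(name):
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {current} supported:", python_is_supported())
    print("astrapy installed:", package_is_installed("astrapy"))
```

If the second line prints `False`, the package was probably installed into a different Python environment than the one running the script.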

To install the TypeScript client:

Verify that Node is version 14 or higher.
```
node --version
```
Use npm or Yarn to install the TypeScript client.

To install the TypeScript client with npm:

npm install @datastax/astra-db-ts

To install the TypeScript client with Yarn:

Verify that Yarn is version 2.0 or higher.

yarn --version

Install the astra-db-ts package.

yarn add @datastax/astra-db-ts

Use Maven or Gradle to install the Java client.

To install the Java client with Maven:

Install Java 11+ and Maven 3.9+.

Create a pom.xml file in the root of your project.

pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.example</groupId>
  <artifactId>test-java-client</artifactId>
  <version>1.0-SNAPSHOT</version>

  <!-- The Java client -->
  <dependencies>
    <dependency>
      <groupId>com.datastax.astra</groupId>
      <artifactId>astra-db-java</artifactId>
      <version>1.0.0</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>3.0.0</version>
        <configuration>
          <executable>java</executable>
          <mainClass>com.example.Quickstart</mainClass>
        </configuration>
        <executions>
          <execution>
            <goals>
              <goal>java</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>11</source>
          <target>11</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

To install the Java client with Gradle:

Install Java 11+ and Gradle.

Create a build.gradle file in the root of your project.

build.gradle

plugins {
    id 'java'
    id 'application'
}

repositories {
    mavenCentral()
}

dependencies {
    implementation 'com.datastax.astra:astra-db-java:1.0.0'
}

application {
    // mainClass requires Gradle 6.4 or later; older releases use mainClassName.
    mainClass = 'com.example.Quickstart'
}

Initialize the client

Paste the following code into a new file on your computer.

quickstart.py

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.ids import UUID
from astrapy.exceptions import InsertManyException

# Initialize the client and get a "Database" object.
# (The "namespace" parameter is optional if you use "default_keyspace".)
client = DataAPIClient(os.environ["ASTRA_DB_APPLICATION_TOKEN"])
database = client.get_database_by_api_endpoint(
    os.environ["ASTRA_DB_API_ENDPOINT"],
    namespace="default_keyspace",  # can be omitted
)
print(f"* Database: {database.info().name}\n")

Don’t name the file astrapy.py. If you do, Python imports your file instead of the installed astrapy package.

quickstart.ts

import { DataAPIClient, VectorDoc, UUID } from '@datastax/astra-db-ts';

const { ASTRA_DB_APPLICATION_TOKEN, ASTRA_DB_API_ENDPOINT } = process.env;

// Initialize the client and get a "Db" object.
// (The keyspace parameter is optional if you use 'default_keyspace'.)
const client = new DataAPIClient(ASTRA_DB_APPLICATION_TOKEN);
const db = client.db(ASTRA_DB_API_ENDPOINT, { namespace: 'default_keyspace' });

console.log(`* Connected to DB ${db.id}`);

src/main/java/com/example/Quickstart.java

import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.Database;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.FindIterable;

import static com.datastax.astra.client.model.SimilarityMetric.COSINE;

public class Quickstart {

  public static void main(String[] args) {
    // Loading Arguments
    String astraToken = System.getenv("ASTRA_DB_APPLICATION_TOKEN");
    String astraApiEndpoint = System.getenv("ASTRA_DB_API_ENDPOINT");

    // Initialize the client. The keyspace parameter is optional if you use
    // "default_keyspace".
    DataAPIClient client = new DataAPIClient(astraToken);
    System.out.println("Connected to AstraDB");

    Database db = client.getDatabase(astraApiEndpoint, "default_keyspace");
    System.out.println("Connected to Database.");

  }
}

Run the code

Run the code you defined above.

Python

python quickstart.py

TypeScript (npm)

npx tsx quickstart.ts

TypeScript (Yarn)

yarn dlx tsx quickstart.ts

Java (Maven)

mvn clean compile
mvn exec:java -Dexec.mainClass="com.example.Quickstart"

Java (Gradle)

gradle build
gradle run

Create a collection

Create a collection in your database. Choose dimensions that match your vector data and pick an appropriate similarity metric: cosine (default), dot_product, or euclidean.
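
To get a feel for how the three metrics differ, they can be computed by hand. The sketch below uses the textbook definitions in plain Python; the function names are illustrative, and the scores the database reports may be on a different scale than these raw values.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; sensitive to vector magnitude."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; ignores magnitude."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors; 0 means identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    vector = [0.1, 0.05, 0.08, 0.3, 0.6]
    doubled = [2 * x for x in vector]  # same direction, twice the length

    # Cosine treats the two vectors as equivalent; the other metrics do not.
    print("cosine:     ", cosine_similarity(vector, doubled))
    print("dot product:", dot_product(vector, doubled))
    print("euclidean:  ", euclidean_distance(vector, doubled))
```

Cosine compares direction only, so it is a common choice when embeddings are not normalized. Dot product and Euclidean distance also depend on vector length.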

quickstart.py

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.ids import UUID
from astrapy.exceptions import InsertManyException

# Initialize the client and get a "Database" object.
# (The "namespace" parameter is optional if you use "default_keyspace".)
client = DataAPIClient(os.environ["ASTRA_DB_APPLICATION_TOKEN"])
database = client.get_database_by_api_endpoint(
    os.environ["ASTRA_DB_API_ENDPOINT"],
    namespace="default_keyspace",  # can be omitted
)
print(f"* Database: {database.info().name}\n")

# ⬇️ NEW CODE

# Create a collection. The default similarity metric is "cosine".
collection = database.create_collection(
    "vector_test",
    dimension=5,
    metric=VectorMetric.COSINE,  # or simply "cosine"
    check_exists=False,
)
print(f"* Collection: {collection.full_name}\n")

# ⬆️ NEW CODE

quickstart.ts

import { DataAPIClient, VectorDoc, UUID } from '@datastax/astra-db-ts';

const { ASTRA_DB_APPLICATION_TOKEN, ASTRA_DB_API_ENDPOINT } = process.env;

// Initialize the client and get a "Db" object.
// (The keyspace parameter is optional if you use 'default_keyspace'.)
const client = new DataAPIClient(ASTRA_DB_APPLICATION_TOKEN);
const db = client.db(ASTRA_DB_API_ENDPOINT, { namespace: 'default_keyspace' });

console.log(`* Connected to DB ${db.id}`);

// ⬇️ NEW CODE

// Schema for the collection (VectorDoc adds the $vector field)
interface Idea extends VectorDoc {
  text: string,
}

(async function () {
  // Create a typed, vector-enabled collection.
  const collection = await db.createCollection<Idea>('vector_test', {
    vector: {
      dimension: 5,
      metric: 'cosine',
    },
    checkExists: false,
  });
  console.log(`* Created collection ${collection.namespace}.${collection.collectionName}`);
})();

// ⬆️ NEW CODE

src/main/java/com/example/Quickstart.java

import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.Database;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.FindIterable;

import static com.datastax.astra.client.model.SimilarityMetric.COSINE;

public class Quickstart {

  public static void main(String[] args) {
    // Loading Arguments
    String astraToken = System.getenv("ASTRA_DB_APPLICATION_TOKEN");
    String astraApiEndpoint = System.getenv("ASTRA_DB_API_ENDPOINT");

    // Initialize the client. The keyspace parameter is optional if you use
    // "default_keyspace".
    DataAPIClient client = new DataAPIClient(astraToken);
    System.out.println("Connected to AstraDB");

    Database db = client.getDatabase(astraApiEndpoint, "default_keyspace");
    System.out.println("Connected to Database.");

    // ⬇️ NEW CODE

    // Create a collection. The default similarity metric is cosine.
    Collection<Document> collection = db
            .createCollection("vector_test", 5, COSINE);
    System.out.println("Created a collection");

    // ⬆️ NEW CODE

  }
}

Load vector embeddings

Insert a few documents with embeddings into the collection.
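
The collection in this quickstart is created with a dimension of 5, so each `$vector` is expected to hold exactly five values. A local check before inserting can catch mismatches early. This is a minimal sketch; the helper `find_dimension_mismatches` is hypothetical and not part of any client library.

```python
def find_dimension_mismatches(documents, dimension):
    """Return (index, actual_length) for every document whose "$vector"
    does not have the expected number of values.

    Documents without a "$vector" field are skipped.
    """
    mismatches = []
    for index, document in enumerate(documents):
        vector = document.get("$vector")
        if vector is not None and len(vector) != dimension:
            mismatches.append((index, len(vector)))
    return mismatches


if __name__ == "__main__":
    sample = [
        {"text": "five values", "$vector": [0.1, 0.15, 0.3, 0.12, 0.05]},
        {"text": "only three values", "$vector": [0.45, 0.09, 0.01]},
        {"text": "no vector at all"},
    ]
    for index, length in find_dimension_mismatches(sample, dimension=5):
        print(f"Document {index} has {length} values, expected 5")
```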

quickstart.py

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.ids import UUID
from astrapy.exceptions import InsertManyException

# Initialize the client and get a "Database" object.
# (The "namespace" parameter is optional if you use "default_keyspace".)
client = DataAPIClient(os.environ["ASTRA_DB_APPLICATION_TOKEN"])
database = client.get_database_by_api_endpoint(
    os.environ["ASTRA_DB_API_ENDPOINT"],
    namespace="default_keyspace",  # can be omitted
)
print(f"* Database: {database.info().name}\n")

# Create a collection. The default similarity metric is "cosine".
collection = database.create_collection(
    "vector_test",
    dimension=5,
    metric=VectorMetric.COSINE,  # or simply "cosine"
    check_exists=False,
)
print(f"* Collection: {collection.full_name}\n")

# ⬇️ NEW CODE

# Insert documents into the collection.
# (UUIDs here are version 7.)
documents = [
    {
        "_id": UUID('018e65c9-df45-7913-89f8-175f28bd7f74'),
        "text": "ChatGPT integrated sneakers that talk to you",
        "$vector": [0.1, 0.15, 0.3, 0.12, 0.05],
    },
    {
        "_id": UUID('018e65c9-e1b7-7048-a593-db452be1e4c2'),
        "text": "An AI quilt to help you sleep forever",
        "$vector": [0.45, 0.09, 0.01, 0.2, 0.11],
    },
    {
        "_id": UUID('018e65c9-e33d-749b-9386-e848739582f0'),
        "text": "A deep learning display that controls your mood",
        "$vector": [0.1, 0.05, 0.08, 0.3, 0.6],
    },
]
try:
    insertion_result = collection.insert_many(documents)
    print(f"* Inserted {len(insertion_result.inserted_ids)} items.\n")
except InsertManyException:
    print("* Documents found on DB already. Let's move on.\n")

# ⬆️ NEW CODE

quickstart.ts

import { DataAPIClient, VectorDoc, UUID } from '@datastax/astra-db-ts';

const { ASTRA_DB_APPLICATION_TOKEN, ASTRA_DB_API_ENDPOINT } = process.env;

// Initialize the client and get a "Db" object.
// (The keyspace parameter is optional if you use 'default_keyspace'.)
const client = new DataAPIClient(ASTRA_DB_APPLICATION_TOKEN);
const db = client.db(ASTRA_DB_API_ENDPOINT, { namespace: 'default_keyspace' });

console.log(`* Connected to DB ${db.id}`);

// Schema for the collection (VectorDoc adds the $vector field)
interface Idea extends VectorDoc {
  text: string,
}

(async function () {
  // Create a typed, vector-enabled collection.
  const collection = await db.createCollection<Idea>('vector_test', {
    vector: {
      dimension: 5,
      metric: 'cosine',
    },
    checkExists: false,
  });
  console.log(`* Created collection ${collection.namespace}.${collection.collectionName}`);

  // ⬇️ NEW CODE

  // Insert documents into the collection (using UUIDv7s)
  const documents = [
    {
      _id: new UUID('018e65c9-df45-7913-89f8-175f28bd7f74'),
      text: 'ChatGPT integrated sneakers that talk to you',
      $vector: [0.1, 0.15, 0.3, 0.12, 0.05],
    },
    {
      _id: new UUID('018e65c9-e1b7-7048-a593-db452be1e4c2'),
      text: 'An AI quilt to help you sleep forever',
      $vector: [0.45, 0.09, 0.01, 0.2, 0.11],
    },
    {
      _id: new UUID('018e65c9-e33d-749b-9386-e848739582f0'),
      text: 'A deep learning display that controls your mood',
      $vector: [0.1, 0.05, 0.08, 0.3, 0.6],
    },
  ];

  try {
    const inserted = await collection.insertMany(documents);
    console.log(`* Inserted ${inserted.insertedCount} items.`);
  } catch (e) {
    console.log('* Documents found on DB already. Let\'s move on!');
  }

  // ⬆️ NEW CODE
})();

src/main/java/com/example/Quickstart.java

import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.Database;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.FindIterable;

import static com.datastax.astra.client.model.SimilarityMetric.COSINE;

public class Quickstart {

  public static void main(String[] args) {
    // Loading Arguments
    String astraToken = System.getenv("ASTRA_DB_APPLICATION_TOKEN");
    String astraApiEndpoint = System.getenv("ASTRA_DB_API_ENDPOINT");

    // Initialize the client. The keyspace parameter is optional if you use
    // "default_keyspace".
    DataAPIClient client = new DataAPIClient(astraToken);
    System.out.println("Connected to AstraDB");

    Database db = client.getDatabase(astraApiEndpoint, "default_keyspace");
    System.out.println("Connected to Database.");

    // Create a collection. The default similarity metric is cosine.
    Collection<Document> collection = db
            .createCollection("vector_test", 5, COSINE);
    System.out.println("Created a collection");

    // ⬇️ NEW CODE

    // Insert documents into the collection
    collection.insertMany(
            new Document("1")
                    .append("text", "ChatGPT integrated sneakers that talk to you")
                    .vector(new float[]{0.1f, 0.15f, 0.3f, 0.12f, 0.05f}),
            new Document("2")
                    .append("text", "An AI quilt to help you sleep forever")
                    .vector(new float[]{0.45f, 0.09f, 0.01f, 0.2f, 0.11f}),
            new Document("3")
                    .append("text", "A deep learning display that controls your mood")
                    .vector(new float[]{0.1f, 0.05f, 0.08f, 0.3f, 0.6f}));
    System.out.println("Inserted documents into the collection");

    // ⬆️ NEW CODE

  }
}

Use the Data Explorer to inspect your loaded data.

Perform a similarity search

Find documents that are close to a specific vector embedding. (The code also shows the optional step of dropping the collection at the end.)

quickstart.py

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.ids import UUID
from astrapy.exceptions import InsertManyException

# Initialize the client and get a "Database" object.
# (The "namespace" parameter is optional if you use "default_keyspace".)
client = DataAPIClient(os.environ["ASTRA_DB_APPLICATION_TOKEN"])
database = client.get_database_by_api_endpoint(
    os.environ["ASTRA_DB_API_ENDPOINT"],
    namespace="default_keyspace",  # can be omitted
)
print(f"* Database: {database.info().name}\n")

# Create a collection. The default similarity metric is "cosine".
collection = database.create_collection(
    "vector_test",
    dimension=5,
    metric=VectorMetric.COSINE,  # or simply "cosine"
    check_exists=False,
)
print(f"* Collection: {collection.full_name}\n")

# Insert documents into the collection.
# (UUIDs here are version 7.)
documents = [
    {
        "_id": UUID('018e65c9-df45-7913-89f8-175f28bd7f74'),
        "text": "ChatGPT integrated sneakers that talk to you",
        "$vector": [0.1, 0.15, 0.3, 0.12, 0.05],
    },
    {
        "_id": UUID('018e65c9-e1b7-7048-a593-db452be1e4c2'),
        "text": "An AI quilt to help you sleep forever",
        "$vector": [0.45, 0.09, 0.01, 0.2, 0.11],
    },
    {
        "_id": UUID('018e65c9-e33d-749b-9386-e848739582f0'),
        "text": "A deep learning display that controls your mood",
        "$vector": [0.1, 0.05, 0.08, 0.3, 0.6],
    },
]
try:
    insertion_result = collection.insert_many(documents)
    print(f"* Inserted {len(insertion_result.inserted_ids)} items.\n")
except InsertManyException:
    print("* Documents found on DB already. Let's move on.\n")

# ⬇️ NEW CODE

# Perform a similarity search
query = [0.15, 0.1, 0.1, 0.35, 0.55]
results = collection.find(
    {},
    vector=query,
    limit=2,
    projection=["text", "$vector"],
)
print("Vector search results:")
for document in results:
    print("    ", document)

# Cleanup (if desired)
drop_result = collection.drop()
print(f"\nCleanup: {drop_result}\n")

# ⬆️ NEW CODE

quickstart.ts

import { DataAPIClient, VectorDoc, UUID } from '@datastax/astra-db-ts';

const { ASTRA_DB_APPLICATION_TOKEN, ASTRA_DB_API_ENDPOINT } = process.env;

// Initialize the client and get a "Db" object.
// (The keyspace parameter is optional if you use 'default_keyspace'.)
const client = new DataAPIClient(ASTRA_DB_APPLICATION_TOKEN);
const db = client.db(ASTRA_DB_API_ENDPOINT, { namespace: 'default_keyspace' });

console.log(`* Connected to DB ${db.id}`);

// Schema for the collection (VectorDoc adds the $vector field)
interface Idea extends VectorDoc {
  text: string,
}

(async function () {
  // Create a typed, vector-enabled collection.
  const collection = await db.createCollection<Idea>('vector_test', {
    vector: {
      dimension: 5,
      metric: 'cosine',
    },
    checkExists: false,
  });
  console.log(`* Created collection ${collection.namespace}.${collection.collectionName}`);

  // Insert documents into the collection (using UUIDv7s)
  const documents = [
    {
      _id: new UUID('018e65c9-df45-7913-89f8-175f28bd7f74'),
      text: 'ChatGPT integrated sneakers that talk to you',
      $vector: [0.1, 0.15, 0.3, 0.12, 0.05],
    },
    {
      _id: new UUID('018e65c9-e1b7-7048-a593-db452be1e4c2'),
      text: 'An AI quilt to help you sleep forever',
      $vector: [0.45, 0.09, 0.01, 0.2, 0.11],
    },
    {
      _id: new UUID('018e65c9-e33d-749b-9386-e848739582f0'),
      text: 'A deep learning display that controls your mood',
      $vector: [0.1, 0.05, 0.08, 0.3, 0.6],
    },
  ];

  try {
    const inserted = await collection.insertMany(documents);
    console.log(`* Inserted ${inserted.insertedCount} items.`);
  } catch (e) {
    console.log('* Documents found on DB already. Let\'s move on!');
  }

  // ⬇️ NEW CODE

  // Perform a similarity search
  const cursor = await collection.find({}, {
    vector: [0.15, 0.1, 0.1, 0.35, 0.55],
    limit: 2,
    includeSimilarity: true,
  });

  console.log('* Search results:')
  for await (const doc of cursor) {
    console.log('  ', doc.text, doc.$similarity);
  }

  // Cleanup (if desired)
  await db.dropCollection('vector_test');
  console.log('* Collection dropped.');

  // Close the client
  await client.close();

  // ⬆️ NEW CODE
})();

src/main/java/com/example/Quickstart.java

import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.Database;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.FindIterable;

import static com.datastax.astra.client.model.SimilarityMetric.COSINE;

public class Quickstart {

  public static void main(String[] args) {
    // Loading Arguments
    String astraToken = System.getenv("ASTRA_DB_APPLICATION_TOKEN");
    String astraApiEndpoint = System.getenv("ASTRA_DB_API_ENDPOINT");

    // Initialize the client. The keyspace parameter is optional if you use
    // "default_keyspace".
    DataAPIClient client = new DataAPIClient(astraToken);
    System.out.println("Connected to AstraDB");

    Database db = client.getDatabase(astraApiEndpoint, "default_keyspace");
    System.out.println("Connected to Database.");

    // Create a collection. The default similarity metric is cosine.
    Collection<Document> collection = db
            .createCollection("vector_test", 5, COSINE);
    System.out.println("Created a collection");

    // Insert documents into the collection
    collection.insertMany(
            new Document("1")
                    .append("text", "ChatGPT integrated sneakers that talk to you")
                    .vector(new float[]{0.1f, 0.15f, 0.3f, 0.12f, 0.05f}),
            new Document("2")
                    .append("text", "An AI quilt to help you sleep forever")
                    .vector(new float[]{0.45f, 0.09f, 0.01f, 0.2f, 0.11f}),
            new Document("3")
                    .append("text", "A deep learning display that controls your mood")
                    .vector(new float[]{0.1f, 0.05f, 0.08f, 0.3f, 0.6f}));
    System.out.println("Inserted documents into the collection");

    // ⬇️ NEW CODE

    // Perform a similarity search
    FindIterable<Document> resultsSet = collection.find(
            new float[]{0.15f, 0.1f, 0.1f, 0.35f, 0.55f},
            10
    );
    resultsSet.forEach(System.out::println);

    // Delete the collection
    collection.drop();
    System.out.println("Deleted the collection");

    // ⬆️ NEW CODE

  }
}

The search returns the documents most similar to the query vector, sorted from most to least similar. The Python and TypeScript examples limit the results to two documents; the Java example requests up to 10, so it returns all three. The calculation uses cosine similarity by default.
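
To see where that ordering comes from, the ranking can be reproduced locally with the sample vectors from this quickstart. This is a brute-force sketch for illustration only: the helper `rank_by_similarity` is hypothetical, and the database does not necessarily compute or scale its scores this way.

```python
import math

# Sample data copied from the quickstart code above.
QUERY = [0.15, 0.1, 0.1, 0.35, 0.55]
DOCUMENTS = [
    {"text": "ChatGPT integrated sneakers that talk to you",
     "$vector": [0.1, 0.15, 0.3, 0.12, 0.05]},
    {"text": "An AI quilt to help you sleep forever",
     "$vector": [0.45, 0.09, 0.01, 0.2, 0.11]},
    {"text": "A deep learning display that controls your mood",
     "$vector": [0.1, 0.05, 0.08, 0.3, 0.6]},
]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents, limit):
    """Return up to `limit` (score, text) pairs, most similar first."""
    scored = [
        (cosine_similarity(query, document["$vector"]), document["text"])
        for document in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    for score, text in rank_by_similarity(QUERY, DOCUMENTS, limit=2):
        print(f"{score:.4f}  {text}")
```

With these vectors, the mood display ranks first and the quilt second, because the query vector points in nearly the same direction as the mood display's vector.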

Next steps

Learn more about the Data API, clients, and building your own apps by exploring these resources:

Start with the DataAPIClient

Instantiate a client object. Then learn how to interact with databases, collections, documents, and more.

Build a chatbot with LangChain

Use content scraped from a website to answer questions.

Load your data

Learn how to load your own data. Use the Astra Portal or any of our clients.
