Integrate NVIDIA as an embedding provider

Integrate the Astra-hosted NVIDIA embedding provider to enable Astra DB vectorize.

Prerequisites

To use NVIDIA as an embedding provider, you need the following:

  • An active Astra account with permission to create collections.

  • A Serverless (Vector) database in the AWS us-east-2 region.

    Only databases in the AWS us-east-2 region can use the built-in NVIDIA embedding provider integration.

    If this is your first time using Astra DB, follow the Quickstart to create a database and connect to it with an API client.

Add the NVIDIA integration to a new collection

Before you can use the NVIDIA integration to generate embeddings, you must add the integration to a new collection.

You can’t change a collection’s embedding provider or embedding generation method after you create it. To use a different embedding provider, you must create a new collection with a different embedding provider integration.

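You can create the collection with the Astra Portal, the Python, TypeScript, or Java client, or curl.

Use the Astra Portal to create a collection that uses the NVIDIA integration: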

  1. In the Astra Portal, go to Databases, and then select your Serverless (Vector) database.

  2. Click Data Explorer.

  3. In the Namespace field, select the namespace where you want to create the collection, or use the default namespace, which is named default_keyspace.

  4. Click Create Collection.

  5. In the Create collection dialog, enter a name for the collection. Collection names can have no more than 50 characters.

  6. Turn on Vector-enabled collection.

  7. For Embedding generation method, select the NVIDIA embedding provider integration.

  8. Complete the following fields:

    • Embedding model: The model that you want to use to generate embeddings. If only one model is available, it is selected by default.

    • Dimensions: The number of dimensions that you want the generated vectors to have. Typically, the number of dimensions is automatically determined by the model you select.

    • Similarity metric: The method you want to use to calculate vector similarity scores. The available metrics are Cosine, Dot Product, and Euclidean.

  9. Click Create collection.
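To make the three similarity metrics concrete, the following minimal Python sketch computes each one for a pair of small vectors using the textbook definitions. The function names and sample vectors are illustrative assumptions, and the similarity scores that Astra DB returns may be scaled or normalized differently from these raw values.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product of the vectors divided by the product of their lengths."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between the vectors; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical 3-dimensional vectors; real embeddings have many more dimensions.
v1 = [1.0, 2.0, 2.0]
v2 = [2.0, 4.0, 4.0]

print(dot_product(v1, v2))
print(cosine_similarity(v1, v2))
print(euclidean_distance(v1, v2))
```

Because v2 is a scaled copy of v1, the two vectors point in the same direction: cosine similarity treats them as identical, while Euclidean distance still reports a gap. This is the practical difference to keep in mind when choosing a metric.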

Use the Python client to create a collection that uses the NVIDIA integration.

Initialize the client

If you haven’t done so already, initialize the client before creating a collection:

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import CollectionVectorServiceOptions

# Initialize the client and get a "Database" object
client = DataAPIClient(os.environ["ASTRA_DB_APPLICATION_TOKEN"])
database = client.get_database(os.environ["ASTRA_DB_API_ENDPOINT"])
print(f"* Database: {database.info().name}\n")

Create a collection integrated with NVIDIA:

collection = database.create_collection(
    "COLLECTION_NAME",
    metric=VectorMetric.COSINE,
    service=CollectionVectorServiceOptions(
        provider="nvidia",
        model_name="NV-Embed-QA",
    ),
)
print(f"* Collection: {collection.full_name}\n")

Use the TypeScript client to create a collection that uses the NVIDIA integration.

Initialize the client

If you haven’t done so already, initialize the client before creating a collection:

import { DataAPIClient, VectorDoc, UUID } from '@datastax/astra-db-ts';

const { ASTRA_DB_APPLICATION_TOKEN, ASTRA_DB_API_ENDPOINT } = process.env;

// Initialize the client and get a 'Db' object
const client = new DataAPIClient(ASTRA_DB_APPLICATION_TOKEN);
const db = client.db(ASTRA_DB_API_ENDPOINT);

console.log(`* Connected to DB ${db.id}`);

Create a collection integrated with NVIDIA:

(async function () {
  // Create a vector-enabled collection that uses the NVIDIA integration
  const collection = await db.createCollection('COLLECTION_NAME', {
    vector: {
      service: {
        provider: 'nvidia',
        modelName: 'NV-Embed-QA',
      },
    },
  });
  console.log(`* Created collection ${collection.namespace}.${collection.collectionName}`);
})();

Use the Java client to create a collection that uses the NVIDIA integration.

Initialize the client

If you haven’t done so already, initialize the client before creating a collection:

import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.Database;
import com.datastax.astra.client.model.CollectionOptions;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.FindIterable;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.SimilarityMetric;

import static com.datastax.astra.client.model.SimilarityMetric.COSINE;

public class Quickstart {

  public static void main(String[] args) {
    // Loading Arguments
    String astraToken = System.getenv("ASTRA_DB_APPLICATION_TOKEN");
    String astraApiEndpoint = System.getenv("ASTRA_DB_API_ENDPOINT");

    // Initialize the client
    DataAPIClient client = new DataAPIClient(astraToken);
    System.out.println("Connected to AstraDB");

    Database db = client.getDatabase(astraApiEndpoint);
    System.out.println("Connected to Database.");

Create a collection integrated with NVIDIA:

    // Continues inside main() from the initialization snippet
    CollectionOptions.CollectionOptionsBuilder builder = CollectionOptions
        .builder()
        .vectorSimilarity(SimilarityMetric.COSINE)
        .vectorize("nvidia", "NV-Embed-QA");
    Collection<Document> collection = db
        .createCollection("COLLECTION_NAME", builder.build());
  } // end of main
} // end of Quickstart

Use the Data API to create a collection that uses the NVIDIA integration:

curl -sS --location -X POST "$ASTRA_DB_API_ENDPOINT/api/json/v1/default_keyspace" \
--header "Token: $ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "createCollection": {
    "name": "COLLECTION_NAME",
    "options": {
      "vector": {
        "metric": "cosine",
        "service": {
          "provider": "nvidia",
          "modelName": "NV-Embed-QA"
        }
      }
    }
  }
}' | jq
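As a sketch, the same request body can be assembled in Python with only the standard library, which helps when you want to inspect or template the payload before sending it. The helper name `build_create_collection_payload` is hypothetical. The 50-character check mirrors the collection name limit noted in the Astra Portal steps, and the field names are copied from the curl example above.

```python
import json

MAX_COLLECTION_NAME_LENGTH = 50  # Limit stated in the Astra Portal steps


def build_create_collection_payload(name, metric="cosine",
                                    provider="nvidia", model_name="NV-Embed-QA"):
    """Return the createCollection command as a dict (hypothetical helper)."""
    if len(name) > MAX_COLLECTION_NAME_LENGTH:
        raise ValueError("Collection names can have no more than 50 characters")
    return {
        "createCollection": {
            "name": name,
            "options": {
                "vector": {
                    "metric": metric,
                    "service": {"provider": provider, "modelName": model_name},
                }
            },
        }
    }


payload = build_create_collection_payload("COLLECTION_NAME")
# Serialize to the JSON string that would be passed as the request body
print(json.dumps(payload, indent=2))
```

Validating the name locally is a design choice that catches one avoidable failure before the request is sent; it does not replace the checks that the service itself performs.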

If you get a Collection Limit Reached or TOO_MANY_INDEXES message, you must delete a collection before you can create a new one.

Serverless (Vector) databases created after June 24, 2024 can have up to 10 collections. Databases created before this date can have up to 5 collections. The collection limit is based on Storage Attached Indexing (SAI).
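If you create collections from a script, you might want to detect this condition programmatically. The following is a minimal sketch, not a definitive implementation: it assumes that a parsed response reports failures as a list under an `errors` key, with optional `errorCode` and `message` fields. The function name and the sample response bodies are hypothetical, so adjust them to the response shape you actually observe.

```python
# The two messages named in this guide
LIMIT_MARKERS = ("Collection Limit Reached", "TOO_MANY_INDEXES")


def hit_collection_limit(response):
    """Return True if a parsed response appears to report the collection limit.

    Assumes errors arrive as a list under an "errors" key, each with optional
    "errorCode" and "message" fields.
    """
    for error in response.get("errors", []):
        text = f"{error.get('errorCode', '')} {error.get('message', '')}"
        if any(marker in text for marker in LIMIT_MARKERS):
            return True
    return False


# Hypothetical response bodies, for illustration only
failed = {"errors": [{"errorCode": "TOO_MANY_INDEXES", "message": "Too many indexes"}]}
succeeded = {"status": {"ok": 1}}

print(hit_collection_limit(failed))
print(hit_collection_limit(succeeded))
```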

After you create a collection, load data into the collection.

Load and search data with vectorize

  1. Load vector data into your vectorize-integrated collection.

    When you load structured JSON or CSV data, the Vector Field specifies the field to use to generate embeddings with $vectorize.

    [Image: The Load Data dialog with the Vector Field dropdown expanded.]

  2. After loading data, you can perform a similarity search using text, rather than a vector.
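The two steps above can also be sketched as Data API request bodies. This minimal Python example builds an `insertOne` command that stores text under `$vectorize` and a `find` command that sorts by a `$vectorize` query string, so that the embedding is generated for you in both cases. The helper names, sample text, and `limit` value are assumptions for illustration. The bodies would be posted to the collection's endpoint with the same headers as the curl example earlier in this guide.

```python
import json


def build_insert_payload(text):
    """insertOne command: the service embeds the text stored in $vectorize."""
    return {"insertOne": {"document": {"$vectorize": text}}}


def build_search_payload(query, limit=5):
    """find command: sort by similarity to the query text instead of a vector."""
    return {
        "find": {
            "sort": {"$vectorize": query},
            "options": {"limit": limit},
        }
    }


# Hypothetical sample text and query
insert_body = build_insert_payload("Astra DB can generate embeddings on insert.")
search_body = build_search_payload("How are embeddings generated?", limit=3)

print(json.dumps(insert_body))
print(json.dumps(search_body))
```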

© 2024 DataStax | Privacy policy | Terms of use
