Integrate NVIDIA as an embedding provider
Integrate the Astra-hosted NVIDIA embedding provider to enable Astra DB vectorize.
Prerequisites
To use NVIDIA as an embedding provider, you need the following:
- An active Astra account with permission to create collections.
- A Serverless (Vector) database in the AWS us-east-2 region. Only databases in the AWS us-east-2 region can use the built-in NVIDIA embedding provider integration.

If this is your first time using Astra DB, follow the Quickstart to create a database and connect to it with an API client.
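Because the integration is limited to one region, it can help to check the region before you try to create a collection. The following is a minimal sketch, not part of any Astra client. It assumes that the API endpoint follows the pattern `https://<database-id>-<region>.apps.astra.datastax.com`, where the database ID is a 36-character UUID. The function names `endpoint_region` and `supports_nvidia_integration` are hypothetical helpers introduced only for this illustration.

```python
from typing import Optional
from urllib.parse import urlparse

# Region named in the prerequisites above.
REQUIRED_REGION = "us-east-2"

# Assumption: endpoints look like
# https://<database-id>-<region>.apps.astra.datastax.com
ENDPOINT_SUFFIX = ".apps.astra.datastax.com"
UUID_LENGTH = 36


def endpoint_region(api_endpoint: str) -> Optional[str]:
    """Return the region embedded in the endpoint host, or None if it cannot be parsed."""
    host = urlparse(api_endpoint).hostname or ""
    if not host.endswith(ENDPOINT_SUFFIX):
        return None
    label = host[: -len(ENDPOINT_SUFFIX)]
    # Expect "<uuid>-<region>": a hyphen must follow the 36-character database ID.
    if len(label) <= UUID_LENGTH + 1 or label[UUID_LENGTH] != "-":
        return None
    return label[UUID_LENGTH + 1:]


def supports_nvidia_integration(api_endpoint: str) -> bool:
    """True only if the endpoint appears to point at a us-east-2 database."""
    return endpoint_region(api_endpoint) == REQUIRED_REGION
```

For example, you could call `supports_nvidia_integration(os.environ["ASTRA_DB_API_ENDPOINT"])` before creating the collection and stop early if it returns `False`. The check reads only the endpoint string, so it cannot confirm the cloud provider or that the database exists.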
Add the NVIDIA integration to a new collection
Before you can use the NVIDIA integration to generate embeddings, you must add the integration to a new collection.
You can’t change a collection’s embedding provider or embedding generation method after you create it. To use a different embedding provider, you must create a new collection with a different embedding provider integration.
You can create the collection in the Astra Portal, with the Python, TypeScript, or Java client, or with curl.
- In the Astra Portal, go to Databases, and then select your Serverless (Vector) database.
- Click Data Explorer.
- In the Namespace field, select the namespace where you want to create the collection, or use the default namespace, which is named default_keyspace.
- Click Create Collection.
- In the Create collection dialog, enter a name for the collection. Collection names can contain up to 50 characters and can include letters, numbers, and underscores.
- Turn on Vector-enabled collection.
- For Embedding generation method, select the NVIDIA embedding provider integration.
- Complete the following fields:
  - Embedding model: The model that you want to use to generate embeddings. If only one model is available, it is selected by default.
  - Dimensions: The number of dimensions that you want the generated vectors to have. Typically, the number of dimensions is automatically determined by the model you select.
  - Similarity metric: The method you want to use to calculate vector similarity scores. The available metrics are Cosine, Dot Product, and Euclidean.
- Click Create collection.
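The naming rule in the steps above can be checked before you submit a name. The following is a minimal sketch of that rule only. It assumes that "letters" means ASCII letters and that an empty name is not allowed, neither of which is stated above. The helper name `is_valid_collection_name` is hypothetical.

```python
import re

# Rule from the steps above: at most 50 characters, limited to
# letters, numbers, and underscores.
# Assumptions: ASCII letters only, and at least one character.
_COLLECTION_NAME = re.compile(r"[A-Za-z0-9_]{1,50}")


def is_valid_collection_name(name: str) -> bool:
    """Return True if the whole name satisfies the stated naming rule."""
    return _COLLECTION_NAME.fullmatch(name) is not None
```

A name such as `nvidia_docs_v1` passes this check, while `nvidia-docs` fails because of the hyphen. The service may apply additional rules that this sketch does not cover.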
Use the Python client to create a collection that uses the NVIDIA integration.
Initialize the client
If you haven’t done so already, initialize the client before creating a collection:
import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import CollectionVectorServiceOptions
# Initialize the client and get a "Database" object
client = DataAPIClient(os.environ["ASTRA_DB_APPLICATION_TOKEN"])
database = client.get_database(os.environ["ASTRA_DB_API_ENDPOINT"])
print(f"* Database: {database.info().name}\n")
Create a collection integrated with NVIDIA:
collection = database.create_collection(
    "COLLECTION_NAME",
    metric=VectorMetric.COSINE,
    service=CollectionVectorServiceOptions(
        provider="nvidia",
        model_name="NV-Embed-QA",
    ),
)
print(f"* Collection: {collection.full_name}\n")
Use the TypeScript client to create a collection that uses the NVIDIA integration.
Initialize the client
If you haven’t done so already, initialize the client before creating a collection:
import { DataAPIClient, VectorDoc, UUID } from '@datastax/astra-db-ts';
const { ASTRA_DB_APPLICATION_TOKEN, ASTRA_DB_API_ENDPOINT } = process.env;
// Initialize the client and get a 'Db' object
const client = new DataAPIClient(ASTRA_DB_APPLICATION_TOKEN);
const db = client.db(ASTRA_DB_API_ENDPOINT);
console.log(`* Connected to DB ${db.id}`);
Create a collection integrated with NVIDIA:
(async function () {
  const collection = await db.createCollection('COLLECTION_NAME', {
    vector: {
      service: {
        provider: 'nvidia',
        modelName: 'NV-Embed-QA',
      },
    },
  });
  console.log(`* Created collection ${collection.keyspace}.${collection.collectionName}`);
})();
Use the Java client to create a collection that uses the NVIDIA integration.
Initialize the client
If you haven’t done so already, initialize the client before creating a collection:
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.Database;
import com.datastax.astra.client.model.CollectionOptions;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.FindIterable;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.SimilarityMetric;
import static com.datastax.astra.client.model.SimilarityMetric.COSINE;
public class Quickstart {
    public static void main(String[] args) {
        // Loading Arguments
        String astraToken = System.getenv("ASTRA_DB_APPLICATION_TOKEN");
        String astraApiEndpoint = System.getenv("ASTRA_DB_API_ENDPOINT");
        // Initialize the client
        DataAPIClient client = new DataAPIClient(astraToken);
        System.out.println("Connected to AstraDB");
        Database db = client.getDatabase(astraApiEndpoint);
        System.out.println("Connected to Database.");
Create a collection integrated with NVIDIA:
        // Build the collection options: cosine similarity and the NVIDIA provider
        CollectionOptions.CollectionOptionsBuilder builder = CollectionOptions
            .builder()
            .vectorSimilarity(SimilarityMetric.COSINE)
            .vectorize("nvidia", "NV-Embed-QA");
        Collection<Document> collection = db
            .createCollection("COLLECTION_NAME", builder.build());
    }
}
Use the Data API to create a collection that uses the NVIDIA integration:
curl -sS --location -X POST "$ASTRA_DB_API_ENDPOINT/api/json/v1/default_keyspace" \
  --header "Token: $ASTRA_DB_APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "createCollection": {
      "name": "COLLECTION_NAME",
      "options": {
        "vector": {
          "metric": "cosine",
          "service": {
            "provider": "nvidia",
            "modelName": "NV-Embed-QA"
          }
        }
      }
    }
  }' | jq
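If you want to send the same request without a client library, the curl command above can be sketched with the Python standard library. This is an illustration under stated assumptions, not an official client: the URL, headers, and JSON body mirror the curl command, while the function names `build_create_collection_payload` and `create_collection` are hypothetical.

```python
import json
import urllib.request


def build_create_collection_payload(name, model_name="NV-Embed-QA", metric="cosine"):
    """Build the same JSON body that the curl command sends."""
    return {
        "createCollection": {
            "name": name,
            "options": {
                "vector": {
                    "metric": metric,
                    "service": {"provider": "nvidia", "modelName": model_name},
                }
            },
        }
    }


def create_collection(api_endpoint, token, payload, keyspace="default_keyspace"):
    """POST the payload to the keyspace URL and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{api_endpoint}/api/json/v1/{keyspace}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Token": token, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To run it, pass the values of `ASTRA_DB_API_ENDPOINT` and `ASTRA_DB_APPLICATION_TOKEN` to `create_collection` together with `build_create_collection_payload("COLLECTION_NAME")`. Keeping the payload builder separate from the network call makes the request body easy to inspect before anything is sent.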
Serverless (Vector) databases created after June 24, 2024 can have up to 10 collections. Databases created before this date can have up to 5 collections. The collection limit is based on Storage Attached Indexing (SAI).
After you create a collection, load data into the collection.
Load and search data with vectorize
- Load vector data into your vectorize-integrated collection. When you load structured JSON or CSV data, the Vector Field specifies the field to use to generate embeddings with $vectorize.
- After loading data, you can perform a similarity search using text, rather than a vector.
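The two steps above can be sketched as Data API request bodies. This is a minimal illustration that assumes the Data API `insertOne` and `find` commands, with the text to embed placed in the `$vectorize` field and the query text passed as a `$vectorize` sort. It also assumes that both bodies are sent to the collection URL, `$ASTRA_DB_API_ENDPOINT/api/json/v1/default_keyspace/COLLECTION_NAME`, with the same headers as the createCollection request. The function names are hypothetical.

```python
def build_insert_one_payload(text, **fields):
    """Body for inserting one document whose embedding is generated from `text`."""
    document = dict(fields)
    # With a vectorize-integrated collection, the service embeds this text.
    document["$vectorize"] = text
    return {"insertOne": {"document": document}}


def build_similarity_search_payload(query_text, limit=5):
    """Body for a similarity search that uses text rather than a vector."""
    return {
        "find": {
            "sort": {"$vectorize": query_text},
            "options": {"limit": limit},
        }
    }
```

For example, `build_insert_one_payload("A note about embeddings", title="Note 1")` produces a document with a `title` field and a `$vectorize` field, and `build_similarity_search_payload("embeddings", limit=3)` asks for the three most similar documents.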