Integrate NVIDIA as an embedding provider
Integrate the Astra-hosted NVIDIA embedding provider to enable Astra DB vectorize.
Prerequisites
To use NVIDIA as an embedding provider, you need the following:
-
An active Astra account with permission to create collections.
-
A Serverless (Vector) database in AWS us-east-2 or GCP us-east1.
This integration is available only for databases in AWS us-east-2 or GCP us-east1.
If this is your first time using Astra DB, follow the Quickstart to create a database and connect to it with an API client.
Add the NVIDIA integration to a new collection
Before you can use the NVIDIA integration to generate embeddings, you must add the integration to a new collection.
You can’t change a collection’s embedding provider or embedding generation method after you create it. To use a different embedding provider, you must create a new collection with a different embedding provider integration.
-
Astra Portal
-
Python
-
TypeScript
-
Java
-
curl
-
In the Astra Portal, go to Databases, and then select your Serverless (Vector) database.
-
Click Data Explorer.
-
In the Keyspace field, select the keyspace where you want to create the collection or use default_keyspace.
-
Click Create Collection.
-
In the Create collection dialog, enter a name for the collection. Collection names can contain no more than 50 characters, including letters, numbers, and underscores.
-
Turn on Vector-enabled collection.
-
For Embedding generation method, select the NVIDIA embedding provider integration.
-
Complete the following fields:
-
Embedding model: The model that you want to use to generate embeddings. If only one model is available, it is selected by default.
-
Dimensions: The number of dimensions that you want the generated vectors to have. Typically, the number of dimensions is automatically determined by the model you select.
-
Similarity metric: The method you want to use to calculate vector similarity scores. The available metrics are Cosine, Dot Product, and Euclidean.
-
Click Create collection.
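The naming rule in the steps above (no more than 50 characters, using letters, numbers, and underscores) can be checked before you create a collection. The following is a minimal sketch, not part of any Astra client: the helper name is hypothetical, and it assumes that "letters" means ASCII letters and that an empty name is invalid.

```python
import re

# Assumption: valid names are 1-50 characters drawn from ASCII letters,
# digits, and underscores. The lower bound of 1 is not stated in this guide.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,50}")


def is_valid_collection_name(name: str) -> bool:
    """Return True if the whole name matches the documented naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_collection_name("my_collection_1"))
print(is_valid_collection_name("bad-name"))
```

A check like this only catches naming problems early. The database remains the authority on which names it accepts.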
Use the Python client to create a collection that uses the NVIDIA integration.
Initialize the client
If you haven’t done so already, initialize the client before creating a collection:
import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import CollectionVectorServiceOptions
# Initialize the client and get a "Database" object
client = DataAPIClient(os.environ["ASTRA_DB_APPLICATION_TOKEN"])
database = client.get_database(os.environ["ASTRA_DB_API_ENDPOINT"])
print(f"* Database: {database.info().name}\n")
Create a collection integrated with NVIDIA:
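# Replace COLLECTION_NAME with a name for the collection, and replace
# SIMILARITY_METRIC with COSINE, DOT_PRODUCT, or EUCLIDEAN.
collection = database.create_collection(
    "COLLECTION_NAME",
    metric=VectorMetric.SIMILARITY_METRIC,
    service=CollectionVectorServiceOptions(
        provider="nvidia",
        model_name="NV-Embed-QA",
    ),
)
print(f"* Collection: {collection.full_name}\n")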
Use the TypeScript client to create a collection that uses the NVIDIA integration.
Initialize the client
If you haven’t done so already, initialize the client before creating a collection:
import { DataAPIClient, VectorDoc, UUID } from '@datastax/astra-db-ts';
const { ASTRA_DB_APPLICATION_TOKEN, ASTRA_DB_API_ENDPOINT } = process.env;
// Initialize the client and get a 'Db' object
const client = new DataAPIClient(ASTRA_DB_APPLICATION_TOKEN);
const db = client.db(ASTRA_DB_API_ENDPOINT);
console.log(`* Connected to DB ${db.id}`);
Create a collection integrated with NVIDIA:
(async function () {
  const collection = await db.createCollection('COLLECTION_NAME', {
    vector: {
      service: {
        provider: 'nvidia',
        modelName: 'NV-Embed-QA',
      },
    },
  });
  console.log(`* Created collection ${collection.keyspace}.${collection.collectionName}`);
})();
Use the Java client to create a collection that uses the NVIDIA integration.
Initialize the client
If you haven’t done so already, initialize the client before creating a collection:
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.Database;
import com.datastax.astra.client.model.CollectionOptions;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.FindIterable;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.SimilarityMetric;
// Replace SIMILARITY_METRIC with your desired similarity metric:
// COSINE, DOT_PRODUCT, EUCLIDEAN
import static com.datastax.astra.client.model.SimilarityMetric.SIMILARITY_METRIC;
public class Quickstart {
    public static void main(String[] args) {
        // Loading Arguments
        String astraToken = System.getenv("ASTRA_DB_APPLICATION_TOKEN");
        String astraApiEndpoint = System.getenv("ASTRA_DB_API_ENDPOINT");
        // Initialize the client
        DataAPIClient client = new DataAPIClient(astraToken);
        System.out.println("Connected to AstraDB");
        Database db = client.getDatabase(astraApiEndpoint);
        System.out.println("Connected to Database.");
Create a collection integrated with NVIDIA:
        CollectionOptions.CollectionOptionsBuilder builder = CollectionOptions
            .builder()
            .vectorSimilarity(SimilarityMetric.SIMILARITY_METRIC)
            .vectorize("nvidia", "NV-Embed-QA");
        Collection<Document> collection = db
            .createCollection("COLLECTION_NAME", builder.build());
    }
}
Use the Data API to create a collection that uses the NVIDIA integration:
curl -sS -L -X POST "$ASTRA_DB_API_ENDPOINT/api/json/v1/default_keyspace" \
  --header "Token: $ASTRA_DB_APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createCollection": {
    "name": "COLLECTION_NAME",
    "options": {
      "vector": {
        "metric": "cosine",
        "service": {
          "provider": "nvidia",
          "modelName": "NV-Embed-QA"
        }
      }
    }
  }
}' | jq
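If you assemble the request body in code instead of writing it inline, the same JSON can be built from a small helper. This is a sketch only: the function name and its defaults are hypothetical, and it reproduces the createCollection body from the curl example without sending a request.

```python
import json


def build_create_collection_payload(
    name: str,
    metric: str = "cosine",
    provider: str = "nvidia",
    model_name: str = "NV-Embed-QA",
) -> dict:
    """Build the createCollection body shown in the curl example (hypothetical helper)."""
    return {
        "createCollection": {
            "name": name,
            "options": {
                "vector": {
                    "metric": metric,
                    "service": {
                        "provider": provider,
                        "modelName": model_name,
                    },
                }
            },
        }
    }


payload = build_create_collection_payload("COLLECTION_NAME")
# Serialize the body; pass this string as the POST data to the Data API endpoint.
print(json.dumps(payload, indent=2))
```

Building the body as a dictionary keeps the provider and model name in one place, which can help if you create several collections with the same integration.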
Serverless (Vector) databases created after June 24, 2024 can have approximately 10 collections. Databases created before this date can have approximately 5 collections. The collection limit is based on the number of indexes.
After you create a collection, load data into the collection.
Load and search data with vectorize
-
Load vector data into your vectorize-integrated collection.
When you load structured JSON or CSV data, the Vector Field specifies the field to use to generate embeddings with $vectorize.
-
After loading data, you can perform a similarity search using text, rather than a vector.
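As a rough illustration of the two steps above, the following sketch builds Data API request bodies that use the $vectorize reserved field: one to insert a document whose text is embedded on write, and one to search by text. The helper names are hypothetical, and the insertOne and find command shapes are assumptions based on the Data API collections references, so verify them against those references before relying on them.

```python
import json


def insert_one_payload(text: str, **fields) -> dict:
    """Body for inserting one document; the embedding is generated from `text`."""
    document = dict(fields)
    document["$vectorize"] = text
    return {"insertOne": {"document": document}}


def find_similar_payload(query_text: str, limit: int = 5) -> dict:
    """Body for a similarity search that sorts by text instead of a vector."""
    return {
        "find": {
            "sort": {"$vectorize": query_text},
            "options": {"limit": limit},
        }
    }


print(json.dumps(insert_one_payload("A short product description", title="Example"), indent=2))
print(json.dumps(find_similar_payload("products similar to this one", limit=3), indent=2))
```

Each body would be sent as a POST to the collection's Data API URL with the same headers as the curl example earlier in this guide.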
Troubleshoot vectorize
When working with vectorize, including the $vectorize reserved field in the Data API, errors can occur from two sources:
- Astra DB
-
There is an issue within Astra DB, including the Astra DB platform, the Data API server, Data API clients, or something else.
Some of the most common Astra DB vectorize errors are related to scoped databases. In your vectorize integration settings, make sure your database is in the scope of the credential that you want to use. Scoped database errors don’t apply to the NVIDIA Astra-hosted embedding provider integration.
When using the Data API with collections, make sure you don’t use $vector and $vectorize in the same query. For more information, see the Data API collections references, such as Vector and vectorize, Insert many documents, and Sort clauses for documents.
When using the Data API with tables, you can only run a vector search on one vector column at a time. To generate an embedding from a string, the target vector column must have a defined embedding provider integration. For more information, see the Data API tables references, such as Vector type and Sort clauses for rows.
- The embedding provider
-
The embedding provider encountered an issue while processing the embedding generation request. Astra DB passes these errors to you through the Astra Portal or Data API with a qualifying statement such as The embedding provider returned a HTTP client error.
Possible embedding provider errors include rate limiting, billing or account funding issues, and chunk or token size limits. For more information about these errors, see the embedding provider’s documentation, including the documentation for your chosen model.
Carefully read all error messages to determine the source and possible cause for the issue.
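One of the collection errors described above, using $vector and $vectorize in the same query, can be caught on the client side before a request is sent. The following is a minimal sketch with a hypothetical helper name. It inspects a single document or sort clause represented as a dictionary and does not replace the validation that the Data API performs.

```python
VECTOR_KEYS = ("$vector", "$vectorize")


def check_vector_fields(clause: dict) -> None:
    """Raise ValueError if a document or sort clause sets both reserved fields."""
    if all(key in clause for key in VECTOR_KEYS):
        raise ValueError("Use either $vector or $vectorize in a query, not both.")


# A clause with only $vectorize passes the check.
check_vector_fields({"$vectorize": "text to embed"})

# A clause with both reserved fields is rejected.
try:
    check_vector_fields({"$vector": [0.1, 0.2], "$vectorize": "text to embed"})
except ValueError as error:
    print(error)
```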
NVIDIA token limit
The model for the NVIDIA integration has a token limit of 512.
When loading or querying data in collections that use this integration, your $vectorize strings must not exceed this limit.
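One way to stay under the limit is to split long text into smaller pieces before you load it. The sketch below is a crude heuristic, not the model's tokenizer: it counts whitespace-separated words as a stand-in for tokens, and the default budget of 256 words is an assumption chosen to leave headroom, because a tokenizer can produce more than one token per word. The helper name is hypothetical.

```python
def split_for_vectorize(text: str, max_words: int = 256) -> list[str]:
    """Split text into chunks of at most `max_words` whitespace-separated words.

    Word count is only a rough proxy for the model's token count, so the
    default budget is deliberately lower than the documented 512-token limit.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


chunks = split_for_vectorize("word " * 600)
print(len(chunks), [len(chunk.split()) for chunk in chunks])
```

Each chunk can then be loaded as its own document with its own $vectorize string. If the embedding provider still reports a token limit error, lower the word budget.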