Integrate NVIDIA as an embedding provider
For a guided tutorial using this integration, see the quickstart for collections or the quickstart for tables.
Astra vectorize integrations can automatically generate embeddings for data in collections and tables in Serverless (vector) databases. For more information about how vectorize works, see Manage embedding provider integrations for vectorize.
This guide explains how to use the Astra-hosted NVIDIA integration.
Create a qualifying database
This integration is available only for Serverless (vector) databases deployed to AWS us-east-2 or Google Cloud us-east1.
Because this integration is managed by Astra, it is automatically available to qualifying databases.
Specifically, you need a Serverless (vector) database deployed to AWS us-east-2 or Google Cloud us-east1.
If you don’t already have a Serverless (vector) database deployed to a supported region, you must create a database to access this integration.
Add the integration to a collection
To use the NVIDIA integration to generate embeddings for data in a collection, you must select the integration when you create the collection.
You cannot add a vectorize integration to an existing collection, and you cannot change a collection's embedding provider or embedding generation method after you create it. To use a different embedding provider, you must create a new collection with a different embedding provider integration. Serverless (vector) databases created after June 24, 2024 can have approximately 10 collections. Databases created before this date can have approximately 5 collections. The collection limit is based on the number of indexes.
You can create a collection in the Astra Portal or with the Data API.
Use the Astra Portal
-
In the Astra Portal, click the name of the Serverless (vector) database where you want to use the integration.
-
Click Data Explorer.
-
In the Keyspace field, select the keyspace where you want to create the collection.
-
Click Create Collection.
-
Enter a name for the collection.
For collection name rules and more information about creating collections, see Manage collections and tables.
-
Make sure Vector-enabled collection is enabled.
-
For Embedding generation method, select the NVIDIA embedding provider integration if it isn’t selected automatically.
If this option isn’t available, see Create a qualifying database.
-
For Embedding model, use the default model (nvidia/nv-embedqa-e5-v5). The Dimensions are set automatically based on the selected model.
-
For Similarity metric, select the method to use to calculate vector similarity scores: Cosine, Dot Product, or Euclidean.
-
Click Create collection.
To learn how to generate embeddings and perform vector searches on your integrated collection, see Next steps.
Use the Data API
You can use the Data API to create a collection that uses the NVIDIA integration.
The following example uses curl. For Data API client examples and more information, see the Data API reference documentation: Create a collection
curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace" \
--header "Token: $APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"createCollection": {
"name": "COLLECTION_NAME",
"options": {
"vector": {
"metric": "cosine",
"service": {
"provider": "nvidia",
"modelName": "nvidia/nv-embedqa-e5-v5"
}
}
}
}
}'
To learn how to generate embeddings and perform vector searches on your integrated collection, see Next steps.
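As a minimal sketch of those next steps, and assuming the collection above was created successfully, you can insert a document with a $vectorize field (the integration generates the embedding server-side) and then run a vector search by sorting on a $vectorize query string. COLLECTION_NAME is a placeholder for your collection's name:

```shell
# Insert a document; the $vectorize field holds the text to embed,
# and the NVIDIA integration generates the vector automatically.
curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace/COLLECTION_NAME" \
--header "Token: $APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "insertOne": {
    "document": {
      "$vectorize": "Example text to embed"
    }
  }
}'

# Run a vector search by sorting on a $vectorize query string.
curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace/COLLECTION_NAME" \
--header "Token: $APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "find": {
    "sort": { "$vectorize": "Example search text" },
    "options": { "limit": 5 }
  }
}'
```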
Add the integration to a table
You can use the Data API to add the NVIDIA integration to a table in multiple ways:
-
Create a table that has a vector column with a vectorize integration.
-
Alter a table to add a vector column with a vectorize integration.
-
Alter a table to add or change a vectorize integration on an existing vector column.
The following example uses curl to create a table with a vector column that has a vectorize integration. For Data API client examples and more information, see the Data API reference documentation:
-
Create a table with a vector column that uses the NVIDIA integration:
curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace" \
--header "Token: $APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "createTable": {
    "name": "TABLE_NAME",
    "definition": {
      "columns": {
        # This column will store vector embeddings.
        # The NVIDIA integration will automatically generate
        # vector embeddings for any text inserted into this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "service": {
            "provider": "nvidia",
            "modelName": "nvidia/nv-embedqa-e5-v5"
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings,
        # you must create a separate column.
        "TEXT_COLUMN_NAME": "text"
      },
      # You should change the primary key definition to meet the needs of your data.
      "primaryKey": "TEXT_COLUMN_NAME"
    }
  }
}'
The same parameters are used to configure a vectorize integration when you alter a table.
-
Index the vector column so that you can perform a vector search on it:
curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace" \
--header "Token: $APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "createVectorIndex": {
    "name": "INDEX_NAME",
    "definition": {
      "column": "VECTOR_COLUMN_NAME",
      "options": {
        "metric": "cosine",
        "sourceModel": "nvidia/nv-embedqa-e5-v5"
      }
    }
  }
}'
To learn how to generate embeddings and perform vector searches on your integrated vector column, see Next steps.
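As a minimal sketch of those next steps, and assuming the table and index above were created successfully, you can insert a row where the vector column receives a plain string (the integration generates the embedding for that string) and then run a vector search by sorting on the vector column. TABLE_NAME and the column names are placeholders matching the examples above:

```shell
# Insert a row; writing a string to the vectorize-enabled vector column
# causes the NVIDIA integration to generate the embedding automatically.
curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace/TABLE_NAME" \
--header "Token: $APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "insertOne": {
    "document": {
      "TEXT_COLUMN_NAME": "Example text to embed",
      "VECTOR_COLUMN_NAME": "Example text to embed"
    }
  }
}'

# Run a vector search by sorting on the vector column with a query string.
curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace/TABLE_NAME" \
--header "Token: $APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "find": {
    "sort": { "VECTOR_COLUMN_NAME": "Example search text" },
    "options": { "limit": 5 }
  }
}'
```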
NV-Embed-QA model token limit
The NV-Embed-QA model has a token limit of 512.
When you insert or search data in collections or tables that use this model, your $vectorize strings must not exceed this limit.
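A rough, hypothetical pre-check can help you catch oversized $vectorize strings before sending them. This sketch counts whitespace-delimited words, which only approximates tokens (the model's tokenizer typically produces more tokens than words), so it uses a deliberately conservative threshold below 512 rather than the exact limit:

```shell
# Hedged heuristic: word count approximates (and usually undercounts) tokens,
# so compare against a conservative threshold instead of the exact 512 limit.
TEXT="Example text to embed"
THRESHOLD=384
WORD_COUNT=$(echo "$TEXT" | wc -w)
if [ "$WORD_COUNT" -gt "$THRESHOLD" ]; then
  echo "Text may exceed the 512-token limit; shorten or chunk it." >&2
else
  echo "Word count $WORD_COUNT is within the conservative threshold."
fi
```

For an exact check, you would need the model's actual tokenizer; this heuristic only flags obviously oversized strings.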
Migrate from NV-Embed-QA to nvidia/nv-embedqa-e5-v5
The NVIDIA integration supports two models: nvidia/nv-embedqa-e5-v5 and NV-Embed-QA.
nvidia/nv-embedqa-e5-v5 is the default option when creating collections with the NVIDIA embedding provider integration.
If your collections or tables use the NV-Embed-QA model, DataStax encourages you to migrate to the newer model.
For examples, see Migrate to a new embedding model for a collection and Migrate to a new embedding model for a table.