Store Embeddings
We recommend LangChain’s OpenAIEmbeddings class for generating the embeddings that you store in a vector store.
We recommend DataStax Astra DB Serverless to store your embeddings. Astra DB Serverless integrates with LangChain as a vector store using the AstraPy client.
Prerequisites
You will need a vector-enabled Astra DB Serverless database and an OpenAI account.
See the Notebook Prerequisites page for more details.
- Create a vector-enabled Astra DB Serverless database.
- Create an OpenAI account.
- Within your database, create an Astra DB keyspace.
- Within your database, create an Astra DB Access Token with Database Administrator permissions.
- Get your Astra DB Serverless API Endpoint: https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com
- Initialize the environment variables in a .env file.
  ASTRA_DB_APPLICATION_TOKEN=AstraCS:...
  ASTRA_DB_API_ENDPOINT=https://9d9b9999-999e-9999-9f9a-9b99999dg999-us-east-2.apps.astra.datastax.com
  ASTRA_DB_COLLECTION=test
  OPENAI_API_KEY=sk-f99...
- Enter your settings for Astra DB Serverless and OpenAI:
  import os
  astra_token = os.getenv("ASTRA_DB_APPLICATION_TOKEN")
  astra_endpoint = os.getenv("ASTRA_DB_API_ENDPOINT")
  collection = os.getenv("ASTRA_DB_COLLECTION")
  openai_api_key = os.getenv("OPENAI_API_KEY")
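Because a missing or mistyped variable tends to surface only later as an authentication or connection error, it can help to check the settings up front. The following is a minimal sketch, not part of LangChain or AstraPy: check_settings, REQUIRED_VARS, and ENDPOINT_PATTERN are hypothetical names, and the endpoint pattern is an assumption based only on the URL shape shown in the prerequisites.

```python
import os
import re

# Variable names taken from the .env file described in the prerequisites.
REQUIRED_VARS = (
    "ASTRA_DB_APPLICATION_TOKEN",
    "ASTRA_DB_API_ENDPOINT",
    "ASTRA_DB_COLLECTION",
    "OPENAI_API_KEY",
)

# Assumed endpoint shape:
# https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com
ENDPOINT_PATTERN = re.compile(
    r"^https://[A-Za-z0-9-]+\.apps\.astra\.datastax\.com/?$"
)


def check_settings(env):
    """Return a list of human-readable problems found in a settings mapping."""
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is missing or empty")
    endpoint = env.get("ASTRA_DB_API_ENDPOINT")
    if endpoint and not ENDPOINT_PATTERN.match(endpoint):
        problems.append(
            "ASTRA_DB_API_ENDPOINT does not match the expected URL shape"
        )
    return problems


if __name__ == "__main__":
    # Report every problem found in the current process environment.
    for problem in check_settings(os.environ):
        print(problem)
```

Call check_settings(os.environ) after the .env file has been loaded; an empty list means all four variables are present and the endpoint looks plausible. It does not verify that the token or key is actually valid.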
Store embeddings in the vector-enabled Astra DB Serverless database
This code embeds the loaded Documents from the Split Documents example and stores the embeddings in the Astra DB Serverless vector store.
import os

from dotenv import load_dotenv
from langchain_astradb import AstraDBVectorStore
from langchain_openai import OpenAIEmbeddings

# Load the settings from the .env file into the environment.
load_dotenv()
ASTRA_DB_COLLECTION = os.environ.get("ASTRA_DB_COLLECTION")

# OpenAIEmbeddings reads OPENAI_API_KEY from the environment.
embedding = OpenAIEmbeddings()
vstore = AstraDBVectorStore(
    embedding=embedding,
    collection_name=ASTRA_DB_COLLECTION,
    token=os.environ["ASTRA_DB_APPLICATION_TOKEN"],
    api_endpoint=os.environ["ASTRA_DB_API_ENDPOINT"],
)

# Placeholder: replace with the Documents produced by the Split Documents example.
docs = []
inserted_ids = vstore.add_documents(docs)
print(f"\nInserted {len(inserted_ids)} documents.")
# Query the underlying collection to inspect what was stored.
print(vstore.astra_db.collection(ASTRA_DB_COLLECTION).find())
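To build intuition for what happens when documents are embedded and stored, here is a toy, standard-library-only sketch of the same flow. It is not how Astra DB or LangChain implement a vector store: toy_embed is a hypothetical stand-in for an embedding model, ToyVectorStore keeps everything in memory, and the method names only loosely mirror LangChain's add_texts and similarity_search.

```python
import math


def toy_embed(text, dims=8):
    """Hypothetical stand-in for an embedding model: counts characters into buckets."""
    vec = [0.0] * dims
    for ch in text.lower():
        vec[ord(ch) % dims] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory illustration: embed each text on insert, rank by similarity on search."""

    def __init__(self, embed):
        self.embed = embed
        self.rows = {}  # id -> (vector, text)

    def add_texts(self, texts):
        # Embed each text, store the vector alongside it, and return the new ids.
        ids = []
        for text in texts:
            doc_id = str(len(self.rows))
            self.rows[doc_id] = (self.embed(text), text)
            ids.append(doc_id)
        return ids

    def similarity_search(self, query, k=1):
        # Embed the query and return the k stored texts with the highest cosine similarity.
        query_vec = self.embed(query)
        ranked = sorted(
            self.rows.values(),
            key=lambda row: cosine(query_vec, row[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


if __name__ == "__main__":
    store = ToyVectorStore(toy_embed)
    ids = store.add_texts(["aaa", "bbb", "ccc"])
    print(f"Inserted {len(ids)} texts.")
    print(store.similarity_search("aaa"))
```

The real vector store follows the same two steps, embedding on insert and comparing vectors on search, but delegates the embedding to OpenAI and the storage and ranking to the database.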