Knowledge Base Search on Proprietary Data powered by Astra DB Serverless
This notebook guides you through setting up RAGStack with Astra DB Serverless, OpenAI, and CassIO to implement generative Q&A over your own documentation.
ChatGPT excels at answering questions through a convenient dialog interface, but it only knows about topics from its training data.
What do you do when you have your own documents? How can you leverage generative AI and large language models (LLMs) to get insights from them? You can use retrieval-augmented generation (RAG) to create a Q&A bot that answers specific questions over your documentation.
You can create this in two steps:
1. Analyze and store existing documentation.
2. Provide search capabilities for the LLM model to retrieve your documentation.
Ideally, you embed the data as vectors, store them in a vector database, and then use the LLM on top of that database.
This notebook demonstrates a basic two-step RAG technique that lets GPT answer questions using your own documentation as a reference library, with Astra DB Serverless as the vector store.
Get started with this notebook
1. Install the following libraries.
pip install \
  "ragstack-ai" \
  "openai" \
  "pypdf" \
  "python-dotenv"
2. Import dependencies.
import os

from dotenv import load_dotenv
from langchain_openai import OpenAIEmbeddings
from langchain.vectorstores import Cassandra
from langchain_community.document_loaders import TextLoader
from langchain_community.document_loaders import PyPDFLoader
from langchain.chat_models import ChatOpenAI
from langchain.indexes.vectorstore import VectorStoreIndexWrapper
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
You will need a secure connect bundle and a user with access permission. For demo purposes, the "administrator" role will work fine. For more, see Prerequisites.
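As an illustration, a .env file for this notebook might look like the following. The path, keyspace name, and table name are placeholder values; replace them with your own.

# .env (example values only)
ASTRA_DB_SECURE_BUNDLE_PATH=/path/to/secure-connect-bundle.zip
ASTRA_DB_APPLICATION_TOKEN=AstraCS:...
ASTRA_DB_NAMESPACE=default_keyspace
ASTRA_DB_COLLECTION=amontillado_demo
OPENAI_API_KEY=sk-...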
3. Initialize the environment variables.
# Load environment variables from the local .env file.
load_dotenv()

ASTRA_DB_SECURE_BUNDLE_PATH = os.getenv("ASTRA_DB_SECURE_BUNDLE_PATH")
ASTRA_DB_APPLICATION_TOKEN = os.getenv("ASTRA_DB_APPLICATION_TOKEN")
ASTRA_DB_KEYSPACE = os.getenv("ASTRA_DB_NAMESPACE")
ASTRA_DB_TABLE_NAME = os.getenv("ASTRA_DB_COLLECTION")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
4. Retrieve the text of a short story to index in the vector store and set it as the sample data. The story is "The Cask of Amontillado" by Edgar Allan Poe.
curl https://raw.githubusercontent.com/CassioML/cassio-website/main/docs/frameworks/langchain/texts/amontillado.txt --output amontillado.txt

SAMPLEDATA = ["amontillado.txt"]
5. Connect to Astra DB Serverless.
cluster = Cluster(
    cloud={"secure_connect_bundle": ASTRA_DB_SECURE_BUNDLE_PATH},
    auth_provider=PlainTextAuthProvider("token", ASTRA_DB_APPLICATION_TOKEN),
)
session = cluster.connect()
Read files, create embeddings, and store in Astra DB Serverless
CassIO seamlessly integrates with RAGStack and LangChain, offering Cassandra-specific tools for many tasks. This example uses vector stores, indexers, embeddings, and queries, with OpenAI for LLM services.
1. Loop through each file and load it into the vector store.
documents = []
for filename in SAMPLEDATA:
    path = os.path.join(os.getcwd(), filename)

    # Pick a loader based on the file extension; skip anything else.
    if filename.endswith(".pdf"):
        loader = PyPDFLoader(path)
        new_docs = loader.load_and_split()
        print(f"Processed pdf file: {filename}")
    elif filename.endswith(".txt"):
        loader = TextLoader(path)
        new_docs = loader.load_and_split()
        print(f"Processed txt file: {filename}")
    else:
        print(f"Unsupported file type: {filename}")
        continue

    if len(new_docs) > 0:
        documents.extend(new_docs)
2. Initialize the vector store with the documents and the OpenAI embeddings.
cass_vstore = Cassandra.from_documents(
    documents=documents,
    embedding=OpenAIEmbeddings(),
    session=session,
    keyspace=ASTRA_DB_KEYSPACE,
    table_name=ASTRA_DB_TABLE_NAME,
)
3. Empty the list of file names so the same files are not accidentally loaded again.
SAMPLEDATA = []
print("\nProcessing done.")
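If you rerun this notebook later, the embeddings are already stored in Astra DB Serverless, so there is no need to call from_documents again. As a minimal sketch, assuming the same session, keyspace, and table name as above, you can reconnect to the existing table with the plain Cassandra constructor:

# Reopen the existing table without re-embedding or re-inserting documents.
cass_vstore = Cassandra(
    embedding=OpenAIEmbeddings(),
    session=session,
    keyspace=ASTRA_DB_KEYSPACE,
    table_name=ASTRA_DB_TABLE_NAME,
)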
Query the vector store and execute some "searches" against it
1. Start with a similarity search using the vector store's implementation.
prompt = "Who is Luchesi?" matched_docs = cass_vstore.similarity_search(query=prompt, k=1) for i, d in enumerate(matched_docs): print(f"\n## Document {i}\n") print(d.page_content)
2. To implement Q&A over documents, you need to perform some additional steps. Create an index on top of the vector store.
index = VectorStoreIndexWrapper(vectorstore=cass_vstore)
3. Create a retriever from the index. A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store: a retriever does not need to be able to store documents, only to return (or retrieve) them, and a vector store can be used as the backbone of a retriever.
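As a minimal sketch, you can create a retriever directly from the vector store and query it; the k value here is just an illustration.

# Turn the vector store into a retriever that returns the top 2 matching documents.
retriever = cass_vstore.as_retriever(search_kwargs={"k": 2})
docs = retriever.get_relevant_documents("Who is Luchesi?")
for doc in docs:
    print(doc.page_content)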
4. Query the index for vectors relevant to the prompt:
prompt = "Who is Luchesi?" index.query(question=prompt)
5. Alternatively, use a retrieval chain with a custom prompt:
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
from langchain.prompts import ChatPromptTemplate

prompt = """
You are Marv, a sarcastic but factual chatbot. End every response with a joke related to the question.

Context: {context}
Question: {question}
Your answer:
"""
prompt = ChatPromptTemplate.from_template(prompt)

qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    retriever=cass_vstore.as_retriever(),
    chain_type_kwargs={"prompt": prompt},
)

result = qa.run("Who is Luchesi?")
result