RAG with LlamaIndex and Astra DB Serverless

Build a RAG pipeline with RAGStack, Astra DB Serverless, and LlamaIndex.

Prerequisites

You will need a vector-enabled Astra DB Serverless database.

Install the following dependencies:

pip install ragstack-ai python-dotenv

See the Prerequisites page for more details.

Set up your local environment

Create a .env file in your application directory with the following environment variables:

ASTRA_DB_APPLICATION_TOKEN=AstraCS: ...
ASTRA_DB_API_ENDPOINT=https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com
OPENAI_API_KEY=sk-...

If you’re using Google Colab, you’ll be prompted for these values in the Colab environment.

See the Prerequisites page for more details.
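
Before running the pipeline, it can help to confirm that the three variables from the .env example are actually set. The following is a minimal sketch, not part of RAGStack: the helper name missing_env_vars and the sample values are hypothetical, and it only checks that each variable is non-empty, not that the values are valid.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = (
    "ASTRA_DB_APPLICATION_TOKEN",
    "ASTRA_DB_API_ENDPOINT",
    "OPENAI_API_KEY",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Illustration with a made-up environment: one variable absent, one empty.
example_env = {
    "ASTRA_DB_APPLICATION_TOKEN": "AstraCS:example",
    "OPENAI_API_KEY": "",
}
print(missing_env_vars(example_env))
```

In a real script, call missing_env_vars() with no arguments after load_dotenv() and stop early if the returned list is not empty.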

Create a RAG pipeline with LlamaIndex

  1. Import dependencies and load environment variables.

    import os
    from dotenv import load_dotenv
    from llama_index.core.llama_dataset import download_llama_dataset
    from llama_index.vector_stores.astra_db import AstraDBVectorStore
    from llama_index.core import (
        VectorStoreIndex,
        SimpleDirectoryReader,
        StorageContext,
    )
    
    load_dotenv()
  2. Download the Paul Graham essay dataset to the ./data directory.

    dataset = download_llama_dataset(
      "PaulGrahamEssayDataset", "./data"
    )
  3. Load the documents from the dataset, create the vector store, and create the index, which populates the vector store with the documents.

    documents = SimpleDirectoryReader("./data/source_files").load_data()
    print(f"Total documents: {len(documents)}")
    print(f"First document, id: {documents[0].doc_id}")
    print(f"First document, hash: {documents[0].hash}")
    print(
        "First document, text"
        f" ({len(documents[0].text)} characters):\n"
        f"{'=' * 20}\n"
        f"{documents[0].text[:360]} ..."
    )
    
    # Create a vector store instance
    astra_db_store = AstraDBVectorStore(
        token=os.getenv("ASTRA_DB_APPLICATION_TOKEN"),
        api_endpoint=os.getenv("ASTRA_DB_API_ENDPOINT"),
        collection_name="test",
        embedding_dimension=1536,
    )
    
    # Create a default storage context for the vector store
    storage_context = StorageContext.from_defaults(vector_store=astra_db_store)
    
    # Create a vector index from your documents
    index = VectorStoreIndex.from_documents(
        documents, storage_context=storage_context
    )
  4. Query the vector store index for the most relevant answer to your prompt, "Why did the author choose to work on AI?"

    # single query for most relevant result
    query_engine = index.as_query_engine()
    query_string_1 = "Why did the author choose to work on AI?"
    response = query_engine.query(query_string_1)
    
    print(query_string_1)
    print(response.response)
  5. Retrieve results from your vector store index based on your prompt. This will retrieve three nodes with their relevance scores.

    # similarity search with scores
    retriever = index.as_retriever(
        vector_store_query_mode="default",
        similarity_top_k=3,
    )
    
    nodes_with_scores = retriever.retrieve(query_string_1)
    
    print(query_string_1)
    print(f"Found {len(nodes_with_scores)} nodes.")
    for idx, node_with_score in enumerate(nodes_with_scores):
        print(f"    [{idx}] score = {node_with_score.score}")
        print(f"        id    = {node_with_score.node.node_id}")
        print(f"        text  = {node_with_score.node.text[:90]} ...")
  6. Set the retriever to select results by Maximal Marginal Relevance (MMR) instead of the default similarity search.

    # MMR
    retriever = index.as_retriever(
        vector_store_query_mode="mmr",
        similarity_top_k=3,
        vector_store_kwargs={"mmr_prefetch_factor": 4},
    )
    
    nodes_with_scores = retriever.retrieve(query_string_1)
    
    print(query_string_1)
    print(f"Found {len(nodes_with_scores)} nodes.")
    for idx, node_with_score in enumerate(nodes_with_scores):
        print(f"    [{idx}] score = {node_with_score.score}")
        print(f"        id    = {node_with_score.node.node_id}")
        print(f"        text  = {node_with_score.node.text[:90]} ...")
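  7. Run the MMR retrieval and compare the scores with the previous step. The top result has a positive score and is the most relevant. The remaining results can have negative scores, because MMR penalizes candidates that are similar to results already selected.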
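
To see why MMR scores after the first can drop below zero, here is a minimal pure-Python sketch of the general MMR technique. It is not the LlamaIndex or Astra DB implementation: the function names, the two-dimensional toy vectors, and the weight lam=0.3 are all assumptions chosen for illustration. Each round scores a candidate as lam times its similarity to the query, minus (1 - lam) times its highest similarity to anything already selected.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def mmr_select(query, candidates, top_k=3, lam=0.5):
    """Return [(candidate_index, mmr_score), ...] in selection order."""
    remaining = list(range(len(candidates)))
    selected = []
    while remaining and len(selected) < top_k:
        best_idx, best_score = None, -math.inf
        for i in remaining:
            relevance = cosine(query, candidates[i])
            # Highest similarity to any already-selected candidate (0 if none yet).
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j, _ in selected),
                default=0.0,
            )
            score = lam * relevance - (1 - lam) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append((best_idx, best_score))
        remaining.remove(best_idx)
    return selected


# Toy data (hypothetical): two near-duplicate vectors and one distinct vector.
query = [1.0, 0.0]
candidates = [
    [0.9, 0.1],    # close to the query
    [0.88, 0.12],  # near-duplicate of the first candidate
    [0.3, 0.7],    # less relevant, but different from the others
]

for idx, score in mmr_select(query, candidates, top_k=3, lam=0.3):
    print(f"[{idx}] score = {score:.3f}")
```

With these toy vectors, the first pick has no redundancy penalty, so its score is positive. The later picks are penalized for overlapping with it, and the distinct third vector is chosen ahead of the near-duplicate.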

Cleanup

Be a good digital citizen and clean up after yourself.

To clear data from your vector database but keep the collection, use the vstore.clear() method.

To delete the collection from your vector database, use the vstore.delete_collection() method. Alternatively, you can use the Data API to delete the collection:

curl -v -s --location \
--request POST https://${ASTRA_DB_ID}-${ASTRA_DB_REGION}.apps.astra.datastax.com/api/json/v1/default_keyspace \
--header "X-Cassandra-Token: $ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
  "deleteCollection": {
    "name": "test"
  }
}'
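
The same Data API call can be prepared from Python. This is a minimal sketch using only the standard library: it mirrors the URL path, headers, and JSON body of the curl command above. The helper name build_delete_collection_request and the placeholder endpoint and token are hypothetical. The sketch only builds the request; sending it is left as a commented-out line so that nothing is deleted by accident.

```python
import json
import urllib.request


def build_delete_collection_request(
    api_endpoint, token, collection_name, keyspace="default_keyspace"
):
    """Build (but do not send) a deleteCollection request (hypothetical helper)."""
    url = f"{api_endpoint.rstrip('/')}/api/json/v1/{keyspace}"
    payload = {"deleteCollection": {"name": collection_name}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-Cassandra-Token": token,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# Placeholder values; in practice read them from the environment variables
# ASTRA_DB_API_ENDPOINT and ASTRA_DB_APPLICATION_TOKEN.
request = build_delete_collection_request(
    "https://example-db-id-example-region.apps.astra.datastax.com",
    "AstraCS:example",
    "test",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To send the request for real, uncomment the next line:
# urllib.request.urlopen(request)
```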

Complete code

Python
import os
from dotenv import load_dotenv
from llama_index.core.llama_dataset import download_llama_dataset
from llama_index.vector_stores.astra_db import AstraDBVectorStore
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
)

load_dotenv()

dataset = download_llama_dataset(
  "PaulGrahamEssayDataset", "./data"
)

# Load the documents from the dataset into memory
documents = SimpleDirectoryReader("./data/source_files").load_data()
print(f"Total documents: {len(documents)}")
print(f"First document, id: {documents[0].doc_id}")
print(f"First document, hash: {documents[0].hash}")
print(
    "First document, text"
    f" ({len(documents[0].text)} characters):\n"
    f"{'=' * 20}\n"
    f"{documents[0].text[:360]} ..."
)

# Create a vector store instance
astra_db_store = AstraDBVectorStore(
    token=os.getenv("ASTRA_DB_APPLICATION_TOKEN"),
    api_endpoint=os.getenv("ASTRA_DB_API_ENDPOINT"),
    collection_name="test",
    embedding_dimension=1536,
)

# Create a default storage context for the vector store
storage_context = StorageContext.from_defaults(vector_store=astra_db_store)

# Create a vector index from your documents
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

query_engine = index.as_query_engine()
query_string_1 = "Why did the author choose to work on AI?"
response = query_engine.query(query_string_1)

print(query_string_1)
print(response.response)

retriever = index.as_retriever(
    vector_store_query_mode="mmr",
    similarity_top_k=3,
    vector_store_kwargs={"mmr_prefetch_factor": 4},
)

nodes_with_scores = retriever.retrieve(query_string_1)

print(query_string_1)
print(f"Found {len(nodes_with_scores)} nodes.")
for idx, node_with_score in enumerate(nodes_with_scores):
    print(f"    [{idx}] score = {node_with_score.score}")
    print(f"        id    = {node_with_score.node.node_id}")
    print(f"        text  = {node_with_score.node.text[:90]} ...")


© 2025 DataStax | Privacy policy | Terms of use
