Integrate LlamaIndex with Astra DB Serverless

Estimated time: 15 min

LlamaIndex can use Astra DB Serverless to store and retrieve vectors for ML applications.

Prerequisites

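This guide requires the following:

  • An active Serverless (Vector) database.

  • An application token and the API endpoint for that database.

  • An OpenAI API key.

  • A local Python environment or a Google Colab notebook.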

Connect to your Serverless (Vector) database

  1. Set your Astra DB and OpenAI credentials.

    • Local installation

    Create a .env file in the folder where you will create your Python script. Populate the file with the Astra DB application token and endpoint values from the Database Details section of your database’s Overview tab, and your OpenAI API key.

    .env
    ASTRA_DB_APPLICATION_TOKEN="TOKEN"
    ASTRA_DB_API_ENDPOINT="API_ENDPOINT"
    OPENAI_API_KEY="API_KEY"

    • Google Colab

    In a notebook cell, prompt for the same three values and store them as environment variables.

    import os
    from getpass import getpass

    os.environ["ASTRA_DB_APPLICATION_TOKEN"] = getpass("ASTRA_DB_APPLICATION_TOKEN = ")
    os.environ["ASTRA_DB_API_ENDPOINT"] = input("ASTRA_DB_API_ENDPOINT = ")
    os.environ["OPENAI_API_KEY"] = getpass("OPENAI_API_KEY = ")

    The endpoint format is https://ASTRA_DB_ID-ASTRA_DB_REGION.apps.astra.datastax.com.

  2. Import your dependencies.

    • Local installation

    integrate.py
    import os

    from llama_index.vector_stores.astra_db import AstraDBVectorStore
    from llama_index.core import VectorStoreIndex, StorageContext
    from llama_index.core.llama_dataset import download_llama_dataset
    from dotenv import load_dotenv

    • Google Colab

    import os

    from llama_index.vector_stores.astra_db import AstraDBVectorStore
    from llama_index.core import VectorStoreIndex, StorageContext
    from llama_index.core.llama_dataset import download_llama_dataset

  3. Load your environment variables. To avoid a namespace collision, don’t name your script llamaindex.py.

    • Local installation

    integrate.py
    load_dotenv()

    ASTRA_DB_APPLICATION_TOKEN = os.environ.get("ASTRA_DB_APPLICATION_TOKEN")
    ASTRA_DB_API_ENDPOINT = os.environ.get("ASTRA_DB_API_ENDPOINT")
    OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")

    • Google Colab

    ASTRA_DB_APPLICATION_TOKEN = os.environ.get("ASTRA_DB_APPLICATION_TOKEN")
    ASTRA_DB_API_ENDPOINT = os.environ.get("ASTRA_DB_API_ENDPOINT")
    OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")

    If you’re using Microsoft Azure OpenAI, include these additional environment variables:

    OPENAI_API_TYPE="azure"
    OPENAI_API_VERSION="2023-05-15"
    OPENAI_API_BASE="https://RESOURCE_NAME.openai.azure.com"
    OPENAI_API_KEY="API_KEY"
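
Before you continue, it can help to confirm that the three variables from the steps above are set and that the endpoint resembles the documented format. The following is a minimal sketch using only the standard library. The helper name find_config_problems is hypothetical, and the regular expression assumes that the database ID is a lowercase UUID and that the region contains only lowercase letters, digits, and hyphens. Adjust the pattern if your endpoint differs.

```python
import os
import re

REQUIRED_VARS = (
    "ASTRA_DB_APPLICATION_TOKEN",
    "ASTRA_DB_API_ENDPOINT",
    "OPENAI_API_KEY",
)

# Assumed shape: https://<database UUID>-<region>.apps.astra.datastax.com
ENDPOINT_PATTERN = re.compile(
    r"^https://[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}"
    r"-[a-z0-9-]+\.apps\.astra\.datastax\.com/?$"
)


def find_config_problems(env):
    """Return a list of readable problems found in `env` (any mapping)."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    endpoint = env.get("ASTRA_DB_API_ENDPOINT")
    if endpoint and not ENDPOINT_PATTERN.match(endpoint):
        problems.append("ASTRA_DB_API_ENDPOINT does not match the expected format")
    return problems


if __name__ == "__main__":
    # Check the current process environment and report anything suspicious.
    for problem in find_config_problems(os.environ):
        print(f"Configuration problem: {problem}")
```

An empty list means that nothing obvious is wrong. It does not prove that the token or key is valid, because no request is sent to either service.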

Create embeddings from text

  1. Download a sample dataset from LlamaHub and load it as a list of Document objects.

    integrate.py
    rag_dataset, documents = download_llama_dataset(
        "PaulGrahamEssayDataset", "./data"
    )
    
    print(f"Number of loaded documents: {len(documents)}")
    print(f"First document, id: {documents[0].doc_id}")
    print(f"First document, hash: {documents[0].hash}")
    print(
        "First document, text"
        f" ({len(documents[0].text)} characters):\n"
        f"{'=' * 20}\n"
        f"{documents[0].text[:360]} ..."
    )

  2. Optional: Chunk the documents using the default splitter to preview the resulting Nodes. You can skip this step because the documents are split automatically during ingestion.

    integrate.py
    # This step is optional because splitting happens automatically during ingestion
    from llama_index.core.node_parser import SentenceSplitter
    default_splitter = SentenceSplitter()
    split_nodes = default_splitter(documents)
    print(f"Number of split nodes: {len(split_nodes)}")
    print(f"Third split node, document reference ID: {split_nodes[2].ref_doc_id}")
    print(f"Third split node, node ID: {split_nodes[2].node_id}")
    print(f"Third split node, hash: {split_nodes[2].hash}")
    print(
        "Third split node, text"
        f" ({len(split_nodes[2].text)} characters):\n"
        f"{'=' * 20}\n"
        f"{split_nodes[2].text[:360]} ..."
    )
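
  3. Create an Astra DB vector store. The embedding_dimension value must match the size of the vectors that your embedding model produces. For example, 1536 is the output size of OpenAI’s text-embedding-ada-002 model.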

    integrate.py
    astra_db_store = AstraDBVectorStore(
        token=ASTRA_DB_APPLICATION_TOKEN,
        api_endpoint=ASTRA_DB_API_ENDPOINT,
        collection_name="llama_index_rag_test",
        embedding_dimension=1536,
    )

  4. Build the index for your documents. The StorageContext.from_defaults method tells LlamaIndex to use the AstraDBVectorStore you created. The from_documents method splits your Documents into Nodes and creates embeddings from the text of every Node.

    integrate.py
    storage_context = StorageContext.from_defaults(vector_store=astra_db_store)
    
    index = VectorStoreIndex.from_documents(
        documents=documents, storage_context=storage_context
    )
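
To make the steps above more concrete, the following stdlib-only sketch shows roughly what "split into Nodes, embed every Node, store the vectors" amounts to. It is not how LlamaIndex or Astra DB implement these steps. The names split_text, toy_embed, InMemoryStore, and build_index are hypothetical, the splitter is a plain fixed-size word window with overlap, and the "embedding" is a hashed bag of words instead of a call to an embedding model.

```python
import hashlib
import math


def split_text(text, chunk_size=40, overlap=10):
    """Cut text into overlapping windows of words (stand-in for a real splitter)."""
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap  # assumes overlap < chunk_size
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, max(len(words) - overlap, 1), step)
    ]


def toy_embed(text, dimension=16):
    """Map text to a unit-length vector by hashing each word into a bucket."""
    vector = [0.0] * dimension
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vector[digest[0] % dimension] += 1.0
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else vector


class InMemoryStore:
    """Holds (vector, text) pairs; plays the role of the collection."""

    def __init__(self, dimension):
        self.dimension = dimension
        self.rows = []

    def add(self, vector, text):
        # A store with a fixed dimension rejects vectors of any other size.
        if len(vector) != self.dimension:
            raise ValueError(f"expected {self.dimension} dimensions, got {len(vector)}")
        self.rows.append((vector, text))

    def search(self, query_vector, top_k=2):
        """Return the top_k (score, text) pairs by dot product of unit vectors."""
        scored = [
            (sum(a * b for a, b in zip(query_vector, vector)), text)
            for vector, text in self.rows
        ]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


def build_index(documents, store, chunk_size=40, overlap=10):
    """Split every document, embed every chunk, and write it to the store."""
    for document in documents:
        for chunk in split_text(document, chunk_size, overlap):
            store.add(toy_embed(chunk, store.dimension), chunk)
    return store
```

For example, build_index(["first essay text", "second essay text"], InMemoryStore(16)) fills the store, and store.search(toy_embed("essay"), top_k=1) returns the closest chunk with its score. The dimension check in add mirrors why embedding_dimension has to agree with the embedding model.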

Verify the integration

  1. Ask a question about the stored text and verify the response is relevant.

    integrate.py
    query_engine = index.as_query_engine()
    query_string_1 = "Why did the author choose to work on AI?"
    response = query_engine.query(query_string_1)
    
    print("\n\n" + query_string_1)
    print(response.response)

  2. Run a second query using Maximal Marginal Relevance (MMR). MMR selects Nodes that are relevant to the query and, at the same time, as different from each other as possible. The results are printed with their scores, so you can see how the relevant Nodes rank and where the LLM’s answer comes from.

    integrate.py
    retriever = index.as_retriever(
        vector_store_query_mode="mmr",
        similarity_top_k=3,
        vector_store_kwargs={"mmr_prefetch_factor": 4},
    )
    
    query_string_2 = "Why did the author choose to work on AI?"
    nodes_with_scores = retriever.retrieve(query_string_2)
    
    print("\n\n" + query_string_2 + " (question asked with MMR)")
    print(f"Found {len(nodes_with_scores)} nodes.")
    for idx, node_with_score in enumerate(nodes_with_scores):
        print(f"    [{idx}] score = {node_with_score.score}")
        print(f"        id    = {node_with_score.node.node_id}")
        print(f"        text  = {node_with_score.node.text[:90]} ...")
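
The MMR selection described above can be sketched in plain Python. This is an illustration of the general technique, not the code that LlamaIndex or Astra DB run. The function name mmr_select and the lambda_mult parameter are hypothetical, and the prefetch step assumes that mmr_prefetch_factor means "fetch similarity_top_k times this factor candidates by similarity, then choose among them", which is worth confirming in the integration's reference documentation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query, candidates, top_k=3, prefetch_factor=4, lambda_mult=0.5):
    """Return indices of up to top_k candidates, balancing relevance and diversity.

    lambda_mult=1.0 ranks by relevance only; lower values penalize
    candidates that resemble ones already selected.
    """
    # Prefetch: keep only the most similar candidates before applying MMR.
    prefetch_k = int(top_k * prefetch_factor)
    remaining = sorted(
        range(len(candidates)),
        key=lambda i: cosine(query, candidates[i]),
        reverse=True,
    )[:prefetch_k]

    selected = []

    def mmr_score(i):
        relevance = cosine(query, candidates[i])
        # Similarity to the closest already-selected candidate (0 if none yet).
        redundancy = max(
            (cosine(candidates[i], candidates[j]) for j in selected),
            default=0.0,
        )
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    while remaining and len(selected) < top_k:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With lambda_mult=1.0 the function reduces to a plain similarity ranking. A smaller prefetch_factor narrows the pool that MMR can choose from, so fewer "different" candidates are available to it.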

Run the code

Run the script from the folder that contains integrate.py and your .env file.

python integrate.py


© 2025 DataStax | Privacy policy | Terms of use
