Build a Graph RAG system with LangChain and GraphVectorStore


Graph RAG is an enhancement to retrieval-augmented generation (RAG) that retrieves elements from a knowledge graph to serve as grounding context for a large language model (LLM).

In this tutorial, you build a simple graph RAG system. First, you build a knowledge graph from a small set of linked HTML documents. Then, you use the graph during the retrieval step of RAG to provide extended context to the LLM when it generates a response.

For more information about this tutorial’s use case, dataset, and the value of Graph RAG, see Your Documents Are Trying to Tell You What’s Relevant: Better RAG Using Links.

Prerequisites

For this tutorial, you need the following:

  • An Astra DB Serverless (Vector) database, which you can create as described in the next section.

  • An application token for that database.

  • An OpenAI API key, used for embeddings and response generation.

  • A Python notebook environment, such as Jupyter or this tutorial's Colab notebook.

Create your Serverless (Vector) database

  1. In the Astra Portal navigation menu, click Databases, and then click Create Database.

  2. Select the Serverless (Vector) deployment type.

  3. Enter a meaningful, human-readable Database name.

    After you create a database, you can't change its name. Database names must start and end with a letter or number, and they can contain no more than 50 characters, including letters, numbers, and the special characters & + - _ ( ) < > . , @.

  4. Select a Provider and Region to host your database.

    On the Free plan, you can access a limited set of supported regions. To access Locked regions, you must upgrade your subscription plan.

    To minimize latency in production databases, select a region that is close to your application’s users.

  5. Click Create Database.

    New databases start in Pending status, and then move to Initializing. Your database is ready to use when it reaches Active status.

Install dependencies

Run the following commands in a Jupyter notebook or similar environment. For example, you can use this tutorial’s Colab notebook. To run these commands in a different environment, you must modify them accordingly.

  1. Install the project dependencies:

    %pip install langchain langchain-community langchain-openai beautifulsoup4 cassio
  2. If required for compatibility within Jupyter, install and apply the nest_asyncio package:

    %pip install nest_asyncio
    
    import nest_asyncio
    nest_asyncio.apply()

Set your environment variables

Run the following commands in a Jupyter notebook or similar environment. For example, you can use this tutorial’s Colab notebook. To run these commands in a different environment, you must modify them accordingly.

  1. Use the os Python package to set the following environment variables:

    import os
    
    os.environ['ASTRA_DB_API_ENDPOINT'] = 'ENDPOINT'
    os.environ['ASTRA_DB_APPLICATION_TOKEN'] = 'APPLICATION_TOKEN'
    os.environ['OPENAI_API_KEY'] = 'OPENAI_API_KEY'

    Replace the following:

    • ENDPOINT: Your database’s API endpoint.

      In the Astra Portal, select your database, and then locate the Database Details section. Copy the database’s API Endpoint, and then set it as the ASTRA_DB_API_ENDPOINT environment variable.

    • APPLICATION_TOKEN: An application token for your database.

      In the Astra Portal, select your database, and then locate the Database Details section. Click Generate Token, store the token securely, and then set it as the ASTRA_DB_APPLICATION_TOKEN environment variable.

    • OPENAI_API_KEY: Your OpenAI API key.

      In the OpenAI Platform, create an API key, store it securely, and then set it as the OPENAI_API_KEY environment variable.
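    If you prefer not to paste secrets directly into the notebook, a minimal alternative is to prompt for them at runtime with the standard-library getpass module. This sketch uses the same variable names as the rest of the tutorial:

    import os
    from getpass import getpass

    # Prompt for each value so secrets aren't hard-coded in the notebook
    os.environ['ASTRA_DB_API_ENDPOINT'] = input('Astra DB API endpoint: ')
    os.environ['ASTRA_DB_APPLICATION_TOKEN'] = getpass('Astra DB application token: ')
    os.environ['OPENAI_API_KEY'] = getpass('OpenAI API key: ')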

Build a Graph RAG system on linked HTML documents

Run the following commands in a Jupyter notebook or similar environment. For example, you can use this tutorial’s Colab notebook. To run these commands in a different environment, you must modify them accordingly.

Build the knowledge graph

  1. Prepare the data for your knowledge graph.

    For this tutorial, the following code creates a small dataset of HTML documents with links to one another:

    HTML document dataset example
    # HTML document dataset
    html_doc_list = [
    {
        'url': 'https://en.wikipedia.org/wiki/Space_Needle',
        'html_doc': """
    <html><head><title>Space Needle</title></head>
    <body>
    <p class="title"><b>Space Needle</b></p>
    
    <p class="content">
    The Space Needle is an observation tower in <a href="https://en.wikipedia.org/wiki/Seattle" id="link_seattle">Seattle</a>, Washington, United States. Considered to be an icon of the city, it has been designated a Seattle landmark. Located in the <a href="https://en.wikipedia.org/wiki/Lower_Queen_Anne,_Seattle">Lower Queen Anne</a> neighborhood, it was built in the Seattle Center for the 1962 World's Fair, which drew over 2.3 million visitors.
    </p>
    
    <p class="content">
    At 605 ft (184 m) high the Space Needle was once the tallest structure west of the Mississippi River. The tower is 138 ft (42 m) wide, weighs 9,550 short tons (8,660 metric tons), and is built to withstand winds of up to 200 mph (320 km/h) and earthquakes of up to 9.0 magnitude, as strong as the 1700 Cascadia earthquake.
    </p>
    """
    },
    
    {
        'url': 'https://en.wikipedia.org/wiki/Lower_Queen_Anne,_Seattle',
        'html_doc':  """
    <html><head><title>Lower Queen Anne, Seattle</title></head>
    <body>
    <p class="title"><b>Lower Queen Anne, Seattle</b></p>
    
    <p class="content">
    Lower Queen Anne (officially known since 2021 as Uptown)[1] is a neighborhood in <a href="https://en.wikipedia.org/wiki/Seattle" id="link_seattle">Seattle</a>, Washington, at the base of Queen Anne Hill. While its boundaries are not precise, the toponym usually refers to the shopping, office, and residential districts to the north and west of Seattle Center. The neighborhood is connected to Upper Queen Anne—the shopping district at the top of the hill—by an extremely steep section of Queen Anne Avenue N. known as the Counterbalance, in memory of the cable cars that once ran up and down it.
    </p>
    
    <p class="content">
    While "Lower Queen Anne" and "Uptown" are rarely used to refer to the grounds of Seattle Center itself, most of Seattle Center is in the neighborhood; these include Climate Pledge Arena (home of the Seattle Storm of the WNBA and the Seattle Kraken of the NHL), the Exhibition Hall, McCaw Hall (home of the Seattle Opera and Pacific Northwest Ballet), the Cornish Playhouse (home of the Intiman Summer Theatre Festival and Cornish College of the Arts), the Bagley Wright Theater (home of Seattle Repertory Theatre), and the studios for KEXP radio. Lower Queen Anne also has a three-screen movie theater, the SIFF Cinema Uptown,[2] and On the Boards, a center for avant-garde theater and music.
    </p>
    """
    },
    
    # Demo documents that are not very informative, but they are retrieved by the vector store in illustrative examples
    {
        'url': 'https://TheSpaceNeedleisGreat',
        'html_doc': """
    <html><head><title>The Space Needle is Great.</title></head>
    <body><p class="title"><b>The Space Needle is Great.</b></p>
    <p class="content">The Space Needle is Great.</p>
    """
    },
    {
        'url': 'https://TheSpaceNeedleisTALL',
        'html_doc': """
    <html><head><title>The Space Needle is TALL.</title></head>
    <body><p class="title"><b>The Space Needle is TALL.</b></p>
    <p class="content">The Space Needle is TALL.</p>
    """
    },
    {
        'url': 'https://SeattleIsOutWest',
        'html_doc': """
    <html><head><title>Seattle is Out West</title></head>
    <body><p class="title"><b>Seattle is Out West</b></p>
    <p class="content">Seattle is Out West</p>
    """
    },
    {
        'url': 'https://QueenAnneWasAPerson',
        'html_doc': """
    <html><head><title>Queen Anne Was a Person</title></head>
    <body><p class="title"><b></b></p>
    <p class="content">Queen Anne Was a Person</p>
    """
    },
    ]
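    As a quick check of the dataset before any processing, you can confirm how many documents it contains and list their URLs. This snippet only reads the html_doc_list defined above:

    # Confirm the dataset size and list the source URLs
    print(f"{len(html_doc_list)} documents in the dataset:")
    for item in html_doc_list:
        print(' -', item['url'])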
  2. Build a knowledge graph from the data.

    The following code uses BeautifulSoup and HtmlLinkExtractor to parse and process the HTML documents into a knowledge graph. Documents that have links to each other are also connected in the graph.

    from bs4 import BeautifulSoup
    from pprint import pprint
    
    from langchain_core.documents import Document
    from langchain_community.graph_vectorstores.links import add_links
    from langchain_community.graph_vectorstores.extractors.html_link_extractor import HtmlInput, HtmlLinkExtractor
    
    
    def process_html_doc(html_doc, url):
        soup_doc = BeautifulSoup(html_doc, 'html.parser')
        doc = Document(
            page_content=soup_doc.get_text(),
            metadata={"source": url}
        )
        html_link_extractor = HtmlLinkExtractor()
        add_links(doc, html_link_extractor.extract_one(HtmlInput(soup_doc, url)))
        return doc
    
    # The processed documents
    docs = [process_html_doc(x['html_doc'], x['url'])
            for x in html_doc_list]
  3. Set up a GraphVectorStore, and then add documents:

    from langchain_openai import OpenAIEmbeddings
    from langchain_community.graph_vectorstores.cassandra import CassandraGraphVectorStore
    import cassio
    
    # Initialize AstraDB / Cassandra connections.
    cassio.init(auto=True)
    
    # Create a GraphVectorStore, combining Vector nodes and Graph edges
    gvstore = CassandraGraphVectorStore(
        embedding=OpenAIEmbeddings(),
        table_name='graph_rag_tutorial_store',
        keyspace='default_keyspace',  # Change this value if you are using a different keyspace.
    )
    
    # Add documents to the GVS
    doc_ids = gvstore.add_documents(docs)
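    Optionally, confirm the insert. add_documents returns one ID per stored document, so a simple length check is enough:

    # One ID is returned for each document stored in the graph vector store
    print(f"Inserted {len(doc_ids)} documents")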
  4. Set up the OpenAI model and prompt:

    from langchain_openai import ChatOpenAI
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.runnables import RunnablePassthrough
    
    llm = ChatOpenAI(model="gpt-4o")
    
    template = """Answer the question based only on the following context:
    
    {context}
    
    Question: {question}
    """
    prompt = ChatPromptTemplate.from_template(template)
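    To see how the two placeholders are filled before anything is sent to the model, you can render the prompt with sample values. This step is purely illustrative:

    # Preview the prompt with placeholder values for context and question
    preview = prompt.invoke({"context": "<retrieved documents go here>",
                             "question": "<your question goes here>"})
    print(preview.to_string())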

Use the knowledge graph for retrieval

Run the following commands in a Jupyter notebook or similar environment. For example, you can use this tutorial’s Colab notebook. To run these commands in a different environment, you must modify them accordingly.

  1. Configure the retriever based on the GraphVectorStore, and then use it for retrieval:

    # Set up the retriever based on the GraphVectorStore
    retriever = gvstore.as_retriever(
        search_kwargs={
            "depth": 0,
            "k": 8
        }
    )

    The depth parameter defines the number of steps the retriever traverses the graph from each document in the initial set. A value of 0 means there is no graph traversal.

    The k parameter defines the number of documents retrieved by vector search in the initial retrieval step. Documents retrieved from graph traversal are added to this set.

  2. Send a prompt to the retriever:

    QUESTION = "What is close to the Space Needle?"
    # QUESTION = "What is in the Lower Queen Anne neighborhood?"
    # QUESTION = "What is in the same neighborhood as the Space Needle?"
    
    results = retriever.invoke(QUESTION)
    
    pprint([x.metadata['source']
           for x in results])
    Example result
    ['https://TheSpaceNeedleisGreat',
     'https://TheSpaceNeedleisTALL',
     'https://en.wikipedia.org/wiki/Space_Needle',
     'https://en.wikipedia.org/wiki/Lower_Queen_Anne,_Seattle']
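    To see the effect of graph traversal, you can repeat the same retrieval with depth set to 1, so that documents linked from the initial hits are also pulled in. The exact set returned depends on the links stored in the graph:

    # Same query, but follow graph links one step out from the initial hits
    retriever_depth1 = gvstore.as_retriever(
        search_kwargs={"depth": 1, "k": 8}
    )
    results_depth1 = retriever_depth1.invoke(QUESTION)
    pprint([x.metadata['source'] for x in results_depth1])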

Use the knowledge graph for retrieval and generation (end-to-end RAG)

The preceding example demonstrated retrieval only. The following example shows how to do both retrieval and generation with an LLM.

Run the following commands in a Jupyter notebook or similar environment. For example, you can use this tutorial’s Colab notebook. To run these commands in a different environment, you must modify them accordingly.

  1. Set up a chain for end-to-end RAG, following the standard LangChain pattern:

    retriever = gvstore.as_retriever(
        search_kwargs={
            "depth": 0,  # depth of graph traversal; 0 is no traversal at all
            "k": 3
        }
    )
    
    # helper function for formatting
    def format_docs(docs):
        return "\n\n".join([d.page_content for d in docs])
    
    chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )
  2. Send a prompt to the chain, and then receive a response:

    QUESTION = "What is close to the Space Needle?"
    # # QUESTION = "What is in the Lower Queen Anne neighborhood?"
    # # QUESTION = "What is in the same neighborhood as the Space Needle?"
    
    response = chain.invoke(QUESTION)
    pprint(response)
    Example result
    'The Space Needle is located in the Lower Queen Anne neighborhood.'

Use deconstructed retrieval and generation

With a deconstructed RAG process, you run retrieval and generation as separate steps so that you can inspect the intermediate retrieval results that serve as the input to the generation step.

Run the following commands in a Jupyter notebook or similar environment. For example, you can use this tutorial’s Colab notebook. To run these commands in a different environment, you must modify them accordingly.

QUESTION = "What is close to the Space Needle?"
# QUESTION = "What is in the Lower Queen Anne neighborhood?"
# QUESTION = "What is in the same neighborhood as the Space Needle?"

retriever = gvstore.as_retriever(
    search_kwargs={
        "depth": 1,  # depth of graph traversal; 0 is no traversal at all
        "k": 3       # number of docs returned by initial vector search---not including graph Links
    }
)

results = retriever.invoke(QUESTION)
input = {"context": format_docs(results),   # uses retrieved `results`
         "question": QUESTION}
chain = (
    prompt
    | llm
    | StrOutputParser()
)
response = chain.invoke(input)    # invoke the chain starting with `results`

# output
print('Question:\n', QUESTION, '\n')
print('Retrieved documents:')
pprint([x.metadata['source']
       for x in results])
print('\nLLM response:')
pprint(response)
Example result
Question:
 What is close to the Space Needle?

Retrieved documents:
['https://TheSpaceNeedleisGreat',
 'https://TheSpaceNeedleisTALL',
 'https://en.wikipedia.org/wiki/Space_Needle',
 'https://en.wikipedia.org/wiki/Lower_Queen_Anne,_Seattle']

LLM response:
('The Space Needle is located in the Lower Queen Anne neighborhood, which is '
 'also home to various attractions including Seattle Center, Climate Pledge '
 'Arena, Exhibition Hall, McCaw Hall, Cornish Playhouse, Bagley Wright '
 'Theater, and the studios for KEXP radio.')

Cleanup

After completing the tutorial, you can erase the tutorial data from your Astra DB account:

  • You can delete the entire database.

  • You can delete the tutorial table or clear its data.

    For example, you can run the following code in your notebook environment to clear all data from the graph_rag_tutorial_store table in the default_keyspace of your Astra DB database:

    # Specify the keyspace and table name
    keyspace = 'default_keyspace'
    table_name = 'graph_rag_tutorial_store'
    
    # Execute the TRUNCATE command to delete all data in the specified table
    session = cassio.config.resolve_session()
    session.execute(f"TRUNCATE {keyspace}.{table_name}")
    
    print(f"All data in table '{table_name}' in keyspace '{keyspace}' has been dropped.")

Next steps

For more information on building or extending this graph RAG system, see Graph Vector Store in the LangChain documentation.
