Build a Hotel Search Application with RAGStack and Astra DB Serverless

This page demonstrates how to build a hotel search application using RAGStack and a vector-enabled Astra DB Serverless database.

The application uses a vector-enabled Astra DB Serverless database to store hotel data, and RAGStack to search for hotels and generate summaries.

See the Hotels App README for more details on getting the app running (including on Gitpod).

Prerequisites

  1. Clone the Git repository and change to that directory.

    git clone https://github.com/DataStax-Examples/langchain-astrapy-hotels-app.git
    cd langchain-astrapy-hotels-app
  2. You will need a vector-enabled Astra DB Serverless database.

    1. Create an Astra vector database.

    2. Within your database, create an Astra DB Access Token with Database Administrator permissions.

    3. Copy your Astra DB Serverless API Endpoint for the vector-enabled Astra DB Serverless database, as displayed in Astra Portal.

  3. Set the following environment variables in a .env file in langchain-astrapy-hotels-app (you can use the provided .env.template as an example):

    OPENAI_API_KEY=sk-...
    ASTRA_DB_API_ENDPOINT=https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com
    ASTRA_DB_APPLICATION_TOKEN=AstraCS:...
  4. Install the required dependencies:

    pip install -r requirements.txt
  5. Verify you have a recent version (7.0+) of npm (needed to run the client):

    npm --version

See the Prerequisites page for more details on finding the values for the environment variables in step 3.
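Before moving on, it can help to confirm that the .env file contains all three variables in a plausible form. The following is a minimal sketch, not part of the repository: `parse_dotenv` and `check_env` are hypothetical helpers, and the endpoint pattern and the `AstraCS:` token prefix are assumptions drawn only from the example values in step 3.

```python
import re

# Variable names taken from the .env example in step 3.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "ASTRA_DB_API_ENDPOINT",
    "ASTRA_DB_APPLICATION_TOKEN",
)

# Assumed endpoint shape, inferred only from the example value in step 3.
ENDPOINT_PATTERN = re.compile(r"^https://[^/\s]+\.apps\.astra\.datastax\.com/?$")


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_env(env):
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name, "").strip():
            problems.append(f"{name} is missing or empty")
    endpoint = env.get("ASTRA_DB_API_ENDPOINT", "").strip()
    if endpoint and not ENDPOINT_PATTERN.match(endpoint):
        problems.append("ASTRA_DB_API_ENDPOINT does not match the expected endpoint shape")
    token = env.get("ASTRA_DB_APPLICATION_TOKEN", "").strip()
    if token and not token.startswith("AstraCS:"):
        problems.append("ASTRA_DB_APPLICATION_TOKEN does not start with 'AstraCS:'")
    return problems


# Example usage, run from langchain-astrapy-hotels-app:
#   with open(".env") as f:
#       print(check_env(parse_dotenv(f.read())))
```

This only checks the shape of the values. It does not contact OpenAI or Astra DB, so a passing check does not prove that the credentials are valid.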

Load the data

  1. From the root folder, run four Python scripts to populate your database with data collections.

    Python:

    python -m setup.2-populate-review-vector-collection
    python -m setup.3-populate-hotels-and-cities-collections
    python -m setup.4-create-users-collection
    python -m setup.5-populate-reviews-collection

    Result:

    ** [JustPreCalculatedEmbeddings] INFO: embed request for 'This is a sample sentence.'. Returning moot results
    
    [2-populate-review-vector-collection.py] Finished. 10000 rows written.
    [3-populate-hotels-and-cities-collections.py] Inserted 1433 hotels
    [3-populate-hotels-and-cities-collections.py] Inserted 842 cities
    [5-populate-reviews-collection.py] Inserted 10000 reviews
  2. Each script populates a different collection in your vector-enabled Astra DB Serverless database, including a collection of precalculated embeddings for vector search.

The application will use these collections to deliver valuable, personalized results to users.
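If you prefer to run the four setup scripts as one step, a small wrapper can call them in order and stop at the first failure. This is a sketch under the assumption that you run it from the repository root. `build_commands` and `run_all` are hypothetical helpers, not part of the repository. The module names are copied from the commands above.

```python
import subprocess
import sys

# Module names copied from the commands above, in the order shown.
SETUP_MODULES = [
    "setup.2-populate-review-vector-collection",
    "setup.3-populate-hotels-and-cities-collections",
    "setup.4-create-users-collection",
    "setup.5-populate-reviews-collection",
]


def build_commands(python=sys.executable):
    """Return one `python -m <module>` argument list per setup script."""
    return [[python, "-m", module] for module in SETUP_MODULES]


def run_all(commands, runner=subprocess.run):
    """Run each command in order.

    check=True makes subprocess.run raise CalledProcessError when a script
    exits with a non-zero status, so later scripts are not run.
    """
    for command in commands:
        print("Running:", " ".join(command))
        runner(command, check=True)


# Example usage, from the repository root:
#   run_all(build_commands())
```

The `runner` parameter exists only so the ordering can be exercised without touching a database. In normal use, leave it at its default.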

Run the application

Now that your vector database is populated, run the application frontend to see the results.

  1. Open a new terminal and start the API server.

    uvicorn api:app --reload
  2. Open a new terminal and change directory to the client folder (cd client). Install the node dependencies and start the application.

    npm install
    npm start
  3. Open http://localhost:3000 to view the application in your browser. Click "Login" in the upper right corner, enter any values for the username and password, and click Login.

  4. Enter US for the country and a US city for the location, and click Search.

  5. The application lists hotels, including an OpenAI-generated summary of reviews from the reviews collection.

  6. Selecting "Details" will show more information about the hotel, including a summary based on your Preferences, stored in the users collection.

  7. The "Preferences" section lets you edit your profile, so that the "Details" for a hotel will be re-calculated, possibly highlighting different reviews and adapting the AI-generated summary at the top.
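If the page does not load, a quick check is whether both processes are listening. The sketch below uses only the Python standard library. Port 3000 comes from the URL in step 3. Port 8000 is uvicorn's default when no `--port` option is given, which is an assumption about how the API server is started here. `port_is_open` and `report` are hypothetical helper names.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 3000: from the URL in step 3. 8000: uvicorn's default port (assumed unchanged).
SERVICES = {"API server": 8000, "client": 3000}


def report(host="localhost"):
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in SERVICES.items()}


# Example usage:
#   print(report())
```

A `True` value only means that something is accepting connections on that port, not that it is the hotel application.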

Cleanup

  1. Press Ctrl+C in both terminals to stop the API server and the application.

  2. Launch the following cleanup script to delete all collections used by the application:

    python -m setup.cleanup-delete-all-collections
