Local Development Environment

Set up a local development environment and install RAGStack.

Venv

  1. Install Python 3.11 or higher.

  2. Create a virtual environment, activate it, and install RAGStack.

    python -m venv <venv-name>
    source <venv-name>/bin/activate
    pip install ragstack-ai

  3. Once you’re satisfied with your local environment, freeze its dependencies to a requirements.txt file. This file can then be used to recreate the environment elsewhere:

    pip freeze > requirements.txt

  4. To take your local environment to a production setting, create a new virtual environment and install the dependencies from the requirements.txt file:

    pip install -r requirements.txt

  5. To deactivate the virtual environment, type deactivate.
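
As a quick sanity check after activating the environment, the following minimal sketch confirms that the interpreter meets the Python 3.11 requirement from step 1 and is running inside a venv. The function names are illustrative and are not part of RAGStack.

```python
import sys

# Minimum version stated in step 1 of the venv instructions.
MIN_PYTHON = (3, 11)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the major.minor version is 3.11 or higher."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Python version supported:", python_is_supported())
    print("Running inside a venv:", in_virtualenv())
```

This check is specific to environments created with python -m venv. Conda environments usually report the same value for both prefixes, so the second result is not meaningful there.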

Conda

  1. Install Conda or Miniconda.

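  2. Create a virtual environment with its own Python interpreter (3.11 in this example), activate it, and install RAGStack. Without a Python version in the create command, conda creates an empty environment and pip may install into a different interpreter.

    conda create --name <venv-name> python=3.11
    conda activate <venv-name>
    pip install ragstack-ai

  3. Once you’re satisfied with your local environment, export it to a YAML file. This file can then be used to recreate the environment elsewhere: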

    conda env export > environment.yml

    It’s a good practice to keep environment.yml in version control to ensure reproducibility.

  4. To take your local environment to a production setting, create a new conda environment from the environment.yml file:

    conda env create --name prod-ragstack --file environment.yml

    This creates a new conda environment named prod-ragstack with the same packages and versions as your local environment.

  5. To deactivate the virtual environment, type conda deactivate.
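
Because pip installs into whichever interpreter is first on your PATH, it can be worth confirming that the running Python actually belongs to the active conda environment. The sketch below compares sys.prefix with the CONDA_PREFIX variable that conda activate sets. The function name is illustrative.

```python
import os
import sys


def interpreter_in_conda_env(prefix=sys.prefix, environ=os.environ) -> bool:
    """Return True when `prefix` is the directory of the active conda env.

    `conda activate` exports CONDA_PREFIX with the environment's path.
    If the variable is missing, no conda environment is active.
    """
    conda_prefix = environ.get("CONDA_PREFIX")
    if not conda_prefix:
        return False

    def normalize(path):
        # Resolve symlinks and normalize case so equal paths compare equal.
        return os.path.normcase(os.path.realpath(path))

    return normalize(prefix) == normalize(conda_prefix)


if __name__ == "__main__":
    print("Interpreter belongs to the active conda env:",
          interpreter_in_conda_env())
```

If this prints False after conda activate, the environment probably has no Python of its own, and packages installed with pip went somewhere else.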

Poetry

Poetry requires Python 3.8+.

  1. Install Poetry.

  2. If you already have a poetry.lock file, use poetry add ragstack-ai to add RAGStack to your project. If not, continue to the next step.

  3. In your application directory, run poetry init to create a pyproject.toml file. Poetry will ask you a few questions about your project and create the .toml file for you.

    poetry init
    
    This command will guide you through creating your pyproject.toml config.
    
    Package name [temporary-astra]:
    Version [0.1.0]:
    Description []:
    Author [Mendon Kissling <59585235+mendonk@users.noreply.github.com>, n to skip]:
    License []:
    Compatible Python versions [^3.11]:
    
    Would you like to define your main dependencies interactively? (yes/no) [yes] yes
    Package to add or search for (leave blank to skip): ragstack-ai
    Enter package # to add, or the complete package name if it is not listed []:
     [ 0] ragstack-ai
     > 0
    Enter the version constraint to require (or leave blank to use the latest version):
    Using version ^0.1.2 for ragstack-ai

  4. When asked Would you like to define your main dependencies interactively? (yes/no), type yes.

  5. When prompted Package to add or search for (leave blank to skip):, type ragstack-ai and leave the version constraint blank to use the latest version.

  6. Once the Poetry virtual environment is created, type poetry shell to activate it as a nested shell.

  7. Type poetry install. This command reads the pyproject.toml file, resolves and downloads the latest versions of the dependencies that satisfy its constraints, and installs them in the virtual environment. All packages and their exact versions are written to the poetry.lock file, locking the project to those specific versions. Commit the poetry.lock file to your project repo so that everyone working on the project uses the same dependency versions.

    poetry install
    Updating dependencies
    Resolving dependencies...
    
    Package operations: 65 installs, 0 updates, 0 removals
    
      • Installing click (8.1.7)
    
    ...
    
    Writing lock file
    
    Installing the current project: temporary-astra (0.1.0)

  8. To deactivate the virtual environment, type exit.
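
Whichever tool you used, you can confirm from inside the activated environment that the ragstack-ai distribution is installed and see which version was resolved. This minimal sketch uses only the standard library. The helper name installed_version is illustrative.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("ragstack-ai")
    if found is None:
        print("ragstack-ai is not installed in this environment")
    else:
        print(f"ragstack-ai {found} is installed")
```

Run it with the environment's own interpreter (for example, inside poetry shell) so that the lookup inspects the right site-packages directory.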

Connect to your vector-enabled Astra DB Serverless database

RAGStack includes the Astrapy library for connecting your local development environment to your vector-enabled Astra DB Serverless database.

  1. If you don’t have a vector database, create one at https://astra.datastax.com/. You need the following to connect:

    The Astra application token must have Database Administrator permissions (e.g. AstraCS:WSnyFUhRxsrg…).

    The Astra API endpoint is available in the Astra Portal (e.g. https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com).

    Create an OpenAI key at https://platform.openai.com/ (e.g. sk-xxxx).

    You must have an existing collection in Astra (e.g. test).

  2. Create a .env file in the root of your project with the values from your Astra Connect tab.

    ASTRA_DB_APPLICATION_TOKEN="<AstraCS:...>"
    ASTRA_DB_API_ENDPOINT="<Astra DB API endpoint>"
    OPENAI_API_KEY="sk-..."
    ASTRA_DB_COLLECTION="test"

  3. Test your connection to the database. Create a vector store and print the contents of the data collection:

    The example imports load_dotenv from the python-dotenv package. To install it, run pip install python-dotenv.

    import os

    from dotenv import load_dotenv
    from langchain_astradb import AstraDBVectorStore
    from langchain_openai import OpenAIEmbeddings

    # Load the values from the .env file into the process environment.
    load_dotenv()

    ASTRA_DB_APPLICATION_TOKEN = os.environ["ASTRA_DB_APPLICATION_TOKEN"]
    ASTRA_DB_API_ENDPOINT = os.environ["ASTRA_DB_API_ENDPOINT"]
    ASTRA_DB_COLLECTION = os.environ["ASTRA_DB_COLLECTION"]

    # OpenAIEmbeddings reads OPENAI_API_KEY from the environment.
    embedding = OpenAIEmbeddings()
    vstore = AstraDBVectorStore(
        embedding=embedding,
        collection_name=ASTRA_DB_COLLECTION,
        token=ASTRA_DB_APPLICATION_TOKEN,
        api_endpoint=ASTRA_DB_API_ENDPOINT,
    )

    # Print the documents currently stored in the collection.
    print(vstore.astra_db.collection(ASTRA_DB_COLLECTION).find())

  4. You should get the following output, indicating your collection contains no documents:

    {'data': {'documents': [], 'nextPageState': None}}

  5. With your local environment connected to your vector database, continue on to the quickstart to load data and start querying.
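
A missing or empty value in the .env file tends to surface later as a less obvious connection or authentication error. The following minimal sketch checks for the four variables from step 2 before you connect. It reads the process environment, so it assumes load_dotenv() has already been called. The names REQUIRED_SETTINGS and missing_settings are illustrative.

```python
import os

# The four variables defined in the .env file in step 2.
REQUIRED_SETTINGS = (
    "ASTRA_DB_APPLICATION_TOKEN",
    "ASTRA_DB_API_ENDPOINT",
    "OPENAI_API_KEY",
    "ASTRA_DB_COLLECTION",
)


def missing_settings(environ=os.environ):
    """Return the names of required settings that are unset or blank."""
    return [
        name
        for name in REQUIRED_SETTINGS
        if not environ.get(name, "").strip()
    ]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present")
```

The check only verifies that each value is present. It does not validate the token or contact the database.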

© 2024 DataStax