Build a RAG command line chatbot

query_builder 20 min

This tutorial demonstrates how to build a command line chatbot. The chatbot uses data from your Astra collection for retrieval-augmented generation (RAG) with OpenAI.

This example uses the collection of book summaries from the quickstart, but you can use any collection of documents that have the $vectorize field populated with the text you want used as context when answering questions.

Prerequisites

Python
TypeScript
Java

An endpoint and application token for a Serverless (Vector) database, and a collection in that database with documents that have the $vectorize field populated.

If you don’t already have this, follow the quickstart.
An OpenAI API key.
Python version 3.8 or later.

An endpoint and application token for a Serverless (Vector) database, and a collection in that database with documents that have the $vectorize field populated.

If you don’t already have this, follow the quickstart.
An OpenAI API key.
Node version 18 or later.
TypeScript version 5 or later. If you are using JavaScript instead of TypeScript, you don’t need TypeScript installed.

An endpoint and application token for a Serverless (Vector) database, and a collection in that database with documents that have the $vectorize field populated.

If you don’t already have this, follow the quickstart.
An OpenAI API key.
Java version 17 or later. DataStax recommends Java 21.
Maven version 3.9 or later, or Gradle version 11 or later.

Store your credentials

For this tutorial, store your database endpoint, database application token, and OpenAI API key in environment variables:

Linux or macOS
Windows

export API_ENDPOINT=API_ENDPOINT
export APPLICATION_TOKEN=APPLICATION_TOKEN
export OPENAI_API_KEY=OPENAI_API_KEY

set API_ENDPOINT=API_ENDPOINT

set APPLICATION_TOKEN=APPLICATION_TOKEN

set OPENAI_API_KEY=OPENAI_API_KEY

Install packages

Python
TypeScript
Java

Install the astrapy and openai packages.

For example:

pip install astrapy openai

Install the @datastax/astra-db-ts and openai packages.

For example:

npm install @datastax/astra-db-ts openai

Add a dependency to the @datastax/astra-db-java and openai packages.

For example:

Maven
Gradle

pom.xml

<dependencies>
  <dependency>
    <groupId>com.datastax.astra</groupId>
    <artifactId>astra-db-java</artifactId>
    <version>2.1.4</version>
  </dependency>
  <dependency>
    <groupId>com.openai</groupId>
    <artifactId>openai-java</artifactId>
    <version>4.0.0</version>
  </dependency>
</dependencies>

build.gradle(.kts)

dependencies {
    implementation 'com.datastax.astra:astra-db-java:2.1.4'
    implementation 'com.openai:openai-java:4.0.0'
}

Add the code

Python
TypeScript
Java

import os
import sys

from astrapy import DataAPIClient
from openai import OpenAI


def main() -> None:
    endpoint = os.environ.get("API_ENDPOINT")  (1)
    application_token = os.environ.get("APPLICATION_TOKEN")
    openai_api_key = os.environ.get("OPENAI_API_KEY")

    keyspace = "default_keyspace"  (2)
    collection_name = "quickstart_collection"  (3)

    if not endpoint or not application_token or not openai_api_key:
        raise RuntimeError(
            "Environment variables API_ENDPOINT, APPLICATION_TOKEN, OPENAI_API_KEY must be defined."
        )

    # Instantiate the DataAPIClient and get a reference to your collection
    client = DataAPIClient()
    database = client.get_database(endpoint, token=application_token, keyspace=keyspace)
    collection = database.get_collection(collection_name)

    # Instantiate the OpenAI client
    openai = OpenAI(api_key=openai_api_key)

    # This list of messages will be sent to OpenAI with every query.
    # It starts with a single system prompt and grows as the chat progresses.
    messages = [
        {
            "role": "system",
            "content": "You are an AI assistant that can answer questions based on the the context you are given. Don't mention the context, just use it to inform your answers.",
        }
    ]

    # Start the chat by writing a message to the CLI
    print(
        "Greetings! I am an AI assistant that is ready to help you with your questions. "
        "You can ask me anything you like.\n"
        'If you want to exit, type ".exit".\n'
    )

    user_input = input("> ")

    # Run this loop continuously until the user inputs the exit command
    while user_input.lower() != ".exit":
        # If the user didn't input text, re-prompt them
        if user_input.strip() == "":
            user_input = input("> ")
            continue

        try:
            # Perform a vector search in your collection,
            # using the user input as the search string to vectorize.
            # Limit the search to 10 documents.
            # Use a projection to return just the $vectorize field of each document.
            cursor = collection.find(
                {},
                sort={"$vectorize": user_input},
                limit=10,
                projection={"$vectorize": 1},
            )

            # Join the $vectorize fields of the returned documents into a single string
            docs = cursor.to_list()
            context = "\n".join((doc.get("$vectorize") or "") for doc in docs).strip()

            # Combine the user question with the context from vector search
            rag_message = {
                "role": "user",
                "content": (
                    f"{context}\n---\n"
                    "Given the above context, answer the following question:\n"
                    f"{user_input}"
                ),
            }

            # Send the list of previous messages, plus the context augmented question
            # to OpenAI and stream the response
            stream = openai.chat.completions.create(
                model="gpt-4o-mini",
                messages=[*messages, rag_message],
                stream=True,
            )

            # Write OpenAI's response to the CLI as it comes in,
            # and also record it in a string
            message = ""
            for chunk in stream:
                delta = (chunk.choices[0].delta.content or "") if chunk.choices else ""
                if delta:
                    sys.stdout.write(delta)
                    sys.stdout.flush()
                    message += delta

            # Record the user question, without the added context, in the list of messages
            messages.append({"role": "user", "content": user_input})

            # Record the OpenAI response in the list of messages
            messages.append({"role": "assistant", "content": message})

            # Prompt the user for their next question
            user_input = input("\n\n> ")

        except Exception as e:
            print(str(e), file=sys.stderr)
            user_input = input("\nSomething went wrong, try asking again\n\n> ")


if __name__ == "__main__":
    try:
        main()
    except Exception as err:
        print(str(err), file=sys.stderr)
        sys.exit(1)

1	Store your database’s endpoint, application token, and OpenAI key in environment variables named `API_ENDPOINT`, `APPLICATION_TOKEN`, and `OPENAI_API_KEY`, as instructed in Store your credentials.
2	Change the keyspace name if your collection is in a different keyspace.
3	Change the collection name if you are not using the collection created in the quickstart.

import { DataAPIClient } from "@datastax/astra-db-ts";
import OpenAI from "openai";

import { createInterface } from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";

export async function main() {
  const {
    API_ENDPOINT: endpoint,
    APPLICATION_TOKEN: applicationToken,
    OPENAI_API_KEY: openaiApiKey,
  } = process.env; (1)
  const keyspace = "default_keyspace"; (2)
  const collectionName = "quickstart_collection"; (3)

  if (!endpoint || !applicationToken || !openaiApiKey) {
    throw new Error(
      "Environment variables API_ENDPOINT, APPLICATION_TOKEN, OPENAI_API_KEY must be defined.",
    );
  }

  // Instantiate the DataAPIClient and get a reference to your collection
  const client = new DataAPIClient();
  const database = client.db(endpoint, { token: applicationToken, keyspace });
  const collection = database.collection(collectionName);

  // Instantiate the OpenAI client
  const openai = new OpenAI({
    apiKey: openaiApiKey,
  });

  // This list of messages will be sent to OpenAI with every query.
  // It starts with a single system prompt and grows as the chat progresses.
  const messages: OpenAI.ChatCompletionMessageParam[] = [
    {
      role: "system",
      content:
        "You are an AI assistant that can answer questions based on the the context you are given. Don't mention the context, just use it to inform your answers.",
    },
  ];

  // Use the built-in Node.js readline to implement the CLI
  const cli = createInterface({ input, output });

  // Start the chat by writing a message to the CLI
  let userInput = await cli.question(
    `Greetings! I am an AI assistant that is ready to help you with your questions. You can ask me anything you like.\nIf you want to exit, type ".exit".\n\n> `,
  );

  // Run this loop continuously until the user inputs the exit command
  while (userInput.toLowerCase() !== ".exit") {
    // If the user didn't input text, re-prompt them
    if (userInput.trim() === "") {
      userInput = await cli.question("> ");
      continue;
    }

    try {
      // Perform a vector search in your collection,
      // using the user input as the search string to vectorize.
      // Limit the search to 10 documents.
      // Use a projection to return just the $vectorize field of each document.
      const response = collection.find(
        {},
        {
          sort: { $vectorize: userInput },
          limit: 10,
          projection: { $vectorize: 1 },
        },
      );

      // Join the $vectorize fields of the returned documents into a single string
      const context = (await response.toArray())
        .map((doc) => doc.$vectorize)
        .join("\n");

      // Combine the user question with the context from vector search
      const ragMessage: OpenAI.ChatCompletionUserMessageParam = {
        role: "user",
        content: `${context}\n---\nGiven the above context, answer the following question:\n${userInput}`,
      };

      // Send the list of previous messages, plus the context augmented question
      // to OpenAI and stream the response
      const stream = await openai.chat.completions.create({
        model: "gpt-4o-mini",
        messages: [...messages, ragMessage],
        stream: true,
      });

      // Write OpenAI's response to the CLI as it comes in,
      // and also record it in a string
      let message = "";
      for await (const chunk of stream) {
        const delta = chunk.choices[0]?.delta?.content ?? "";
        output.write(delta);
        message += delta;
      }

      // Record the user question, without the added context, in the list of messages
      messages.push({ role: "user", content: userInput });

      // Record the OpenAI response in the list of messages
      messages.push({ role: "assistant", content: message });

      // Prompt the user for their next question
      userInput = await cli.question("\n\n> ");
    } catch (error) {
      if (error instanceof Error) {
        console.error(error.message);
      }
      userInput = await cli.question(
        "\nSomething went wrong, try asking again\n\n> ",
      );
    }
  }

  cli.close();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

1	Store your database’s endpoint, application token, and OpenAI key in environment variables named `API_ENDPOINT`, `APPLICATION_TOKEN`, and `OPENAI_API_KEY`, as instructed in Store your credentials.
2	Change the keyspace name if your collection is in a different keyspace.
3	Change the collection name if you are not using the collection created in the quickstart.

package com.example;

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.commands.options.CollectionFindOptions;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Projection;
import com.datastax.astra.client.core.query.Sort;
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.StreamResponse;
import com.openai.models.ChatModel;
import com.openai.models.chat.completions.ChatCompletionAssistantMessageParam;
import com.openai.models.chat.completions.ChatCompletionChunk;
import com.openai.models.chat.completions.ChatCompletionCreateParams;
import com.openai.models.chat.completions.ChatCompletionMessageParam;
import com.openai.models.chat.completions.ChatCompletionSystemMessageParam;
import com.openai.models.chat.completions.ChatCompletionUserMessageParam;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class Example {

  private static final String KEYSPACE = "default_keyspace"; (1)
  private static final String COLLECTION_NAME = "quickstart_collection"; (2)

  public static void main(String[] args) {
    try {
      run();
    } catch (Exception e) {
      System.err.println(e.getMessage());
      System.exit(1);
    }
  }

  private static void run() throws Exception {
    String endpoint = System.getenv("API_ENDPOINT"); (3)
    String applicationToken = System.getenv("APPLICATION_TOKEN");
    String openaiApiKey = System.getenv("OPENAI_API_KEY");

    if (endpoint == null
        || endpoint.isBlank()
        || applicationToken == null
        || applicationToken.isBlank()
        || openaiApiKey == null
        || openaiApiKey.isBlank()) {
      throw new IllegalStateException(
          "Environment variables API_ENDPOINT, APPLICATION_TOKEN, OPENAI_API_KEY must be defined.");
    }

    // Instantiate the DataAPIClient and get a reference to your collection
    Collection<Document> collection =
        new DataAPIClient(applicationToken)
            .getDatabase(endpoint, KEYSPACE)
            .getCollection(COLLECTION_NAME);

    // Instantiate the OpenAI client
    OpenAIClient openai = OpenAIOkHttpClient.builder().apiKey(openaiApiKey).build();

    // This list of messages will be sent to OpenAI with every query.
    // It starts with a single system prompt and grows as the chat progresses.
    List<ChatCompletionMessageParam> history = new ArrayList<>();
    history.add(
        ChatCompletionMessageParam.ofSystem(
            ChatCompletionSystemMessageParam.builder()
                .content(
                    "You are an AI assistant that can answer questions based on the the context you are given. "
                        + "Don't mention the context, just use it to inform your answers.")
                .build()));

    BufferedReader cli =
        new BufferedReader(new InputStreamReader(System.in, StandardCharsets.UTF_8));

    // Start the chat by writing a message to the CLI
    System.out.print(
        "Greetings! I am an AI assistant that is ready to help you with your questions. You can ask me anything you like.\n"
            + "If you want to exit, type \".exit\".\n\n> ");

    String userInput = cli.readLine();

    // Run this loop continuously until the user inputs the exit command
    while (userInput != null && !userInput.equalsIgnoreCase(".exit")) {
      // If the user didn't input text, re-prompt them
      if (userInput.trim().isEmpty()) {
        System.out.print("> ");
        userInput = cli.readLine();
        continue;
      }

      try {
        // Perform a vector search in your collection,
        // using the user input as the search string to vectorize.
        // Limit the search to 10 documents.
        // Use a projection to return just the $vectorize field of each document.
        CollectionFindOptions options =
            new CollectionFindOptions()
                .sort(Sort.vectorize(userInput))
                .limit(10)
                .projection(Projection.include("$vectorize"));
        CollectionFindCursor<Document, Document> response = collection.find(null, options);

        // Join the $vectorize fields of the returned documents into a single string
        StringBuilder contextBuilder = new StringBuilder();
        for (Document document : response) {
          String chunk = document.getString("$vectorize");
          if (chunk != null && !chunk.isBlank()) {
            if (contextBuilder.length() > 0) contextBuilder.append("\n");
            contextBuilder.append(chunk);
          }
        }
        String context = contextBuilder.toString();

        // Combine the user question with the context from vector search
        String ragMessage =
            context
                + "\n---\nGiven the above context, answer the following question:\n"
                + userInput;

        // Send the list of previous messages, plus the context augmented question, to OpenAI
        List<ChatCompletionMessageParam> requestMessages = new ArrayList<>(history);
        requestMessages.add(
            ChatCompletionMessageParam.ofUser(
                ChatCompletionUserMessageParam.builder().content(ragMessage).build()));

        ChatCompletionCreateParams params =
            ChatCompletionCreateParams.builder()
                .model(ChatModel.GPT_4O_MINI)
                .messages(requestMessages)
                .build();

        // Write OpenAI's response to the CLI as it comes in,
        // and also record it in a string
        StringBuilder message = new StringBuilder();

        try (StreamResponse<ChatCompletionChunk> stream =
            openai.chat().completions().createStreaming(params)) {

          stream.stream()
              .forEach(
                  chunk -> {
                    if (chunk.choices() == null || chunk.choices().isEmpty()) return;

                    String delta =
                        chunk.choices().get(0).delta() != null
                            ? chunk.choices().get(0).delta().content().orElse("")
                            : "";

                    if (!delta.isEmpty()) {
                      System.out.print(delta);
                      message.append(delta);
                    }
                  });
        }

        // Record the user question, without the added context, in the list of messages
        history.add(
            ChatCompletionMessageParam.ofUser(
                ChatCompletionUserMessageParam.builder().content(userInput).build()));

        // Record the OpenAI response in the list of messages
        history.add(
            ChatCompletionMessageParam.ofAssistant(
                ChatCompletionAssistantMessageParam.builder().content(message.toString()).build()));

        // Prompt the user for their next question
        System.out.print("\n\n> ");
        userInput = cli.readLine();

      } catch (Exception e) {
        System.err.println(e.getMessage() != null ? e.getMessage() : e);
        System.out.print("\nSomething went wrong, try asking again\n\n> ");
        userInput = cli.readLine();
      }
    }
  }
}

1	Change the keyspace name if your collection is in a different keyspace.
2	Change the collection name if you are not using the collection created in the quickstart.
3	Store your database’s endpoint, application token, and OpenAI key in environment variables named `API_ENDPOINT`, `APPLICATION_TOKEN`, and `OPENAI_API_KEY`, as instructed in Store your credentials.

Test the code

From your terminal, run the code from the previous section.

The terminal should show the welcome message and a > prompt.
Enter a question. For example, Can you recommend a book set on another planet?

The terminal should print the answer from OpenAI, and give the > prompt again.
To exit, type .exit.

Next steps

If you used the the quickstart collection, try making a collection of other data and using that instead.
If the user asks a question unrelated to the collection contents, the vector search still returns 10 documents. However, the similarity scores for these documents will be low, and the documents won’t be relevant to the question.

In this tutorial, similarity scores are not requested, and low-similarity results are still included in the context that is sent to OpenAI.

You can use the includeSimilarity option to return a similarity score for each document. Then, you can omit results with a low similarity score from the context, or prompt the user to ask a more relevant question.
Right now, the message list grows infinitely as the chat progresses. You can truncate the message list so that older messages are discarded.

Build a RAG command line chatbot

Prerequisites

Store your credentials

Install packages

Add the code

Test the code

Next steps

Was this helpful?

Give Feedback