Integrate MongooseJS with Astra DB Serverless

15 min

Many members of the JavaScript community work with document databases through Object Data Modeling (ODM) libraries. In particular, MongooseJS is a popular framework for ODM on top of document databases.

The Data API for Astra DB Serverless is compatible with MongooseJS. This topic explains how to connect MongooseJS to an Astra DB Serverless database.

In Astra Portal, you can quickly create an Astra DB Serverless database, and start coding with familiar paradigms and frameworks. When developing a MongooseJS application backed by Astra DB Serverless, you get full access to a database that is designed for:

  • Simultaneous search and update on distributed data and streaming workloads with ultra-low latency

  • Highly relevant vector search results that eliminate redundancies

This gives you the ease of use and familiarity of MongooseJS, combined with the vector support and scalability of Astra DB Serverless.

Simple configuration update

After making the following configuration change in Node.js apps that use MongooseJS, you can connect to your Astra DB Serverless database. Your apps can continue to run MongooseJS commands as usual. In most cases, no other updates are needed. The change involves calling mongoose.setDriver with the driver from stargate-mongoose and passing isAstra: true in the options argument of mongoose.connect.

Here is the astradb-mongoose.js source:

const mongoose = require("mongoose");
const { driver, createAstraUri } = require("stargate-mongoose");

const connectToAstraDb = async () => {
  const uri = createAstraUri(
    process.env.ASTRA_DB_API_ENDPOINT,
    process.env.ASTRA_DB_APPLICATION_TOKEN,
  );

  mongoose.set("autoCreate", true);
  mongoose.setDriver(driver);

  await mongoose.connect(uri, {
    isAstra: true,
  });
};

module.exports = { connectToAstraDb };

You can find the source code for the movies app on GitHub.

Prerequisites

To get started with the Data API and Astra DB Serverless with MongooseJS:

  • Install Node.js 16.20.2 or later.

  • In Astra Portal, click Create Database.

  • With the Serverless (Vector) card highlighted, enter a database name. Then choose a cloud provider and region.

  • Wait for your Astra DB Serverless database to reach Active status.

Quickstart

  1. Optional: If you decide to use an OpenAI API key with this demo app and you haven't generated a key yet, sign in to your OpenAI account, open the drop-down menu under your name, and then select View API keys. Click Create new secret key, copy the key, and store it securely.

  2. In the Astra Portal, go to Databases, and select your database.

  3. Make sure the database is in Active status, and then, in the Database Details section, click Generate Token.

  4. In the Application Token dialog, click Copy, and then store the token securely. The token format is AstraCS: followed by a unique token string.

    Application tokens created from Database Details have the Database Administrator role for the associated database.

  5. In Database Details, copy your database’s API endpoint. The endpoint format is https://ASTRA_DB_ID-ASTRA_DB_REGION.apps.astra.datastax.com.

  1. In your terminal, assign your token and API endpoint to environment variables.

    Linux or macOS
    export ASTRA_DB_API_ENDPOINT=API_ENDPOINT
    export ASTRA_DB_APPLICATION_TOKEN=TOKEN
    Windows
    set ASTRA_DB_API_ENDPOINT=API_ENDPOINT
    set ASTRA_DB_APPLICATION_TOKEN=TOKEN
    Google Colab
    import os
    os.environ["ASTRA_DB_API_ENDPOINT"] = "API_ENDPOINT"
    os.environ["ASTRA_DB_APPLICATION_TOKEN"] = "TOKEN"

    The MongooseJS driver appends /api/json/v1 to your API Endpoint value, so that requests sent to the cloud server use the Data API with your Astra DB Serverless database.

  2. In a terminal, run:

    npx create-astradb-mongoose-app@latest

    The npx script displays:

    • A prompt asking whether you want to use vector search in the app:

      ✔ Do you want to enable vector search functionality (you will need a funded OpenAI account)? … No / Yes
    • If you enter Yes, the script prompts for your OpenAI API key:

      ✔ Awesome! What is your OpenAI API key? … ***************************************************
  3. The npx script displays confirmations:

    added 73 packages, and audited 74 packages in 5s
    
    7 packages are looking for funding
      run `npm fund` for details
    
    found 0 vulnerabilities
    
    🎉 Congrats! You have successfully created a new Astra DB application with Mongoose!
    👉 Next steps:
       1. Go to the newly created project folder.
          cd astradb-mongoose-app
       2. Run the sample code.
          npm start
       3. Enjoy development!
          😍
  4. Go to the astradb-mongoose-app folder:

    cd astradb-mongoose-app
  5. Use npm to run the app:

    npm start
  6. The app loads a dataset of a few categorized movie summaries into your Serverless (Vector) database. The app can then find a movie based on your favorite genre. Here's an example where Western is selected from the list of genres:

    1️⃣  With the data loaded, I can find a movie based on your favorite genre.
    ? What kind of movie would you like to watch? › - Use arrow-keys. Return to submit.
        Comedy
        Drama
    ❯   Western
        Romance
    
    ✔ What kind of movie would you like to watch? › Western
    Sure! Here is an option for you:
      The Girl of the Golden West (Western, 1915)
      The Girl of the Golden West is a surviving 1915 American Western silent black-and-white film directed by Cecil B. DeMille. It was based on the 1905 play The Girl of the Golden West by David Belasco. Prints of the film survive in the Library of Congress film archive. It was the first of four film adaptations that have been made of the play.
  7. You can then use a vector search by entering a general query; that is, what you want to watch. Example:

    2️⃣  You can also simply describe what you are looking for, and I will find relevant movies. I will use vector search!
    ? Just tell me what you want to watch... › Something funny

    The app performs a vector search and returns three relevant results, such as:

      Laughing Gas (Comedy, 1907)
      Laughing Gas is the title of several short American movies whose plot revolves around real or would-be dentists.
      --
      Bob's Baby (Comedy, 1913)
      Bob's Baby is a 1913 American comedy film.
      --
      The Sanitarium (Comedy, 1910)
      The Sanitarium is a 1910 short comedy film featuring Fatty Arbuckle.
  8. You can then combine a genre with a general search:

    3️⃣  Finally, let's combine the two...
    ✔ First, what genre are you interested in? › Drama
    ✔ And now describe to me what you are looking for... Detectives and criminals
    
      Here are the two most relevant movies based on your request:
      The Criminal Hypnotist (Drama, 1909)
      The Criminal Hypnotist is a 1909 American silent short film directed by D. W. Griffith.
      --
      The Honor of Thieves (Drama, 1909)
      The Honor of Thieves is a 1909 American short silent drama film directed by D. W. Griffith.

Code-level details

Now let's look at the files that make up this Node.js movies app. In an editor such as Visual Studio Code, navigate to the astradb-mongoose-app folder you used in the quickstart.

The source files are also available in this GitHub repo.

astradb-mongoose.js

The Node.js code includes astradb-mongoose.js.

Notice how astradb-mongoose.js calls setDriver and passes isAstra: true to declare that the app uses an Astra DB Serverless database on the backend.

This astradb-mongoose.js module provides a function for establishing a connection to an Astra DB Serverless database using MongooseJS. Key features:

  • Import Dependencies: The module imports MongooseJS for database commands and { driver, createAstraUri } from stargate-mongoose to handle the connection with your Astra DB Serverless database.

  • Environment Variables: The connectToAstraDb function reads environment variables (process.env) to configure the database connection. Again, the variables are:

    • ASTRA_DB_API_ENDPOINT: The https://… URL of your database's API endpoint, which the app uses to reach the Data API for your Astra DB Serverless database.

    • ASTRA_DB_APPLICATION_TOKEN: Authentication token for the application.

  • Construct URI: Uses createAstraUri to construct the connection URI dynamically using the provided environment variables.

  • Configure MongooseJS:

    • mongoose.set("autoCreate", true) tells MongooseJS to create the underlying collection for each model automatically.

    • mongoose.setDriver(driver) sets a custom database driver provided by stargate-mongoose.

  • Connect to Astra DB: Establishes an asynchronous connection to Astra DB using mongoose.connect() with the constructed URI and an option flag isAstra: true.

  • Export: Finally, the connectToAstraDb function is exported from the module so that other parts of the application can use it.

In summary, this module acts as a reusable utility for connecting to Astra DB, abstracting away the details and allowing other parts of your application to connect to the database by simply invoking connectToAstraDb().

Here is the astradb-mongoose.js source:

const mongoose = require("mongoose");
const { driver, createAstraUri } = require("stargate-mongoose");

const connectToAstraDb = async () => {
  const uri = createAstraUri(
    process.env.ASTRA_DB_API_ENDPOINT,
    process.env.ASTRA_DB_APPLICATION_TOKEN,
  );

  mongoose.set("autoCreate", true);
  mongoose.setDriver(driver);

  await mongoose.connect(uri, {
    isAstra: true,
  });
};

module.exports = { connectToAstraDb };
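
Because connectToAstraDb reads both values from process.env, a missing or malformed variable only surfaces when the connection attempt fails. The following is a minimal sketch, not part of the generated app, of a preflight check that could run before connectToAstraDb(). The checkAstraEnv name is hypothetical, and the checks rely only on the formats described in the quickstart (an https:// endpoint and a token that starts with AstraCS:).

```javascript
// Hypothetical helper: not part of stargate-mongoose or the generated app.
// Returns a list of human-readable problems found in the given environment object.
function checkAstraEnv(env = process.env) {
  const problems = [];
  const endpoint = env.ASTRA_DB_API_ENDPOINT;
  const token = env.ASTRA_DB_APPLICATION_TOKEN;

  if (!endpoint) {
    problems.push("ASTRA_DB_API_ENDPOINT is not set");
  } else if (!endpoint.startsWith("https://")) {
    problems.push("ASTRA_DB_API_ENDPOINT should start with https://");
  }

  if (!token) {
    problems.push("ASTRA_DB_APPLICATION_TOKEN is not set");
  } else if (!token.startsWith("AstraCS:")) {
    problems.push("ASTRA_DB_APPLICATION_TOKEN should start with AstraCS:");
  }

  return problems;
}

// Placeholder values left unreplaced are reported as problems.
console.log(
  checkAstraEnv({
    ASTRA_DB_API_ENDPOINT: "API_ENDPOINT",
    ASTRA_DB_APPLICATION_TOKEN: "TOKEN",
  }),
);
```

An app could call checkAstraEnv() first and stop with a clear message if the returned array is not empty, instead of waiting for the connection to fail.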

app.js

Let’s look at app.js, which is the Node.js app with mongoose commands to process the queries.

The app.js file is a standard MongooseJS implementation: it creates the model, loads the data, and submits queries. Apart from the vector-related parts (the $vector field, the vector collection options, and sorting by $vector), nothing in it is specific to Astra DB Serverless or needs to change for Astra DB Serverless.

This app implements the following key features:

  • Environment Variable Configuration: Using the dotenv package, the application reads environment variables from a .env file.

  • MongooseJS and Astra DB Serverless Setup: The script uses MongooseJS to define a schema for movies and connect to Astra DB Serverless, facilitated by the imported connectToAstraDb function.

  • Data Loading: The loadData function:

    • Drops the existing movies collection in the database, if it exists.

    • Defines a new schema for movies, including a special $vector field that is used for vector-based searches.

    • Inserts movies from a local JSON file into the database in batches of 20.

  • User Interaction: Uses the prompts library to get user input, and the chalk library to colorize output. It has three main search features:

    • findMovieByGenre: Asks the user to select a genre and then finds a movie in that genre.

    • findMovieByDescription: Asks the user for a description, then utilizes a generateEmbedding function (leveraging OpenAI’s API in this case) to perform a semantic vector search.

    • findMovieByGenreAndDescription: Combines both the genre and the description-based searches.

  • Search Sorting and Limiting: The application uses the $vector field in the database for semantic matching and ranking of search results.

  • OpenAI API Integration: The script has optional OpenAI API functionality for semantic searches. If the API key is available, it enables the description-based and combined searches.

  • Error Handling: Catches and displays errors, highlighting them in red text using chalk.

  • Asynchronous Execution: All database commands and user interactions are performed asynchronously using async/await.
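
The schema in the app.js source below configures the collection with function: "cosine", and the search queries sort by $vector. As a rough illustration of what that ranking means, the following in-memory sketch scores documents against a query vector with cosine similarity and keeps the top results. It is only a conceptual aid: in the app, the database performs the vector search. The cosineSimilarity and topMatches names and the tiny three-dimensional vectors are made up for this example.

```javascript
// Cosine similarity of two equal-length numeric vectors.
// Note: a zero vector yields NaN because its norm is 0.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank documents by similarity to the query vector, highest first,
// and return at most `limit` of them.
function topMatches(docs, queryVector, limit) {
  return docs
    .map((doc) => ({
      ...doc,
      similarity: cosineSimilarity(doc.$vector, queryVector),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}

// Toy data: real embeddings in this app have 1536 dimensions.
const toyDocs = [
  { title: "A", $vector: [1, 0, 0] },
  { title: "B", $vector: [0, 1, 0] },
  { title: "C", $vector: [0.9, 0.1, 0] },
];

console.log(topMatches(toyDocs, [1, 0, 0], 2).map((d) => d.title)); // [ 'A', 'C' ]
```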

Here’s the app.js source:

require("dotenv").config();

const prompts = require("prompts");
const chalk = require("chalk");
const mongoose = require("mongoose");
const { connectToAstraDb } = require("./astradb-mongoose");
const { generateEmbedding, movieToString, moviesToString } = require("./util");

// Create Mongoose "movies" collection
const loadData = async () => {
  console.log(
    "Dropping existing collection " +
      chalk.bold.cyan("movies") +
      " if it exists...",
  );
  await mongoose.connection.dropCollection("movies");

  console.log(
    "Creating Mongoose collection " +
      chalk.bold.cyan("movies") +
      " and loading data from " +
      chalk.bold.cyan("movies.json") +
      "...",
  );
  const Movie = mongoose.model(
    "Movie",
    new mongoose.Schema(
      {
        title: String,
        year: Number,
        genre: String,
        description: String,
        $vector: {
          type: [Number],
          validate: (vector) => vector && vector.length === 1536,
        },
      },
      {
        collectionOptions: {
          vector: {
            size: 1536,
            function: "cosine",
          },
        },
      },
    ),
  );
  await Movie.init();

  const movies = require("./movies.json");
  console.log(
    "Inserting " + movies.length + " movies including vector embeddings... \n",
  );
  for (let i = 0; i < movies.length; i += 20) {
    await Movie.insertMany(movies.slice(i, i + 20));
  }
};

// "findOne()" pattern using movie 'genre', no vector search
const findMovieByGenre = async () => {
  const { genre } = await prompts({
    type: "select",
    name: "genre",
    message: "What kind of movie would you like to watch?",
    choices: [
      { title: "Comedy", value: "Comedy" },
      { title: "Drama", value: "Drama" },
      { title: "Western", value: "Western" },
      { title: "Romance", value: "Romance" },
    ],
  });

  const movie = await mongoose.model("Movie").findOne({ genre });

  console.log(`Sure! Here is an option for you:
${movieToString(movie)}
`);
};

// "find()" pattern using vector search, generates an embedding from user input
const findMovieByDescription = async () => {
  const { prompt } = await prompts({
    type: "text",
    name: "prompt",
    message: "Just tell me what you want to watch...",
  });

  const embedding = await generateEmbedding(prompt);

  const movies = await mongoose
    .model("Movie")
    .find(
      {},
      { title: 1, genre: 1, year: 1, description: 1 },
      { includeSimilarity: true },
    )
    .sort({ $vector: { $meta: embedding } })
    .limit(3);

  console.log(`Here are the three most relevant movies based on your request:
${moviesToString(movies)}
`);
};

// Hybrid "find()" pattern using movie 'genre' and vector search, generates an embedding from user input
const findMovieByGenreAndDescription = async () => {
  const { genre, prompt } = await prompts([
    {
      type: "select",
      name: "genre",
      message: "First, what genre are you interested in?",
      choices: [
        { title: "Comedy", value: "Comedy" },
        { title: "Drama", value: "Drama" },
        { title: "Western", value: "Western" },
        { title: "Romance", value: "Romance" },
      ],
    },
    {
      type: "text",
      name: "prompt",
      message: "And now describe to me what you are looking for...",
    },
  ]);

  const embedding = await generateEmbedding(prompt);

  const movies = await mongoose
    .model("Movie")
    .find(
      { genre },
      { title: 1, genre: 1, year: 1, description: 1 },
      { includeSimilarity: true },
    )
    .sort({ $vector: { $meta: embedding } })
    .limit(2);

  console.log(`Here are the two most relevant movies based on your request:
${moviesToString(movies)}
`);
};

(async function () {
  try {
    console.log(
      "0️⃣  Connecting to Astra Vector DB using the following values from your configuration..." +
        "\n" +
        chalk.bold.cyan("ASTRA_DB_API_ENDPOINT") +
        " = " +
        process.env.ASTRA_DB_API_ENDPOINT +
        "\n" +
        chalk.bold.cyan("ASTRA_DB_APPLICATION_TOKEN") +
        " = " +
        process.env.ASTRA_DB_APPLICATION_TOKEN.substring(0, 13) +
        "...\n\n",
    );
    await connectToAstraDb();

    console.log("0️⃣  Loading the data to Astra DB (~20s)...");
    await loadData();

    console.log(
      "1️⃣  With the data loaded, I can find a movie based on your favorite genre.",
    );
    await findMovieByGenre();

    if (process.env.OPENAI_API_KEY) {
      console.log(
        "2️⃣  You can also simply describe what you are looking for, and I will find relevant movies. I will use vector search!",
      );
      await findMovieByDescription();

      console.log("3️⃣  Finally, let's combine the two...");
      await findMovieByGenreAndDescription();

      console.log(
        "Be sure to check out the " +
          chalk.bold.cyan("app.js") +
          " and " +
          chalk.bold.cyan("astradb-mongoose.js") +
          " files for code examples using the Data API. \n\nHappy Coding!",
      );
    } else {
      console.log(
        `🚫 I can't generate embeddings without an OpenAI API key.
   Please set the ${chalk.bold(
     "OPENAI_API_KEY",
   )} environment variable or review the code to see how you can use vector search.`,
      );
    }
  } catch (e) {
    console.error(chalk.bold.red("[ERROR] ") + e);
  }
})();
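
The loadData loop inserts the movies 20 at a time by slicing the array. The same pattern can be factored into a small helper. This is a sketch only, and the chunk name is hypothetical rather than something the app defines.

```javascript
// Hypothetical helper: split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 45 items in batches of 20 give two full batches and one partial batch.
const sampleIds = Array.from({ length: 45 }, (_, i) => i);
console.log(chunk(sampleIds, 20).map((batch) => batch.length)); // [ 20, 20, 5 ]
```

With a helper like this, the loop in loadData could iterate over chunk(movies, 20) and pass each batch to Movie.insertMany, which keeps the batch size in one place.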

index.js

The index.js file is a script for setting up a new Astra DB Serverless application using MongooseJS in Node.js.

  • #!/usr/bin/env node - This line allows the script to be run as a standalone executable using Node.js.

  • Module Imports: The script imports necessary Node.js modules:

    • fs for file system operations

    • path for handling file paths

    • prompts for interactive command-line prompts

    • chalk for styling console output

    • cross-spawn for running shell commands

  • Async Self-Invoking Function: The script is wrapped in an asynchronous self-invoking function to allow the use of await within the script.

  • Project Directory Setup:

    • It first resolves the project directory (astradb-mongoose-app) relative to the current working directory.

    • If the project directory already exists, it informs the user and suggests deleting the existing directory to start fresh.

    • Otherwise, it creates the new project directory, copying a template from the script’s directory.

  • Environment Variables and Prompts:

    • The script checks for existing environment variables ASTRA_DB_API_ENDPOINT and ASTRA_DB_APPLICATION_TOKEN.

    • If these are not set, it prompts the user to input them. These are essential for connecting to Astra DB.

    • It also asks if the user wants to enable vector search functionality with OpenAI; and if so, prompts for the OpenAI API key.

  • Writing Environment Variables:

    • It constructs a string with the necessary environment variables and writes this to a .env file in the project directory.

  • Dependency Installation:

    • The script then runs npm install in the project directory to install dependencies.

  • Completion Message:

    • Finally, it prints a congratulatory message, indicating successful creation of the application.

    • It provides instructions for next steps: navigating to the project folder, starting the application, and continuing development.

Overall, this script automates the boilerplate setup process for a new Astra DB Serverless application with Mongoose, making it easier and faster for you to start working on your application.

#!/usr/bin/env node

const fs = require("fs");
const path = require("path");
const prompts = require("prompts");
const chalk = require("chalk");
const spawn = require("cross-spawn");

(async function () {
  const projectDir = path.resolve(process.cwd(), "astradb-mongoose-app");

  if (fs.existsSync(projectDir)) {
    console.log(
      "The project already exists. Delete it and re-run the command if you want to create a new one from scratch.",
    );
    console.log("  " + chalk.bold("rm -rf astradb-mongoose-app"));

    return;
  }

  fs.mkdirSync(projectDir, { recursive: true });
  fs.cpSync(path.resolve(__dirname, "template"), projectDir, {
    recursive: true,
  });

  let apiEndpoint = process.env.ASTRA_DB_API_ENDPOINT;
  let applicationToken = process.env.ASTRA_DB_APPLICATION_TOKEN;

  const answers = await prompts([
    {
      type: apiEndpoint ? null : "text",
      name: "apiEndpoint",
      message: "What is your Data API endpoint?",
    },
    {
      type: applicationToken ? null : "password",
      name: "applicationToken",
      message: "What is your Astra DB application token?",
    },
    {
      type: "toggle",
      name: "useOpenAI",
      message:
        "Do you want to enable vector search functionality (you will need a funded OpenAI account)?",
      initial: true,
      active: "Yes",
      inactive: "No",
    },
    {
      type: (useOpenAI) => (useOpenAI ? "password" : null),
      name: "openAIKey",
      message: "Awesome! What is your OpenAI API key?",
    },
  ]);

  if (!apiEndpoint) {
    apiEndpoint = answers.apiEndpoint;

    if (!apiEndpoint) {
      throw new Error("Missing API endpoint.");
    }
  }

  if (!applicationToken) {
    applicationToken = answers.applicationToken;

    if (!applicationToken) {
      throw new Error("Missing application token.");
    }
  }

  let env = `ASTRA_DB_API_ENDPOINT=${apiEndpoint}\nASTRA_DB_APPLICATION_TOKEN=${applicationToken}`;

  if (answers.openAIKey) {
    env += `\nOPENAI_API_KEY=${answers.openAIKey}`;
  }

  fs.writeFileSync(path.join(projectDir, ".env"), env);

  spawn.sync("npm", ["install"], { cwd: projectDir, stdio: "inherit" });

  console.log(`
-----

🎉 Congrats! You have successfully created a new ${chalk.bold.red(
    "Astra DB",
  )} application with ${chalk.bold.magenta("Mongoose")}!
👉 Next steps:
   1. Go to the newly created project folder.
      ${chalk.bold("cd astradb-mongoose-app")}
   2. Run the sample code.
      ${chalk.bold("npm start")}
   3. Enjoy development!
      😍
`);
})();
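
The .env handling in index.js boils down to joining KEY=value lines and adding OPENAI_API_KEY only when a key was provided. The sketch below isolates that logic in a pure function so it can be examined on its own. The buildEnvFile name is hypothetical, and the sample values are placeholders.

```javascript
// Hypothetical helper: build the contents of a .env file from the collected answers.
// Mirrors the checks in index.js: both Astra DB values are required,
// and the OpenAI key is optional.
function buildEnvFile({ apiEndpoint, applicationToken, openAIKey }) {
  if (!apiEndpoint) {
    throw new Error("Missing API endpoint.");
  }
  if (!applicationToken) {
    throw new Error("Missing application token.");
  }

  const lines = [
    `ASTRA_DB_API_ENDPOINT=${apiEndpoint}`,
    `ASTRA_DB_APPLICATION_TOKEN=${applicationToken}`,
  ];
  if (openAIKey) {
    lines.push(`OPENAI_API_KEY=${openAIKey}`);
  }
  return lines.join("\n");
}

// Without an OpenAI key, only the two Astra DB lines are produced.
console.log(
  buildEnvFile({ apiEndpoint: "API_ENDPOINT", applicationToken: "TOKEN" }),
);
```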

movies.json

The movies.json file provides the movie records, including their vector embeddings, that the app loads into the database.
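
The file itself is not reproduced here. Judging only from the schema in app.js, each record is expected to carry title, year, genre, description, and a 1536-element $vector. The following sketch checks a record against that inferred shape. The isValidMovie name and the sample record are hypothetical, and the actual file may contain additional fields.

```javascript
// Vector size taken from the collectionOptions in app.js.
const VECTOR_SIZE = 1536;

// Hypothetical check: does a record match the shape implied by the Movie schema?
function isValidMovie(record) {
  return (
    record !== null &&
    typeof record === "object" &&
    typeof record.title === "string" &&
    typeof record.year === "number" &&
    typeof record.genre === "string" &&
    typeof record.description === "string" &&
    Array.isArray(record.$vector) &&
    record.$vector.length === VECTOR_SIZE &&
    record.$vector.every((n) => typeof n === "number" && Number.isFinite(n))
  );
}

// Made-up record with a zero-filled vector, for illustration only.
const sampleMovie = {
  title: "Example Movie",
  year: 1910,
  genre: "Comedy",
  description: "A placeholder description.",
  $vector: new Array(VECTOR_SIZE).fill(0),
};

console.log(isValidMovie(sampleMovie)); // true
```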

Summary

In summary:

  • astradb-mongoose.js is responsible for setting up a database connection to Astra DB Serverless

  • app.js contains the application logic for:

    • Loading movie data into the database.

    • Allowing the user to find a movie by genre and perform a vector search.

    • Optionally using embeddings to present additional data based on the user's new query, if the user opted to specify their OpenAI API key.

  • index.js is the setup script: it prompts for any missing settings, writes the .env file, installs dependencies, and confirms for the user that the Astra DB Serverless with MongooseJS app was created.

  • movies.json provides vector data used by the app.


© 2024 DataStax