Use the Astra DB component

To integrate your Astra DB Serverless databases into your Langflow flows, use the Astra DB component.

Through this component, you can store vector embeddings in an Astra DB Serverless vector store, and then retrieve them with similarity searches and metadata filtering.

This guide explains how to configure the Astra DB component to store and retrieve embeddings in a complete flow.

Prerequisites

- Application token: You need an Astra application token to access your Astra DB Serverless databases through Langflow. The token must have a role with the appropriate read/write permissions for the operations you want to perform.
- Databases and collections: DataStax recommends that you create any databases, keyspaces, and collections you need before configuring the Astra DB component.

  You can create new databases and collections through the Astra DB component, but you must wait for the database or collection to initialize before proceeding with flow configuration. Additionally, some database and collection configuration options aren’t available through the Astra DB component, such as hybrid search options, PCU groups, vectorize integration management, and multi-region deployments.
- Embedding provider API keys: If you aren’t using the NVIDIA vectorize integration, you need an API key for your chosen embedding provider, such as OpenAI.

  If you want to use a vectorize integration other than NVIDIA, you must activate the integration in your Astra organization before you configure the Astra DB component. For more information, see Auto-generate embeddings with vectorize.

Create a flow

In the Astra Portal header, switch your active app from Astra to Langflow.
In Langflow, click New Flow, and then select Blank Flow.

For a pre-built example, use the Vector Store RAG template.

Configure the Astra DB component

Click and drag the Astra DB component from the Components menu to the Workspace.
In the Astra DB Application Token field, add your Astra DB application token.

The component immediately uses the token to connect to your database, and then populates the parameter menus with the databases and collections that the token has access to.
Select your Database.

If you don’t have a database, select New database, enter the Name, Cloud provider, and Region fields, and then click Create.

Database creation takes several minutes. You cannot continue configuring the component until the database is ready.

Astra organizations on the Free plan can create up to five databases. If you reach the limit, the Create new database option becomes inactive. To re-enable database creation, either terminate an existing database or upgrade your plan.

Select or create a Collection.

If you select a collection with a vectorize integration, the Embedding Model parameter is hidden because your database uses the integration to generate embeddings. If your collection doesn’t have a vectorize integration, you must add an embedding model component to the flow, and then connect it to the Astra DB component.

Configuration options for new collections

If you choose to create a new collection, enter the Name, Embedding generation method, Embedding model, and Dimensions, and then click Create.

Your choices for the Embedding generation method and Embedding model depend on whether you want to use a vectorize integration:

- To use a vectorize integration, select the provider from the Embedding generation method menu, and then select the model from the Embedding model menu.
- To bring your own embeddings, select Bring your own for both the Embedding generation method and Embedding model fields. If you are using the Vector Store RAG template, the OpenAI Embeddings component is included in the template.

The Dimensions value must match the dimensions of the embeddings you plan to store in your collection. Some vectorize integrations don’t require you to specify this value if the dimension is set by default.

For more information, see Auto-generate embeddings with vectorize.
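The dimension requirement can be checked before you load data. The following is a minimal sketch, not part of Langflow or Astra DB: the function name `find_dimension_mismatches` and the sample vectors are hypothetical, and the sketch assumes your embeddings are available as plain lists of floats.

```python
from typing import List, Sequence


def find_dimension_mismatches(
    embeddings: Sequence[Sequence[float]], collection_dimension: int
) -> List[int]:
    """Return the indexes of embeddings whose length differs from the
    collection's configured Dimensions value (hypothetical helper)."""
    return [
        index
        for index, vector in enumerate(embeddings)
        if len(vector) != collection_dimension
    ]


# Illustrative data only: the second vector has 2 values instead of 3.
sample_embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5], [0.6, 0.7, 0.8]]
mismatches = find_dimension_mismatches(sample_embeddings, collection_dimension=3)
print(mismatches)  # [1]
```

An empty list means every embedding has the same number of dimensions as the collection.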

To use the Astra DB component to read from your database, populate the Search Input field. You can enter your query directly, or connect another component to pass in a query dynamically. For example, connect an Input component to provide a query at runtime.

To view the result of your query as a message rather than the raw response, add an Output component to your flow.

Load data into your Astra DB Serverless database with Langflow

The Astra DB component can write data and embeddings to your database.

If you aren’t using the Vector Store RAG template, use these steps to create a secondary flow that loads a file, splits the text into chunks, generates embeddings for the chunks, and then stores the embeddings and chunks in your database. This flow prepares your database for vector searches by loading content.

Add another Astra DB component, but don’t enter or attach anything to the Search Input field.

This secondary flow is only used to write data to your database, so the Search Input field isn’t needed.
Add a File component and Split Text component to your flow, and then configure them:
- In the File component, select a file to load from your local file system.
- Connect the File component’s output port to the Split Text component’s Data Inputs port.
- In the Split Text component, set the Chunk Size, Chunk Overlap, and Separator parameters to control how the text is split into smaller chunks.
- Connect the Split Text component’s output port to the Astra DB component’s Ingest Data port.
If you aren’t using a vectorize integration, make sure your flow has an embedding model component that is connected to the Astra DB component’s Embedding Model port.

The selected embedding model must have the same vector dimensions as your Astra DB collection.

You must provide a valid API key for your chosen embedding provider, and the key must have access to the selected embedding model. You can store your API key as a global variable to reuse it in multiple flows.
To process and write the chunks to your database, click Play on the Astra DB component that is attached to your Split Text component.

Your database is now ready for vector searches.

In the Astra Portal, you can view the data in your database’s Data Explorer.
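To build intuition for how the Chunk Size, Chunk Overlap, and Separator parameters interact, the following sketch implements a simple separator-based splitter. It is an illustration only and is not the Split Text component's actual implementation. The function name `split_text` is hypothetical, and measuring sizes in characters is an assumption.

```python
from typing import List


def split_text(
    text: str, chunk_size: int, chunk_overlap: int, separator: str = "\n"
) -> List[str]:
    """Split text on a separator, then pack the pieces into chunks of at most
    chunk_size characters, repeating up to chunk_overlap trailing characters
    of each chunk at the start of the next one (illustrative sketch)."""
    if chunk_size <= 0 or chunk_overlap < 0 or chunk_overlap >= chunk_size:
        raise ValueError("Require chunk_size > 0 and 0 <= chunk_overlap < chunk_size")

    def joined_length(parts: List[str]) -> int:
        return len(separator.join(parts))

    pieces = [piece for piece in text.split(separator) if piece]
    chunks: List[str] = []
    current: List[str] = []
    for piece in pieces:
        if current and joined_length(current + [piece]) > chunk_size:
            # The current chunk is full: emit it, then keep only as many
            # trailing pieces as fit within the overlap and leave room
            # for the incoming piece.
            chunks.append(separator.join(current))
            while current and (
                joined_length(current) > chunk_overlap
                or joined_length(current + [piece]) > chunk_size
            ):
                current.pop(0)
        # A single piece longer than chunk_size becomes its own oversized chunk.
        current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks


print(split_text("aaaa bbbb cccc dddd", chunk_size=9, chunk_overlap=4, separator=" "))
# ['aaaa bbbb', 'bbbb cccc', 'cccc dddd']
```

With an overlap of 0, the same input produces two chunks with no repeated text. A larger overlap preserves more context across chunk boundaries, but stores more duplicate text and embeddings.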

Run a vector search with your flow

If your flow has a Chat Input component, you can open the Playground to run the flow. Your chat message is used to run a vector search and generate a response based on the data in your database.

If your flow doesn’t have a Chat Input component, you can click Play on the Astra DB component that has the Search Input field populated.

Langflow uses the Data API to query your database with the given input, and then returns a response. If your flow doesn’t have an Output component, you can see the raw response in the flow and component logs.
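For orientation, the following sketch builds the JSON body of a Data API `find` command for a vector search with an optional metadata filter. The exact request that Langflow sends isn't shown in this guide, so treat this as an approximation. The function name `build_find_command`, the `metadata.source` field, and the sample values are hypothetical.

```python
import json
from typing import Any, Dict, Optional, Sequence


def build_find_command(
    query_text: Optional[str] = None,
    query_vector: Optional[Sequence[float]] = None,
    metadata_filter: Optional[Dict[str, Any]] = None,
    limit: int = 4,
) -> Dict[str, Any]:
    """Build the body of a Data API find command (illustrative sketch).

    Pass query_text for collections with a vectorize integration, or
    query_vector if you generate the query embedding yourself.
    """
    if (query_text is None) == (query_vector is None):
        raise ValueError("Provide exactly one of query_text or query_vector")

    if query_text is not None:
        sort: Dict[str, Any] = {"$vectorize": query_text}
    else:
        sort = {"$vector": list(query_vector)}

    find: Dict[str, Any] = {
        "sort": sort,
        "options": {"limit": limit, "includeSimilarity": True},
    }
    if metadata_filter:
        find["filter"] = metadata_filter
    return {"find": find}


# Hypothetical query and filter values, for illustration only.
command = build_find_command(
    query_text="vacation policy",
    metadata_filter={"metadata.source": "handbook.pdf"},
    limit=3,
)
print(json.dumps(command, indent=2))
```

The sketch only constructs the payload. Sending it requires your database's API endpoint and an application token, which Langflow handles for you when you run the flow.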