Unstructured Document QA

Build an Unstructured Document QA flow for a chatbot application using the Unstructured loader component.

The Unstructured component loads and parses PDF, TXT, and DOCX files with the Unstructured Serverless API.

This flow replaces the File component in the Document QA flow with the Unstructured loader. The flow loads a file from your local machine and parses it into plain text. The parsed text then serves as context for the OpenAI component’s responses.

Open Langflow and start a new project

  1. In the Astra Portal header, switch your active app from Astra DB to Langflow.

  2. In Langflow, click New Project, and then select the Document QA project. This opens a starter project with the necessary components to run a chatbot application using the File component.

  3. Replace the File component with the Unstructured loader component.

  4. Connect the Data output of the Unstructured loader to the Data input of the Parse Data component.

Unstructured Document QA flow

[Figure: starter flow with the Unstructured loader]

The Unstructured Document QA flow consists of the following components:

  • The Unstructured component loads and parses PDF, TXT, and DOCX files with the Unstructured Serverless API.

  • The Parse Data component parses and converts data into plain text.

  • The Chat Input component accepts user input to the chat.

  • The Prompt component combines the user input with a user-defined prompt.

  • The OpenAI model component sends the user input and prompt to the OpenAI API and receives a response.

  • The Chat Output component prints the flow’s output to the chat.
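To make the data flow between these components concrete, here is a minimal Python sketch of the same chain: parse loaded elements into plain text, combine that text with the user's question, and pass the result to a model. It is an illustration only, not Langflow's implementation. The names `Data`, `parse_data`, `build_prompt`, `run_flow`, and `echo_model` are hypothetical, the prompt wording is an assumption, and the model is a stub so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Data:
    """Hypothetical stand-in for one element returned by the loader."""
    text: str


def parse_data(items: List[Data], template: str = "{text}", sep: str = "\n") -> str:
    """Render each element through a template and join the results into plain text."""
    return sep.join(template.format(text=item.text) for item in items)


def build_prompt(document: str, question: str) -> str:
    """Combine the parsed document and the user's question into one prompt.

    The wording of this prompt is an assumption made for illustration.
    """
    return (
        "Answer the question using only the document below.\n\n"
        f"Document:\n{document}\n\n"
        f"Question: {question}\n"
    )


def run_flow(items: List[Data], question: str, model: Callable[[str], str]) -> str:
    """Chain the steps: parse, build the prompt, call the model, return the reply."""
    document = parse_data(items)
    prompt = build_prompt(document, question)
    return model(prompt)


def echo_model(prompt: str) -> str:
    """Stub model so the sketch runs offline; the real flow calls the OpenAI API."""
    return f"(stub reply to a {len(prompt)}-character prompt)"


if __name__ == "__main__":
    elements = [Data("Langflow is a visual builder."), Data("Flows connect components.")]
    print(run_flow(elements, "What is Langflow?", echo_model))
```

Passing the model in as a function mirrors how the flow keeps each component independent: any of the steps can be swapped without changing the others.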

Run the Unstructured Document QA flow

  1. Add your credentials to the OpenAI component. The fastest and most secure way to add credentials is with Langflow’s Global Variables.

    1. Click Settings, and then click Global Variables.

    2. Click Add New.

    3. Name your variable. Paste your API key in the Value field.

    4. In the Apply To Fields field, select the field you want to globally apply this variable to.

    5. Click Save Variable.

  2. In the Chat Output component, click Play to start the end-to-end application flow. A "Chat Output built successfully" message and a check mark on each component indicate that the flow ran successfully.

  3. Click Playground to start a chat session.

  4. Ask a question about the document you uploaded in the Unstructured component. The OpenAI component answers with information from the document. Once your query has traveled from Chat Input to Chat Output, the Unstructured Document QA flow is complete.

Next steps

To interact with this flow as an API endpoint, see Langflow API.
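As a rough illustration of what calling the flow over HTTP might look like, the following Python sketch builds a request without sending it. It assumes the run endpoint and payload fields used by open-source Langflow (`/api/v1/run/<flow_id>` with `input_value`, `input_type`, and `output_type`). `BASE_URL` and `FLOW_ID` are placeholders, and `build_run_request` is a hypothetical helper. A hosted deployment may use a different base URL and usually requires an authentication header, so check the Langflow API page for the exact values.

```python
import json
import urllib.request

# Placeholders, not real values: substitute your own deployment's details.
BASE_URL = "http://localhost:7860"  # assumed local Langflow address
FLOW_ID = "YOUR_FLOW_ID"


def build_run_request(
    question: str,
    base_url: str = BASE_URL,
    flow_id: str = FLOW_ID,
) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs a flow with a chat input."""
    payload = {
        "input_value": question,  # the text that the Chat Input component receives
        "input_type": "chat",
        "output_type": "chat",
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/run/{flow_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_run_request("What is this document about?")
    print(req.get_method(), req.full_url)
    # To send it, pass req to urllib.request.urlopen and JSON-decode the body.
```

Building the request separately from sending it makes it easy to inspect the URL and payload before pointing the code at a live deployment.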


© 2024 DataStax | Privacy policy | Terms of use
